Mean Summary¶
The Mean Summary metric calculates arithmetic mean values for specified variables, with optional segmentation.
Configuration Fields¶
Required Fields¶
- name (string or list): Metric identifier(s)
- dataset (string or list): Dataset identifier(s) to analyze
- type: Must be "mean"
- variable (string or list): variable_column name(s) for mean calculation
Optional Fields¶
- segment (string or list): Segment identifier(s) for grouping analysis
Output Columns¶
The Mean Summary produces these output columns:
- Standard identification columns (name, dataset, segment)
- mean: The calculated arithmetic mean value
Data Requirements¶
- Data must contain the specified variable_column(s) with numeric values
- Missing/null values are excluded from calculation
- If segments are specified, data must contain the segment column(s)
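The requirements above can be illustrated with a minimal sketch in plain Python. This is not the library's implementation: the helper name `mean_summary`, the list-of-dicts data layout, and the `region` segment column are all assumptions made for the example.

```python
import math
from collections import defaultdict


def mean_summary(rows, variable, segment=None):
    """Return {segment_value: mean} for `variable`, skipping missing values.

    Hypothetical helper for illustration only. When `segment` is None,
    every row falls into a single group keyed by None.
    """
    groups = defaultdict(list)
    for row in rows:
        value = row.get(variable)
        # Missing/null values (None or NaN) are excluded from the calculation.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        # A KeyError here means the data lacks the required segment column.
        key = row[segment] if segment is not None else None
        groups[key].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


rows = [
    {"region": "north", "annual_income": 50000},
    {"region": "north", "annual_income": None},  # excluded: missing
    {"region": "north", "annual_income": 0},     # included: zero is a real value
    {"region": "south", "annual_income": 70000},
]

overall = mean_summary(rows, "annual_income")
by_region = mean_summary(rows, "annual_income", segment="region")
```

In this sketch the null row contributes nothing to either the sum or the count, while the zero row lowers the mean for its group.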
Fan-out Examples¶
Basic Configuration¶
Multiple Variables¶
- name: [avg_income, avg_score, avg_debt]
  type: mean
  dataset: customer_data
  variable: [annual_income, credit_score, total_debt]
This expands to:
- avg_income calculating mean of annual_income
- avg_score calculating mean of credit_score
- avg_debt calculating mean of total_debt
Segmented Analysis¶
- name: regional_income_avg
  type: mean
  dataset: customer_data
  variable: annual_income
  segment: [north, south, east, west]
This creates separate mean calculations for each region.
Multiple Datasets and Variables¶
- name: [q1_revenue, q2_revenue]
  type: mean
  dataset: [q1_sales, q2_sales]
  variable: [revenue, revenue]
Complex Multi-dimensional Fan-out¶
- name: [income_north, income_south, score_north, score_south]
  type: mean
  dataset: customer_data
  variable: [annual_income, annual_income, credit_score, credit_score]
  segment: [north, south, north, south]
Usage Notes¶
- Numeric Data: Variable must contain numeric data types
- Missing Values: Automatically excluded from mean calculation
- Zero Values: Included in calculation unless explicitly filtered in data
- Segmentation: Each segment produces separate mean calculation
- Large Datasets: Efficient calculation even with millions of records
Fan-out Expansion Rules¶
When using lists in configuration:
- name, dataset, variable must have matching lengths when specified as lists
- segment can be a single value (applied to all) or a list matching the other fields' lengths
- Each combination creates a separate metric calculation
- All metrics of this type will have the same output column structure
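One way to picture these rules is the sketch below. It is an illustration under stated assumptions, not the library's actual expansion code: the function name `expand_fan_out` is hypothetical, list-valued fields are assumed to be paired position by position, and single values are assumed to be repeated for every resulting metric.

```python
def expand_fan_out(config):
    """Expand one list-valued metric config into one dict per metric.

    Hypothetical sketch: lists are matched by position, and scalar
    values are broadcast to every expanded metric.
    """
    fields = ("name", "dataset", "variable", "segment")
    values = {field: config.get(field) for field in fields}

    # All list-valued fields must share a single common length.
    lengths = {len(v) for v in values.values() if isinstance(v, list)}
    if len(lengths) > 1:
        raise ValueError(
            f"list-valued fields must have matching lengths, got {sorted(lengths)}"
        )
    count = lengths.pop() if lengths else 1

    def at(field, index):
        value = values[field]
        return value[index] if isinstance(value, list) else value

    return [
        {"type": config["type"], **{field: at(field, i) for field in fields}}
        for i in range(count)
    ]


config = {
    "name": ["income_north", "income_south", "score_north", "score_south"],
    "type": "mean",
    "dataset": "customer_data",
    "variable": ["annual_income", "annual_income", "credit_score", "credit_score"],
    "segment": ["north", "south", "north", "south"],
}
metrics = expand_fan_out(config)
```

Here `metrics` holds four entries, the first pairing `income_north` with `annual_income` and the `north` segment, while the scalar `dataset` value is shared by all four.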
Statistical Notes¶
- Arithmetic Mean: Simple average of all values
- Outlier Sensitivity: Mean can be affected by extreme values; consider median for robust central tendency
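The outlier note can be made concrete with Python's standard `statistics` module. The income figures below are invented purely for illustration.

```python
from statistics import mean, median

# Four typical incomes plus one extreme value (made-up data).
incomes = [40_000, 45_000, 50_000, 55_000, 1_000_000]

average = mean(incomes)   # pulled upward by the single outlier
middle = median(incomes)  # unaffected by how extreme the outlier is
```

With these values the mean is 238000 while the median is 50000, so a single extreme record moves the mean far from the typical value.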