Mean Summary

The Mean Summary metric calculates arithmetic mean values for specified variables, with optional segmentation.

Configuration Fields

Required Fields

  • name (string or list): Metric identifier(s)
  • dataset (string or list): Dataset identifier(s) to analyze
  • type: Must be "mean"
  • variable (string or list): Name(s) of the numeric column(s) to average

Optional Fields

  • segment (string or list): Segment identifier(s) for grouping analysis

Output Columns

The Mean Summary produces these output columns:

  • Standard identification columns (name, dataset, segment)
  • mean: The calculated arithmetic mean value

Data Requirements

  • Data must contain the specified variable column(s), and those columns must hold numeric values
  • Missing/null values are excluded from the calculation
  • If segments are specified, data must contain the segment column(s)
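
The handling of missing values described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the metric's actual implementation. The helper name mean_ignoring_missing, the treatment of NaN as missing, and the choice to return None when no usable values remain are all assumptions made for the example.

```python
import math
from statistics import fmean


def mean_ignoring_missing(values):
    """Arithmetic mean of `values`, skipping None and NaN entries."""
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    # Assumption: with no usable values there is nothing to average,
    # so this sketch returns None rather than raising.
    return fmean(present) if present else None


# Hypothetical column values: two entries are missing, one is a genuine zero.
loan_amount = [1000.0, None, 3000.0, float("nan"), 0.0]
result = mean_ignoring_missing(loan_amount)  # averages 1000.0, 3000.0 and 0.0
```

Note that the zero stays in the calculation and lowers the mean, while the two missing entries do not count toward the denominator.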

Fan-out Examples

Basic Configuration

- name: average_loan_amount
  type: mean
  dataset: loan_data
  variable: loan_amount

Multiple Variables

- name: [avg_income, avg_score, avg_debt]
  type: mean
  dataset: customer_data
  variable: [annual_income, credit_score, total_debt]

This expands to:

  • avg_income calculating mean of annual_income
  • avg_score calculating mean of credit_score
  • avg_debt calculating mean of total_debt

Segmented Analysis

- name: regional_income_avg
  type: mean
  dataset: customer_data
  variable: annual_income
  segment: [north, south, east, west]

This creates a separate mean calculation for each region.
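
As a rough picture of what a segmented mean computes, the sketch below groups rows by a segment value and averages the variable within each group. It assumes each row carries its segment in a column, here a hypothetical region column. How segment identifiers map onto the data in practice is not specified here, so treat the column name, the row layout, and the helper mean_by_segment as illustrative only.

```python
from collections import defaultdict
from statistics import fmean


def mean_by_segment(rows, variable, segment_column):
    """Group rows by `segment_column` and average `variable` within each group."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(variable)
        if value is None:  # missing values are left out of the mean
            continue
        groups[row[segment_column]].append(value)
    return {segment: fmean(values) for segment, values in groups.items()}


# Hypothetical rows; the `region` column name is an assumption for illustration.
rows = [
    {"region": "north", "annual_income": 50000},
    {"region": "north", "annual_income": 70000},
    {"region": "south", "annual_income": 40000},
    {"region": "south", "annual_income": None},
]
segment_means = mean_by_segment(rows, "annual_income", "region")
```

Each segment gets its own mean, and the missing south value is skipped rather than counted as zero.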

Multiple Datasets and Variables

- name: [q1_revenue, q2_revenue]
  type: mean
  dataset: [q1_sales, q2_sales]
  variable: [revenue, revenue]

Complex Multi-dimensional Fan-out

- name: [income_north, income_south, score_north, score_south]
  type: mean
  dataset: customer_data
  variable: [annual_income, annual_income, credit_score, credit_score]
  segment: [north, south, north, south]

Usage Notes

  • Numeric Data: The variable column must hold a numeric data type
  • Missing Values: Automatically excluded from the mean calculation
  • Zero Values: Included in the calculation unless explicitly filtered out of the data
  • Segmentation: Each segment produces a separate mean calculation
  • Large Datasets: Efficient calculation even with millions of records

Fan-out Expansion Rules

When using lists in configuration:

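  • name, dataset, and variable must have matching lengths wherever they are specified as lists; a single value is applied to every expanded metric
  • segment can be a single value (applied to all) or a list matching the other fields' lengths
  • List entries are paired by position, and each resulting combination creates a separate metric calculation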
  • All metrics of this type will have the same output column structure
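
One way to read these rules is as a positional expansion: list fields are read index by index, and single values are reused for every entry. The sketch below applies that reading to the "Complex Multi-dimensional Fan-out" configuration. The function expand_metric and its error message are hypothetical, written for illustration rather than taken from the tool itself.

```python
def expand_metric(config):
    """Expand one list-valued metric entry into a list of single-metric entries."""
    fan_out_fields = ("name", "dataset", "variable", "segment")
    lengths = {
        len(config[f]) for f in fan_out_fields
        if isinstance(config.get(f), list)
    }
    if len(lengths) > 1:
        raise ValueError(f"list fields have mismatched lengths: {sorted(lengths)}")
    count = lengths.pop() if lengths else 1

    expanded = []
    for i in range(count):
        entry = {"type": config["type"]}
        for f in fan_out_fields:
            if f not in config:
                continue  # optional field (segment) was not given
            value = config[f]
            # Lists are read by position; single values are reused for every entry.
            entry[f] = value[i] if isinstance(value, list) else value
        expanded.append(entry)
    return expanded


config = {
    "name": ["income_north", "income_south", "score_north", "score_south"],
    "type": "mean",
    "dataset": "customer_data",
    "variable": ["annual_income", "annual_income", "credit_score", "credit_score"],
    "segment": ["north", "south", "north", "south"],
}
expanded = expand_metric(config)
```

Under this reading the configuration yields four metrics, the third being score_north: the mean of credit_score in customer_data for the north segment.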

Statistical Notes

  • Arithmetic Mean: Simple average of all values
  • Outlier Sensitivity: The mean can be pulled strongly by extreme values; consider the median when a more robust measure of central tendency is needed
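
A small worked example shows the outlier effect. The income figures below are invented for illustration, and the comparison uses Python's standard statistics module rather than the metric itself.

```python
from statistics import fmean, median

# Hypothetical incomes: four typical values and one extreme outlier.
incomes = [40_000, 45_000, 50_000, 55_000, 1_000_000]

mean_income = fmean(incomes)     # 238000.0, dominated by the single extreme value
median_income = median(incomes)  # 50000, the middle value of the sorted list
```

Here the mean sits far above four of the five values, while the median still reflects the typical record.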