Recipes Configuration

The recipes module provides YAML-driven configuration for running multiple metrics as batches. This is the recommended approach for:

  • Batch processing of multiple metrics
  • Standardized metric configurations
  • Production environments with consistent setups

Overview

The recipes system allows you to define metrics and datasets in YAML files, enabling declarative configuration of statistical computations. Each metric configuration supports fan-out expansion, where lists of values automatically expand into multiple metric instances.

Core Concepts

Fan-out Expansion

When you provide lists for certain fields (like name and segment), the system automatically expands them into multiple metric configurations. All fan-out fields must have the same length to ensure proper pairing.
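The pairing rule above can be illustrated with a minimal sketch. The helper `expand_fan_out` is hypothetical and not part of the library. It assumes that only `name` and `segment` fan out, and that a missing `segment` means no segmentation for every expanded metric; the library's actual expansion logic may differ.

```python
from typing import Any


def expand_fan_out(metric: dict[str, Any]) -> list[dict[str, Any]]:
    """Expand one metric entry into one entry per (name, segment) pair.

    Hypothetical helper: assumes `name` drives the fan-out and `segment`
    is either absent or a list of the same length as `name`.
    """
    names = metric.get("name")
    if not isinstance(names, list):
        # A scalar name means there is nothing to expand.
        return [dict(metric)]

    segments = metric.get("segment")
    if segments is None:
        # Assumption: no segment list means "unsegmented" for every name.
        segments = [None] * len(names)

    if len(segments) != len(names):
        raise ValueError(
            f"fan-out length mismatch: {len(names)} names vs {len(segments)} segments"
        )

    # Pair entries by position; all other fields are copied unchanged.
    return [
        {**metric, "name": name, "segment": segment}
        for name, segment in zip(names, segments)
    ]


recipe_entry = {
    "metric_type": "default_accuracy",
    "data_format": "record",
    "name": ["total_metric", "segmented_metric"],
    "segment": [None, ["product_type"]],
}

for expanded in expand_fan_out(recipe_entry):
    print(expanded["name"], expanded["segment"])
```

Pairing by position is why the lengths must match: the first name goes with the first segment, the second with the second, and so on.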

Basic Structure

datasets:
  dataset_name:
    type: csv
    source: path/to/data.csv
collections:
  metric_id:
    metrics:
    - name:
      - metric1
      - metric2
      segment:
      - - segment1
      - - segment2
      metric_type: metric_name
      data_format: record
    dataset: dataset_name

Segment Configuration

Segments define how to group your data for analysis:

  • Use null for no segmentation on a particular metric
  • Use ["column_name"] for single-column segmentation
  • Use ["col1", "col2"] for multi-column segmentation
  • When using fan-out, each segment entry corresponds to one metric

Example:

collections:
  segmentation_example:
    dataset: dataset_name
    metrics:
      - metric_type: default_accuracy
        data_format: record
        name: ["total_metric", "segmented_metric", "multi_segment_metric"]
        segment: [null, ["product_type"], ["product_type", "region"]]
        prob_def: probability
        default: default_flag

Available Metrics

The following metric types are supported:

  • default_accuracy: Default prediction accuracy validation
  • ead_accuracy: Exposure at Default accuracy validation
  • hosmer_lemeshow: Hosmer-Lemeshow goodness-of-fit test
  • jeffreys_test: Jeffreys Bayesian calibration test
  • mape: Mean Absolute Percentage Error for scale-independent prediction accuracy
  • rmse: Root Mean Squared Error for prediction accuracy assessment
  • auc: Area Under the ROC Curve discrimination metric
  • gini: Gini coefficient discrimination metric
  • population_stability_index: Population Stability Index for distribution shift monitoring
  • mean: Mean summary statistic
  • median: Median summary statistic

Most metrics support both record and summary data formats; mean/median are record-level only.
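
For reference, the textbook definitions behind two of the accuracy metrics listed above (rmse and mape) can be sketched as follows. This is not the library's implementation: details such as whether MAPE is reported as a fraction or a percentage, and how zero actual values are handled, are assumptions here.

```python
import math


def rmse(actual: list[float], predicted: list[float]) -> float:
    """Root Mean Squared Error: sqrt of the mean squared difference."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mape(actual: list[float], predicted: list[float]) -> float:
    """Mean Absolute Percentage Error, returned as a fraction (0.1 means 10%)."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # Assumption: treat zero actuals as an error, since the ratio is undefined.
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios)


print(rmse([100.0, 200.0], [110.0, 180.0]))
print(mape([100.0, 200.0], [110.0, 180.0]))
```

Because MAPE divides each error by the actual value, it is independent of scale, whereas RMSE is expressed in the units of the predicted quantity.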

Getting Started

  1. Configuration Overview - Learn about YAML structure and fan-out expansion
  2. Complete Examples - Full recipe configurations and patterns

Metric Documentation

Accuracy Metrics

  • Default Accuracy - Binary classification accuracy
  • EAD Accuracy - Exposure at Default accuracy with confidence intervals
  • MAPE - Mean Absolute Percentage Error for scale-independent accuracy
  • RMSE - Root Mean Squared Error for continuous prediction accuracy

Statistical Tests

Stability Metrics

Summary Statistics

  • Mean - Arithmetic mean calculation with segmentation
  • Median - Robust central tendency with quartiles

Each metric documentation includes configuration fields, output columns, data requirements, fan-out examples, and usage notes.

recipes

YAML recipe interface for batch metric execution.

This module provides the YAML recipe approach for using the TNP statistic library. Define metric collections in YAML files and execute them as batches.

Example usage
from tnp_statistic_library.recipes import load_configuration_from_yaml

config = load_configuration_from_yaml("my_metrics.yaml")
results = config.collections.run()
df = results.to_dataframe()

load_configuration_from_yaml

load_configuration_from_yaml(
    yaml_file: str | Path,
) -> Configuration

Load configuration from a YAML file.

Parameters:

  • yaml_file (str | Path, required): Path to YAML file or raw YAML string

Returns:

  • Configuration: Configuration object that can be used to collect metrics

Example
config = load_configuration_from_yaml("metrics.yaml")
results = config.collections.run()
