
Default Accuracy Metric

The default_accuracy metric evaluates the accuracy of predicted default probabilities by comparing them to observed default rates.

Metric Type: default_accuracy

Accuracy Calculation

The accuracy is the ratio of expected to observed default rates: accuracy = pd / odr

Where:

  • pd = Mean predicted default probability
  • odr = Observed Default Rate (actual defaults / total volume)
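
As a rough illustration of the ratio above, the following sketch computes pd, odr, and accuracy for a handful of records in plain Python. The function name default_accuracy and the choice to return None when no defaults are observed are assumptions made for this example, not documented behavior of the metric.

```python
def default_accuracy(probabilities, default_flags):
    """Return (pd, odr, accuracy) for one group of record-level rows."""
    volume = len(default_flags)
    if volume == 0 or len(probabilities) != volume:
        raise ValueError("inputs must be non-empty and the same length")
    mean_pd = sum(probabilities) / volume   # pd: mean predicted probability
    odr = sum(default_flags) / volume       # odr: actual defaults / total volume
    # Assumption: with no observed defaults the ratio is undefined, so use None.
    accuracy = mean_pd / odr if odr > 0 else None
    return mean_pd, odr, accuracy


mean_pd, odr, accuracy = default_accuracy([0.10, 0.20, 0.30, 0.20], [0, 1, 0, 0])
print(f"pd={mean_pd:.2f} odr={odr:.2f} accuracy={accuracy:.2f}")
```

Here pd is 0.20 and odr is 0.25, so the ratio is 0.80. A value of 1.0 means the mean prediction matches the observed rate; values below 1.0 mean the predictions are lower than observed defaults, and values above 1.0 mean they are higher.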

Configuration Fields

Record-Level Data Format

For individual loan/account records:

collections:
  accuracy_check:
    metrics:
    - name:
      - portfolio_accuracy
      data_format: record
      prob_def: predicted_probability
      default: default_flag
      segment:
      - - product_type
      metric_type: default_accuracy
    dataset: loan_portfolio

Summary-Level Data Format

For pre-aggregated data:

collections:
  summary_accuracy:
    metrics:
    - name:
      - aggregated_accuracy
      data_format: summary
      mean_pd: avg_probability
      defaults: default_count
      volume: total_count
      segment:
      - - risk_grade
      metric_type: default_accuracy
    dataset: risk_summary
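
To make the summary-level mapping concrete, the sketch below applies the same ratio to rows shaped like the configuration above. The sample values are invented, the helper summary_accuracy is hypothetical, and returning None for missing or zero counts is an assumption rather than documented behavior.

```python
# Invented sample rows using the column names from the configuration above.
summary_rows = [
    {"risk_grade": "A", "avg_probability": 0.02, "default_count": 3, "total_count": 200},
    {"risk_grade": "B", "avg_probability": 0.05, "default_count": 4, "total_count": 100},
]


def summary_accuracy(row, mean_pd="avg_probability",
                     defaults="default_count", volume="total_count"):
    """Compute odr and accuracy for one pre-aggregated row."""
    vol, dflt = row[volume], row[defaults]
    # Assumption: missing (None) or zero counts leave the ratio undefined.
    odr = dflt / vol if vol and dflt is not None else None
    accuracy = row[mean_pd] / odr if odr else None
    return {"volume": vol, "defaults": dflt, "odr": odr,
            "pd": row[mean_pd], "accuracy": accuracy}


results = {row["risk_grade"]: summary_accuracy(row) for row in summary_rows}
for grade, result in results.items():
    print(grade, result)
```

Because each summary row already carries its own mean probability and counts, no aggregation is needed in this sketch; each row maps directly to one output row.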

Required Fields by Format

Record-Level Required

  • name: Metric name(s)
  • data_format: Must be "record"
  • prob_def: Probability column name
  • default: Default indicator column name
  • dataset: Dataset reference

Summary-Level Required

  • name: Metric name(s)
  • data_format: Must be "summary"
  • mean_pd: Mean probability column name
  • defaults: Default count column name
  • volume: Volume count column name
  • dataset: Dataset reference

Optional Fields

  • segment: List with one entry per metric name; each entry is a list of column names for grouping, or null for no segmentation
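
The field lists above can be turned into a simple pre-flight check. The sketch below is only an illustration: the REQUIRED table mirrors the lists in this section, while the function missing_fields and the assumption that dataset sits at the collection level (as in the examples) are choices made for this sketch.

```python
# Required metric-level fields per data_format, copied from the lists above.
REQUIRED = {
    "record": {"name", "data_format", "prob_def", "default"},
    "summary": {"name", "data_format", "mean_pd", "defaults", "volume"},
}


def missing_fields(collection):
    """Return the names of required fields absent from a parsed collection."""
    problems = []
    # Assumption: dataset is a sibling of metrics, as in the examples.
    if "dataset" not in collection:
        problems.append("dataset")
    for metric in collection.get("metrics", []):
        required = REQUIRED.get(metric.get("data_format"))
        if required is None:
            problems.append("data_format")
            continue
        problems.extend(sorted(required - set(metric)))
    return problems


collection = {
    "metrics": [{"name": ["portfolio_accuracy"], "data_format": "record",
                 "default": "default_flag", "metric_type": "default_accuracy"}],
    "dataset": "loan_portfolio",
}
print(missing_fields(collection))
```

In this example the record-level metric omits prob_def, so that is the only field reported.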

Output Columns

The metric produces the following output columns:

  • group_key: Segmentation group identifier (struct of segment values)
  • volume: Total number of observations
  • defaults: Total number of defaults
  • odr: Observed default rate (defaults / volume)
  • pd: Mean predicted default probability
  • accuracy: Calculated accuracy score (pd / odr)
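
To show how these columns relate to each other, the sketch below groups a few invented record-level rows by a segment column and builds one output row per group. It is a plain-Python approximation, not the metric's implementation; the function accuracy_by_segment, the dict used for group_key, and the None result for groups without defaults are all assumptions.

```python
from collections import defaultdict

# Invented sample records using the column names from the record-level example.
records = [
    {"product_type": "mortgage", "predicted_probability": 0.02, "default_flag": 0},
    {"product_type": "mortgage", "predicted_probability": 0.04, "default_flag": 1},
    {"product_type": "card", "predicted_probability": 0.10, "default_flag": 1},
    {"product_type": "card", "predicted_probability": 0.30, "default_flag": 0},
]


def accuracy_by_segment(rows, segment, prob_col, default_col):
    """Build one output row per distinct combination of segment values."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[col] for col in segment)].append(row)
    output = []
    for key, members in groups.items():
        volume = len(members)
        defaults = sum(int(m[default_col]) for m in members)
        odr = defaults / volume
        mean_pd = sum(m[prob_col] for m in members) / volume
        output.append({
            "group_key": dict(zip(segment, key)),  # stand-in for the struct
            "volume": volume,
            "defaults": defaults,
            "odr": odr,
            "pd": mean_pd,
            # Assumption: undefined when the group has no observed defaults.
            "accuracy": mean_pd / odr if odr > 0 else None,
        })
    return output


rows_out = accuracy_by_segment(records, ["product_type"],
                               "predicted_probability", "default_flag")
for row in rows_out:
    print(row)
```

Passing an empty segment list puts every record in a single group, which corresponds to an unsegmented metric.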

Fan-out Examples

Multiple Accuracy Metrics

collections:
  portfolio_accuracy:
    metrics:
    - name:
      - total_accuracy
      - product_accuracy
      - region_accuracy
      segment:
      - null
      - - product_type
      - - region
      data_format: record
      prob_def: model_score
      default: default_indicator
      metric_type: default_accuracy
    dataset: loan_data

Each name is paired with the segment entry at the same position (null means no segmentation), so this creates three accuracy metrics:

  1. Overall portfolio accuracy
  2. Accuracy by product type
  3. Accuracy by region
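
One way to picture the fan-out is as a positional pairing of names and segments, with the remaining fields shared by every resulting metric. The sketch below illustrates that reading with a hypothetical helper, expand_fan_out; the length check and the default of no segmentation when segment is absent are assumptions, not documented rules.

```python
def expand_fan_out(metric):
    """Split one fan-out metric entry into one dict per metric name."""
    names = metric["name"]
    # Assumption: a missing segment field means no segmentation for any name.
    segments = metric.get("segment") or [None] * len(names)
    if len(segments) != len(names):
        raise ValueError("name and segment must have the same number of entries")
    shared = {k: v for k, v in metric.items() if k not in ("name", "segment")}
    return [{"name": n, "segment": s, **shared} for n, s in zip(names, segments)]


fan_out = {
    "name": ["total_accuracy", "product_accuracy", "region_accuracy"],
    "segment": [None, ["product_type"], ["region"]],
    "data_format": "record",
    "prob_def": "model_score",
    "default": "default_indicator",
    "metric_type": "default_accuracy",
}
expanded = expand_fan_out(fan_out)
for item in expanded:
    print(item["name"], item["segment"])
```

The dictionary passed in is what a YAML parser would typically produce for the metric entry above, with null loaded as None.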

Mixed Data Formats

collections:
  detailed_accuracy:
    metrics:
    - name:
      - record_accuracy
      data_format: record
      prob_def: probability
      default: default
      metric_type: default_accuracy
    dataset: detailed_data
  summary_accuracy:
    metrics:
    - name:
      - summary_accuracy
      data_format: summary
      mean_pd: mean_prob
      defaults: def_count
      volume: vol_count
      metric_type: default_accuracy
    dataset: summary_data

Data Requirements

Record-Level Data

  • One row per loan/account
  • Probability column: numeric values between 0.0 and 1.0
  • Default column: binary values (0/1 or boolean)

Summary-Level Data

  • One row per group/segment
  • Mean probability: numeric values between 0.0 and 1.0
  • Default counts: non-negative numbers or None (negative values not allowed)
  • Volume counts: non-negative numbers or None (negative values not allowed)
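
The requirements above can be checked before a metric is run. The following is a minimal sketch with hypothetical helpers, check_record and check_summary; it rejects only negative counts, as the lists state, and whether a zero count is acceptable is left as an assumption.

```python
def check_record(probability, default):
    """Return a list of problems for one record-level row."""
    errors = []
    if not (0.0 <= probability <= 1.0):   # also rejects NaN
        errors.append("probability must be between 0.0 and 1.0")
    if default not in (0, 1):             # True/False compare equal to 1/0
        errors.append("default must be binary (0/1 or boolean)")
    return errors


def check_summary(mean_pd, defaults, volume):
    """Return a list of problems for one summary-level row."""
    errors = []
    if not (0.0 <= mean_pd <= 1.0):
        errors.append("mean probability must be between 0.0 and 1.0")
    for label, count in (("defaults", defaults), ("volume", volume)):
        # None is allowed; only negative values are rejected here.
        if count is not None and count < 0:
            errors.append(f"{label} must not be negative")
    return errors


print(check_record(0.35, True))
print(check_record(1.2, 2))
print(check_summary(0.05, None, 100))
print(check_summary(0.05, -1, 100))
```

Returning a list of messages rather than raising on the first problem makes it easier to report every issue in a row at once.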