AUC (Area Under Curve) Metric

The auc metric calculates the Area Under the ROC Curve, measuring a model's ability to discriminate between defaults and non-defaults.

Metric Type: auc

AUC Calculation

AUC measures discrimination power by calculating the area under the Receiver Operating Characteristic (ROC) curve:

  • 1.0: Perfect discrimination
  • 0.5: No discrimination (random)
  • 0.0: Perfect inverse discrimination
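Equivalently, AUC is the probability that a randomly chosen default receives a higher predicted probability than a randomly chosen non-default. A minimal record-level sketch of that pairwise (Mann-Whitney) formulation, shown for intuition only and not the library's actual implementation:

```python
def auc_record_level(scores, defaults):
    """Estimate AUC as P(score of a default > score of a non-default),
    counting ties as half a win. O(n*m) pairwise version for clarity."""
    pos = [s for s, d in zip(scores, defaults) if d == 1]
    neg = [s for s, d in zip(scores, defaults) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one default and one non-default")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A perfectly separating score yields 1.0, a perfectly inverted one 0.0, and an uninformative one about 0.5, matching the scale above.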

Configuration Fields

Record-Level Data Format

For individual loan/account records:

metrics:
  discrimination_analysis:
    metric_type: "auc"
    config:
      name: ["model_discrimination"]
      data_format: "record_level"
      prob_def: "model_score" # Column with predicted probabilities (0.0-1.0)
      default: "default_flag" # Column with default indicators (0/1 or boolean)
      segment: [["product_type"]] # Optional: segmentation columns
      dataset: "loan_portfolio"

Summary-Level Data Format

For pre-aggregated risk-ordered data:

metrics:
  summary_discrimination:
    metric_type: "auc"
    config:
      name: ["risk_grade_auc"]
      data_format: "summary_level"
      mean_pd: "avg_probability" # Column with mean probabilities (for ordering)
      defaults: "default_count" # Column with default counts
      volume: "total_count" # Column with total observation counts
      segment: [["model_version"]] # Optional: segmentation columns
      dataset: "risk_grade_summary"
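With summary-level data the ROC curve can only be traced at one point per risk group: groups are sorted by mean_pd (riskiest first) and the area is accumulated with the trapezoidal rule over cumulative default and non-default fractions. A sketch under those assumptions (illustrative, not the library's actual code):

```python
def auc_summary_level(groups):
    """groups: list of (mean_pd, defaults, volume) tuples, one per risk grade.
    Sort by mean_pd descending, then integrate the ROC curve
    (cumulative default fraction vs cumulative non-default fraction)."""
    ordered = sorted(groups, key=lambda g: g[0], reverse=True)
    total_bad = sum(d for _, d, _ in ordered)
    total_good = sum(v - d for _, d, v in ordered)
    if total_bad == 0 or total_good == 0:
        raise ValueError("need both defaults and non-defaults")
    auc, tpr_prev, fpr_prev = 0.0, 0.0, 0.0
    cum_bad = cum_good = 0
    for _, d, v in ordered:
        cum_bad += d
        cum_good += v - d
        tpr, fpr = cum_bad / total_bad, cum_good / total_good
        auc += (fpr - fpr_prev) * (tpr + tpr_prev) / 2  # trapezoid slice
        tpr_prev, fpr_prev = tpr, fpr
    return auc
```

Note that discrimination within a group is invisible at this level, so summary-level AUC is a lower-resolution estimate than the record-level calculation.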

Required Fields by Format

Record-Level Required

  • name: Metric name(s)
  • data_format: Must be "record_level"
  • prob_def: Probability column name
  • default: Default indicator column name
  • dataset: Dataset reference

Summary-Level Required

  • name: Metric name(s)
  • data_format: Must be "summary_level"
  • mean_pd: Mean probability column name (used for risk ordering)
  • defaults: Default count column name
  • volume: Volume count column name
  • dataset: Dataset reference

Optional Fields

  • segment: List of column names for grouping

Output Columns

The metric produces the following output columns:

  • group_key: Segmentation group identifier (struct of segment values)
  • volume: Total number of observations
  • defaults: Total number of defaults
  • odr: Observed Default Rate (defaults / volume)
  • pd: Mean Predicted Default probability
  • auc: Calculated AUC score (0.0 to 1.0)
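As an illustration (the values below are hypothetical, and the exact struct layout depends on the execution engine), a single output row for a product_type segment might look like:

```python
# Hypothetical output row for one segment; odr is defaults / volume
row = {
    "group_key": {"product_type": "mortgage"},  # struct of segment values
    "volume": 12000,
    "defaults": 360,
    "odr": 360 / 12000,   # observed default rate = 0.03
    "pd": 0.028,          # mean predicted default probability
    "auc": 0.74,
}
```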

Fan-out Examples

Multiple Discrimination Tests

metrics:
  auc_analysis:
    metric_type: "auc"
    config:
      name: ["portfolio_auc", "product_auc", "region_auc", "vintage_auc"]
      segment: [null, ["product_type"], ["region"], ["origination_year"]]
      data_format: "record_level"
      prob_def: "risk_score"
      default: "default_indicator"
      dataset: "model_validation_data"

This creates four AUC metrics:

  1. Overall portfolio discrimination
  2. Discrimination by product type
  3. Discrimination by region
  4. Discrimination by origination vintage

Model Comparison

Fan-out varies only the name and segment lists, so two models scored in different columns cannot share one config: each model needs its own metric entry with its own prob_def.

metrics:
  champion_auc:
    metric_type: "auc"
    config:
      name: ["champion_model"]
      data_format: "record_level"
      prob_def: "champion_score"
      default: "default_flag"
      dataset: "ab_test_data"

  # Separate config for the challenger model's score column
  challenger_auc:
    metric_type: "auc"
    config:
      name: ["challenger_model_score"]
      data_format: "record_level"
      prob_def: "challenger_score"
      default: "default_flag"
      dataset: "ab_test_data"

Summary-Level Analysis

metrics:
  risk_grade_analysis:
    metric_type: "auc"
    config:
      name: ["overall_grade_auc", "product_grade_auc"]
      segment: [null, ["product_type"]]
      data_format: "summary_level"
      mean_pd: "grade_mean_pd"
      defaults: "grade_defaults"
      volume: "grade_volume"
      dataset: "risk_grade_stats"

Data Requirements

Record-Level Data

  • One row per loan/account
  • Probability column: numeric values between 0.0 and 1.0
  • Default column: binary values (0/1 or boolean)
  • Sufficient data points for meaningful AUC calculation (minimum ~20 observations recommended)

Summary-Level Data

  • One row per risk grade or aggregated group
  • Data should be ordered by risk (mean_pd column used for ordering)
  • Mean probabilities: numeric values between 0.0 and 1.0
  • Default counts: non-negative numbers or None (negative values not allowed)
  • Volume counts: non-negative numbers or None (negative values not allowed)
  • At least 2 risk grades with both defaults and non-defaults
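The summary-level requirements above can be checked before running the metric. A minimal pre-validation sketch (illustrative only; the dict keys mirror the default column roles mean_pd, defaults, and volume):

```python
def validate_summary_rows(rows):
    """rows: list of dicts with mean_pd, defaults, volume.
    Enforce the summary-level data requirements listed above."""
    if len(rows) < 2:
        raise ValueError("need at least 2 risk grades")
    for r in rows:
        if r["mean_pd"] is not None and not 0.0 <= r["mean_pd"] <= 1.0:
            raise ValueError("mean_pd must be between 0.0 and 1.0")
        for col in ("defaults", "volume"):
            if r[col] is not None and r[col] < 0:
                raise ValueError(f"{col} must be non-negative")
    if sum(r["defaults"] or 0 for r in rows) == 0:
        raise ValueError("need at least one default")
    if sum((r["volume"] or 0) - (r["defaults"] or 0) for r in rows) == 0:
        raise ValueError("need at least one non-default")
    return True
```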

AUC Interpretation

  • AUC > 0.8: Excellent discrimination
  • 0.7 < AUC ≤ 0.8: Good discrimination
  • 0.6 < AUC ≤ 0.7: Acceptable discrimination
  • 0.5 < AUC ≤ 0.6: Poor discrimination
  • AUC = 0.5: No discrimination (random model)
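The bands above can be expressed as a simple lookup. The thresholds below are the rule-of-thumb values from this list, not an industry standard:

```python
def auc_band(auc):
    """Map an AUC score to the rule-of-thumb discrimination band above."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must be between 0.0 and 1.0")
    if auc > 0.8:
        return "excellent"
    if auc > 0.7:
        return "good"
    if auc > 0.6:
        return "acceptable"
    return "poor"
```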

Important Notes

  1. Data Quality: Remove accounts with missing or invalid probability scores
  2. Sample Size: Larger samples provide more reliable AUC estimates
  3. Population Stability: AUC can vary across different populations or time periods
  4. Risk Ordering: For summary-level data, ensure groups are properly risk-ordered