AUC (Area Under Curve) Metric
The auc metric calculates the Area Under the ROC Curve, measuring a model's ability to discriminate between defaults and non-defaults.
Metric Type: auc
AUC Calculation
AUC measures discrimination power by calculating the area under the Receiver Operating Characteristic (ROC) curve:
- 1.0: Perfect discrimination
- 0.5: No discrimination (random)
- 0.0: Perfect inverse discrimination
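The three reference values above can be reproduced with a minimal sketch that uses the pairwise reading of AUC: the share of (default, non-default) pairs in which the default received the higher score, with ties counted as half. The function name `auc_pairwise` is hypothetical and is not part of the library; the library's own implementation may differ.

```python
def auc_pairwise(scores, defaults):
    """AUC as the fraction of (default, non-default) pairs ranked correctly.

    Illustrative helper only (not the library's API). Ties count as 0.5.
    """
    pos = [s for s, d in zip(scores, defaults) if d]      # scores of defaults
    neg = [s for s, d in zip(scores, defaults) if not d]  # scores of non-defaults
    if not pos or not neg:
        raise ValueError("need at least one default and one non-default")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Defaults scored above all non-defaults
perfect = auc_pairwise([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
# Defaults scored below all non-defaults
inverse = auc_pairwise([0.1, 0.2, 0.8, 0.9], [1, 1, 0, 0])
# Identical scores carry no ranking information
uninformative = auc_pairwise([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
print(perfect, inverse, uninformative)
```

The nested loop is quadratic in the number of records, so it is only suitable for illustration; a rank-based formulation gives the same value on large portfolios.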
Configuration Fields
Record-Level Data Format
For individual loan/account records:
collections:
  discrimination_analysis:
    metrics:
      - name:
          - model_discrimination
        data_format: record
        prob_def: model_score
        default: default_flag
        segment:
          - - product_type
        metric_type: auc
        dataset: loan_portfolio
Summary-Level Data Format
For pre-aggregated risk-ordered data:
collections:
  summary_discrimination:
    metrics:
      - name:
          - risk_grade_auc
        data_format: summary
        mean_pd: avg_probability
        defaults: default_count
        volume: total_count
        segment:
          - - model_version
        metric_type: auc
        dataset: risk_grade_summary
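To make the summary format concrete, here is a rough sketch of how an AUC could be derived from pre-aggregated grades. It assumes every account in a grade shares that grade's score, so pairs within a grade count as ties (0.5), and it treats each row as a distinct grade. The function `auc_from_summary` and the sample figures are hypothetical; the library's actual calculation is not described in this page and may differ.

```python
def auc_from_summary(rows):
    """rows: iterable of (mean_pd, defaults, volume) tuples, one per risk grade.

    Illustrative helper only (not the library's API).
    """
    ordered = sorted(rows, key=lambda r: r[0])  # lowest-risk grade first
    total_defaults = sum(d for _, d, _ in ordered)
    total_non_defaults = sum(v - d for _, d, v in ordered)
    if total_defaults == 0 or total_non_defaults == 0:
        raise ValueError("need both defaults and non-defaults")

    concordant = 0.0
    non_defaults_below = 0  # non-defaults in strictly lower-risk grades
    for _, d, v in ordered:
        nd = v - d
        # Defaults here outrank all lower-grade non-defaults,
        # and tie with the non-defaults in their own grade.
        concordant += d * non_defaults_below + 0.5 * d * nd
        non_defaults_below += nd
    return concordant / (total_defaults * total_non_defaults)


# Made-up grades: (mean_pd, defaults, volume)
grades = [(0.01, 1, 100), (0.05, 5, 100), (0.20, 20, 100)]
auc = auc_from_summary(grades)
print(round(auc, 4))
```

Because the rows are sorted by mean_pd inside the function, the input order does not affect the result in this sketch.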
Required Fields by Format
Record-Level Required
- name: Metric name(s)
- data_format: Must be "record"
- prob_def: Probability column name
- default: Default indicator column name
- dataset: Dataset reference
Summary-Level Required
- name: Metric name(s)
- data_format: Must be "summary"
- mean_pd: Mean probability column name (used for risk ordering)
- defaults: Default count column name
- volume: Volume count column name
- dataset: Dataset reference
Optional Fields
- segment: List of grouping definitions, one per entry in name; each entry is a list of column names to group by, or null for no grouping
Output Columns
The metric produces the following output columns:
- group_key: Segmentation group identifier (struct of segment values)
- volume: Total number of observations
- defaults: Total number of defaults
- odr: Observed default rate (defaults / volume)
- pd: Mean predicted default probability
- auc: Calculated AUC score (0.0 to 1.0)
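As an illustration of how the descriptive columns relate to record-level inputs, the sketch below derives volume, defaults, odr, and pd for a single segment group. It assumes pd is the unweighted mean of the probability column; `describe_group` is a hypothetical name, not a library function.

```python
def describe_group(scores, defaults):
    """Descriptive output columns for one group (illustrative only)."""
    volume = len(scores)
    if volume == 0:
        raise ValueError("group is empty")
    n_defaults = sum(1 for d in defaults if d)
    return {
        "volume": volume,
        "defaults": n_defaults,
        "odr": n_defaults / volume,   # observed default rate
        "pd": sum(scores) / volume,   # assumed: simple mean of predicted PDs
    }


row = describe_group([0.10, 0.20, 0.30, 0.40], [0, 0, 1, 1])
print(row)
```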
Fan-out Examples
Multiple Discrimination Tests
collections:
  auc_analysis:
    metrics:
      - name:
          - portfolio_auc
          - product_auc
          - region_auc
          - vintage_auc
        segment:
          - null
          - - product_type
          - - region
          - - origination_year
        data_format: record
        prob_def: risk_score
        default: default_indicator
        metric_type: auc
        dataset: model_validation_data
This creates four AUC metrics:
- Overall portfolio discrimination
- Discrimination by product type
- Discrimination by region
- Discrimination by origination vintage
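The four metrics listed above suggest that names and segment entries are paired by position. A minimal sketch of that expansion, under that assumption, is shown below; `expand_fan_out` is a hypothetical helper written for illustration and does not reflect the library's internals.

```python
def expand_fan_out(names, segments, **shared_fields):
    """Pair each metric name with the segment entry at the same position.

    Illustrative only: assumes positional pairing, with None meaning
    "no segmentation" (the YAML null in the example above).
    """
    if len(names) != len(segments):
        raise ValueError("name and segment lists must have the same length")
    return [
        {"name": name, "segment": segment, **shared_fields}
        for name, segment in zip(names, segments)
    ]


expanded = expand_fan_out(
    ["portfolio_auc", "product_auc", "region_auc", "vintage_auc"],
    [None, ["product_type"], ["region"], ["origination_year"]],
    metric_type="auc",
    data_format="record",
    prob_def="risk_score",
    default="default_indicator",
)
for metric in expanded:
    print(metric["name"], metric["segment"])
```

Each expanded entry shares the same score and default columns and differs only in its name and grouping.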
Model Comparison
collections:
  model_auc_comparison:
    metrics:
      - name:
          - champion_model
          - challenger_model
        segment:
          - null
          - null
        data_format: record
        prob_def: champion_score
        default: default_flag
        metric_type: auc
        dataset: ab_test_data
  challenger_auc:
    metrics:
      - name:
          - challenger_model_score
        data_format: record
        prob_def: challenger_score
        default: default_flag
        metric_type: auc
        dataset: ab_test_data
Summary-Level Analysis
collections:
  risk_grade_analysis:
    metrics:
      - name:
          - overall_grade_auc
          - product_grade_auc
        segment:
          - null
          - - product_type
        data_format: summary
        mean_pd: grade_mean_pd
        defaults: grade_defaults
        volume: grade_volume
        metric_type: auc
        dataset: risk_grade_stats
Data Requirements
Record-Level Data
- One row per loan/account
- Probability column: numeric values between 0.0 and 1.0
- Default column: binary values (0/1 or boolean)
- Sufficient data points for meaningful AUC calculation (minimum ~20 observations recommended)
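The record-level requirements above can be turned into a quick pre-flight check before running the metric. The sketch below is one possible way to do that; `check_records` and its `min_obs` parameter are hypothetical, and the threshold of 20 simply mirrors the recommendation in the list.

```python
def check_records(probs, defaults, min_obs=20):
    """Return a list of problems found in record-level inputs (illustrative)."""
    problems = []
    if len(probs) != len(defaults):
        problems.append("probability and default columns differ in length")
    if len(probs) < min_obs:
        problems.append(f"fewer than {min_obs} observations")
    if any(p is None or not 0.0 <= p <= 1.0 for p in probs):
        problems.append("probability outside [0.0, 1.0] or missing")
    if any(d not in (0, 1) for d in defaults):  # True/False also pass
        problems.append("default flag is not binary")
    # AUC is undefined unless both outcomes are present
    if len({int(d) for d in defaults if d in (0, 1)}) < 2:
        problems.append("need both defaults and non-defaults")
    return problems


clean = check_records([i / 100 for i in range(25)], [i % 2 for i in range(25)])
flawed = check_records([0.2, 1.2], [0, 2])
print(clean)
print(flawed)
```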
Summary-Level Data
- One row per risk grade or aggregated group
- Data should be ordered by risk (mean_pd column used for ordering)
- Mean probabilities: numeric values between 0.0 and 1.0
- Default counts: non-negative numbers or None (negative values are not allowed)
- Volume counts: non-negative numbers or None (negative values are not allowed)
- At least 2 risk grades with both defaults and non-defaults
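A similar check can be sketched for summary-level rows. This version reads the last requirement as "at least two grades that each contain both defaults and non-defaults", which is one interpretation of the wording. The function `check_summary_rows` and the dictionary keys are assumptions made for illustration.

```python
def check_summary_rows(rows):
    """rows: list of dicts with mean_pd, defaults, volume keys (illustrative).

    Returns a list of human-readable problems; an empty list means no issue found.
    """
    problems = []
    mixed_grades = 0  # grades holding both defaults and non-defaults
    for i, row in enumerate(rows):
        mean_pd, d, v = row["mean_pd"], row["defaults"], row["volume"]
        if mean_pd is None or not 0.0 <= mean_pd <= 1.0:
            problems.append(f"row {i}: mean_pd outside [0.0, 1.0] or missing")
        for label, value in (("defaults", d), ("volume", v)):
            if value is not None and value < 0:
                problems.append(f"row {i}: negative {label}")
        if d is not None and v is not None and 0 < d < v:
            mixed_grades += 1
    if mixed_grades < 2:
        problems.append("fewer than 2 grades with both defaults and non-defaults")
    return problems


good = check_summary_rows([
    {"mean_pd": 0.02, "defaults": 2, "volume": 100},
    {"mean_pd": 0.10, "defaults": 10, "volume": 100},
])
bad = check_summary_rows([{"mean_pd": 1.5, "defaults": -1, "volume": 100}])
print(good)
print(bad)
```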
AUC Interpretation
- AUC > 0.8: Excellent discrimination
- 0.7 < AUC ≤ 0.8: Good discrimination
- 0.6 < AUC ≤ 0.7: Acceptable discrimination
- AUC ≤ 0.6: Poor discrimination
- AUC = 0.5: No discrimination (random model); values below 0.5 indicate an inverted ranking
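For reporting, the thresholds above can be mapped to labels with a small helper. This is only a sketch: `interpret_auc` is a hypothetical function, and the cut-offs are the rule-of-thumb values from this section rather than fixed standards.

```python
def interpret_auc(auc):
    """Map an AUC value to the qualitative bands listed above (illustrative)."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0.0 and 1.0")
    if auc > 0.8:
        return "excellent"
    if auc > 0.7:
        return "good"
    if auc > 0.6:
        return "acceptable"
    return "poor"  # includes 0.5 (random) and anything below it


for value in (0.85, 0.80, 0.65, 0.50):
    print(value, interpret_auc(value))
```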
Important Notes
- Data Quality: Remove accounts with missing or invalid probability scores
- Sample Size: Larger samples provide more reliable AUC estimates
- Population Stability: AUC can vary across different populations or time periods
- Risk Ordering: For summary-level data, ensure groups are properly risk-ordered