Kolmogorov-Smirnov (KS) Statistic Metric
The kolmogorov_smirnov metric calculates the Kolmogorov-Smirnov statistic: the maximum absolute difference between the empirical cumulative distribution functions (ECDFs) of predicted scores for defaulters and for non-defaulters.
Metric Type: kolmogorov_smirnov
Kolmogorov-Smirnov Calculation
The KS statistic measures discrimination power by finding the maximum separation between distributions:
- 1.0: Perfect discrimination (complete separation)
- Between 0.0 and 1.0: Partial separation; larger values indicate stronger discrimination
- 0.0: No discrimination (identical distributions)
The metric also provides a p-value indicating the statistical significance of the observed difference.
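The calculation described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name `ks_two_sample` is hypothetical, and the p-value uses the asymptotic Kolmogorov series, which is an assumption (the metric may use an exact or differently corrected method).

```python
import math
from bisect import bisect_right


def ks_two_sample(defaulter_scores, non_defaulter_scores):
    """Return (ks_statistic, asymptotic_p_value) for two score samples."""
    bad = sorted(defaulter_scores)
    good = sorted(non_defaulter_scores)
    if not bad or not good:
        raise ValueError("both defaulters and non-defaulters are required")

    # The gap between the two ECDFs can only peak at an observed score,
    # so it is enough to evaluate both ECDFs at every distinct score.
    ks = max(
        abs(bisect_right(bad, s) / len(bad) - bisect_right(good, s) / len(good))
        for s in set(bad) | set(good)
    )

    # Assumed p-value: p = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * lam^2)
    lam = math.sqrt(len(bad) * len(good) / (len(bad) + len(good))) * ks
    if lam < 0.2:
        # The series converges slowly here and the true value is ~1.
        return ks, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return ks, min(max(p, 0.0), 1.0)


ks, p_value = ks_two_sample([0.7, 0.8, 0.9], [0.1, 0.2, 0.3])
```

With fully separated samples like these the statistic is 1.0, but the p-value stays well above 0.05 because each group has only three observations, which is why the statistic and the p-value are best read together.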
Configuration Fields
Record-Level Data Format
For individual loan/account records:
metrics:
  ks_discrimination_analysis:
    metric_type: "kolmogorov_smirnov"
    config:
      name: ["model_ks_test"]
      data_format: "record_level"
      prob_def: "model_score"  # Column with predicted probabilities (0.0-1.0)
      default: "default_flag"  # Column with default indicators (0/1 or boolean)
      segment: [["product_type"]]  # Optional: segmentation columns
      dataset: "loan_portfolio"
Summary-Level Data Format
For pre-aggregated risk-ordered data:
metrics:
  summary_ks_analysis:
    metric_type: "kolmogorov_smirnov"
    config:
      name: ["risk_grade_ks"]
      data_format: "summary_level"
      mean_pd: "avg_probability"  # Column with mean probabilities (for ordering)
      defaults: "default_count"  # Column with default counts
      volume: "total_count"  # Column with total observation counts
      segment: [["model_version"]]  # Optional: segmentation columns
      dataset: "risk_grade_summary"
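To make the role of the three summary-level columns concrete, the sketch below shows one common way to derive a KS statistic from pre-aggregated grades: order the grades by mean probability, accumulate the share of defaults and of non-defaults, and take the largest gap. The function name `ks_from_summary` and the tuple layout are hypothetical, and the metric's actual computation may differ in details such as tie handling.

```python
def ks_from_summary(rows):
    """rows: iterable of (mean_pd, defaults, volume) tuples, one per risk grade."""
    ordered = sorted(rows, key=lambda r: r[0])  # risk-order grades by mean PD
    total_bad = sum(d for _, d, _ in ordered)
    total_good = sum(v - d for _, d, v in ordered)
    if total_bad == 0 or total_good == 0:
        raise ValueError("need both defaults and non-defaults")

    cum_bad = cum_good = 0
    ks = 0.0
    for _, defaults, volume in ordered:
        cum_bad += defaults
        cum_good += volume - defaults
        # Gap between the cumulative shares of defaults and non-defaults
        ks = max(ks, abs(cum_bad / total_bad - cum_good / total_good))
    return ks


grades = [(0.01, 1, 100), (0.05, 4, 100), (0.20, 15, 100)]
ks = ks_from_summary(grades)  # about 0.446, reached after the second grade
```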
Required Fields by Format
Record-Level Required
- name: Metric name(s)
- data_format: Must be "record_level"
- prob_def: Probability column name
- default: Default indicator column name
- dataset: Dataset reference
Summary-Level Required
- name: Metric name(s)
- data_format: Must be "summary_level"
- mean_pd: Mean probability column name (used for risk ordering)
- defaults: Default count column name
- volume: Volume count column name
- dataset: Dataset reference
Optional Fields
- segment: List of column names for grouping
Output Columns
The metric produces the following output columns:
- group_key: Segmentation group identifier (struct of segment values)
- volume: Total number of observations
- defaults: Total number of defaults
- odr: Observed Default Rate (Defaults/Volume)
- pd: Mean Predicted Default probability
- ks_statistic: Kolmogorov-Smirnov statistic (0.0 to 1.0)
- ks_pvalue: P-value for the KS test
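As a rough illustration of how the descriptive output columns relate to record-level input, the sketch below derives volume, defaults, odr, and pd for a single segment. The helper name `summarize_segment` and the input layout are hypothetical; the KS columns are omitted here.

```python
def summarize_segment(records):
    """records: list of (probability, default_flag) pairs for one segment."""
    volume = len(records)
    defaults = sum(1 for _, flag in records if flag)
    return {
        "volume": volume,                                # total observations
        "defaults": defaults,                            # total defaults
        "odr": defaults / volume,                        # observed default rate
        "pd": sum(prob for prob, _ in records) / volume, # mean predicted PD
    }


row = summarize_segment([(0.1, 0), (0.2, 0), (0.6, 1), (0.3, 0)])
```

For this input the observed default rate is 0.25 while the mean predicted probability is about 0.30.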
Fan-out Examples
Multiple KS Tests
metrics:
  ks_analysis:
    metric_type: "kolmogorov_smirnov"
    config:
      name: ["portfolio_ks", "product_ks", "region_ks", "vintage_ks"]
      segment: [null, ["product_type"], ["region"], ["origination_year"]]
      data_format: "record_level"
      prob_def: "risk_score"
      default: "default_indicator"
      dataset: "model_validation_data"
This creates four KS metrics:
- Overall portfolio discrimination test
- Discrimination test by product type
- Discrimination test by region
- Discrimination test by origination vintage
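The pairing of names and segments in the fan-out example can be pictured as a positional zip. The sketch below is only an illustration of that reading: `expand_fan_out` is a hypothetical helper, and the library's real expansion rules (for example, how it treats lists of different lengths) are not described here.

```python
def expand_fan_out(config):
    """Pair each name with the segment at the same position."""
    names = config["name"]
    segments = config.get("segment") or [None] * len(names)
    if len(segments) != len(names):
        raise ValueError("name and segment must have the same length")
    # Every other field is shared by all of the expanded metrics.
    shared = {k: v for k, v in config.items() if k not in ("name", "segment")}
    return [{"name": n, "segment": s, **shared} for n, s in zip(names, segments)]


config = {
    "name": ["portfolio_ks", "product_ks", "region_ks", "vintage_ks"],
    "segment": [None, ["product_type"], ["region"], ["origination_year"]],
    "data_format": "record_level",
    "prob_def": "risk_score",
    "default": "default_indicator",
    "dataset": "model_validation_data",
}
for metric in expand_fan_out(config):
    print(metric["name"], metric["segment"])
```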
Model Comparison
metrics:
  champion_ks:
    metric_type: "kolmogorov_smirnov"
    config:
      name: ["champion_model_ks"]
      data_format: "record_level"
      prob_def: "champion_score"
      default: "default_flag"
      dataset: "ab_test_data"
  # Separate config for challenger model
  challenger_ks:
    metric_type: "kolmogorov_smirnov"
    config:
      name: ["challenger_model_ks"]
      data_format: "record_level"
      prob_def: "challenger_score"
      default: "default_flag"
      dataset: "ab_test_data"
Summary-Level Analysis
metrics:
  risk_grade_ks_analysis:
    metric_type: "kolmogorov_smirnov"
    config:
      name: ["overall_grade_ks", "product_grade_ks"]
      segment: [null, ["product_type"]]
      data_format: "summary_level"
      mean_pd: "grade_mean_pd"
      defaults: "grade_defaults"
      volume: "grade_volume"
      dataset: "risk_grade_stats"
Data Requirements
Record-Level Data
- One row per loan/account
- Probability column: numeric values between 0.0 and 1.0
- Default column: binary values (0/1 or boolean)
- Sufficient data points for meaningful KS calculation (minimum ~30 observations recommended)
- Both defaulters and non-defaulters must be present
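These record-level requirements can be checked before running the metric. The following is a minimal pre-flight sketch; `check_record_level` is a hypothetical helper, not part of the library, and the 30-row threshold simply mirrors the recommendation above.

```python
def check_record_level(rows, min_obs=30):
    """Return a list of problems found in (probability, default_flag) rows."""
    problems = []
    if len(rows) < min_obs:
        problems.append(f"only {len(rows)} rows; about {min_obs} are recommended")
    for i, (prob, flag) in enumerate(rows):
        is_number = isinstance(prob, (int, float)) and not isinstance(prob, bool)
        if not is_number or not 0.0 <= prob <= 1.0:  # NaN fails the range test too
            problems.append(f"row {i}: probability {prob!r} is outside [0.0, 1.0]")
        if flag not in (0, 1):  # True/False compare equal to 1/0
            problems.append(f"row {i}: default flag {flag!r} is not binary")
    seen = {bool(flag) for _, flag in rows if flag in (0, 1)}
    if seen != {True, False}:
        problems.append("both defaulters and non-defaulters must be present")
    return problems


rows = [(0.1, 0)] * 20 + [(0.8, 1)] * 10
print(check_record_level(rows))
```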
Summary-Level Data
- One row per risk grade or aggregated group
- Data should be ordered by risk (mean_pd column used for ordering)
- Mean probabilities: numeric values between 0.0 and 1.0
- Default counts: non-negative numbers or None (negative values are not allowed)
- Volume counts: non-negative numbers or None (negative values are not allowed)
- At least 2 risk grades with both defaults and non-defaults
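A similar pre-flight check can be sketched for summary-level data. `check_summary_level` is a hypothetical helper, and it adopts one reading of the last requirement: a grade counts toward the minimum of 2 only if it contains at least one default and at least one non-default.

```python
def check_summary_level(grades):
    """Return a list of problems found in (mean_pd, defaults, volume) rows."""
    problems = []
    mixed_grades = 0
    for i, (mean_pd, defaults, volume) in enumerate(grades):
        if mean_pd is None or not 0.0 <= mean_pd <= 1.0:
            problems.append(f"grade {i}: mean_pd {mean_pd!r} is outside [0.0, 1.0]")
        for label, count in (("defaults", defaults), ("volume", volume)):
            if count is not None and count < 0:  # None is tolerated
                problems.append(f"grade {i}: {label} is negative")
        if defaults is not None and volume is not None and 0 < defaults < volume:
            mixed_grades += 1
    if mixed_grades < 2:
        problems.append("fewer than 2 risk grades contain both defaults and non-defaults")
    return problems


print(check_summary_level([(0.01, 1, 100), (0.05, 4, 100)]))
```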
KS Statistic Interpretation
- KS > 0.4: Excellent discrimination
- 0.3 < KS ≤ 0.4: Good discrimination
- 0.2 < KS ≤ 0.3: Acceptable discrimination
- 0.0 < KS ≤ 0.2: Poor discrimination
- KS = 0.0: No discrimination
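The bands above translate directly into a small lookup. The helper name `ks_band` and its return labels are hypothetical; the thresholds are the ones listed in this section.

```python
def ks_band(ks):
    """Map a KS statistic to the interpretation bands listed above."""
    if not 0.0 <= ks <= 1.0:
        raise ValueError("KS statistic must lie in [0.0, 1.0]")
    if ks > 0.4:
        return "excellent"
    if ks > 0.3:
        return "good"
    if ks > 0.2:
        return "acceptable"
    if ks > 0.0:
        return "poor"
    return "none"


print(ks_band(0.35))
```

Because each threshold is a strict inequality, a value that sits exactly on a boundary falls into the lower band.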
P-Value Interpretation
- p < 0.001: Highly significant difference
- 0.001 ≤ p < 0.01: Very significant difference
- 0.01 ≤ p < 0.05: Significant difference
- p ≥ 0.05: No statistically significant difference at the 5% level
Important Notes
- Data Quality: Remove accounts with missing or invalid probability scores
- Sample Size: Larger samples provide more reliable KS estimates and p-values
- Population Stability: KS can vary across different populations or time periods
- Risk Ordering: For summary-level data, ensure groups are properly risk-ordered
- Statistical Significance: Consider both KS statistic magnitude and p-value
- Complementary to AUC: KS and AUC capture different aspects of discrimination; use both for a comprehensive assessment
Comparison with AUC
Both KS and AUC measure discrimination, but they summarize it differently:
- KS: Measures the maximum separation between the two score distributions at a single point, so it is more sensitive to local differences
- AUC: Measures overall ranking ability across all thresholds, so it is a more robust overall measure
Use both metrics together for a complete discrimination assessment.
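To see the two measures side by side, the sketch below computes both from the same scores in plain Python. It is an illustration only: `ks_and_auc` is a hypothetical helper, and AUC is computed here as the probability that a defaulter scores higher than a non-defaulter, with ties counted as one half.

```python
from bisect import bisect_left, bisect_right


def ks_and_auc(scores, flags):
    """Return (ks, auc) for scores with binary default flags."""
    bad = sorted(s for s, f in zip(scores, flags) if f)
    good = sorted(s for s, f in zip(scores, flags) if not f)
    if not bad or not good:
        raise ValueError("both defaulters and non-defaulters are required")

    # KS: largest gap between the two ECDFs, evaluated at every observed score
    ks = max(
        abs(bisect_right(bad, s) / len(bad) - bisect_right(good, s) / len(good))
        for s in set(scores)
    )

    # AUC: for each defaulter, count non-defaulters scored lower (ties count 0.5)
    wins = sum(
        bisect_left(good, b) + 0.5 * (bisect_right(good, b) - bisect_left(good, b))
        for b in bad
    )
    auc = wins / (len(bad) * len(good))
    return ks, auc


scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
flags = [0, 0, 1, 0, 0, 1, 1, 1]
ks, auc = ks_and_auc(scores, flags)
```

In this toy sample the KS statistic is 0.75, driven entirely by the gap at a score of 0.6, while the AUC of 0.875 averages ranking quality over every defaulter/non-defaulter pair.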