Kolmogorov-Smirnov (KS) Statistic Metric

The kolmogorov_smirnov metric calculates the Kolmogorov-Smirnov (KS) statistic: the maximum absolute difference between the empirical cumulative distribution functions of the predicted scores for defaulters and for non-defaulters.

Metric Type: kolmogorov_smirnov

Kolmogorov-Smirnov Calculation

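The KS statistic measures discriminatory power as the maximum separation between the score distributions of defaulters and non-defaulters. It ranges from 0.0 to 1.0:

  • 1.0: Perfect discrimination (complete separation)
  • Between 0.0 and 1.0: Partial separation (see KS Statistic Interpretation for rule-of-thumb bands)
  • 0.0: No discrimination (identical distributions)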

The metric also provides a p-value indicating the statistical significance of the observed difference.
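
As a rough illustration of the calculation described above, the following pure-Python sketch computes the KS statistic from record-level data. The function name `ks_statistic` and its arguments are hypothetical and are not part of this metric's implementation, which may differ in details such as tie handling.

```python
def ks_statistic(scores, defaults):
    """Maximum absolute gap between the empirical CDFs of the scores
    for defaulters and for non-defaulters (illustrative sketch only)."""
    bad = sorted(s for s, d in zip(scores, defaults) if d)
    good = sorted(s for s, d in zip(scores, defaults) if not d)
    if not bad or not good:
        raise ValueError("need at least one defaulter and one non-defaulter")

    i = j = 0          # number of bad / good scores <= current score value
    max_gap = 0.0
    # Evaluate both empirical CDFs at every distinct score value.
    for x in sorted(set(bad) | set(good)):
        while i < len(bad) and bad[i] <= x:
            i += 1
        while j < len(good) and good[j] <= x:
            j += 1
        gap = abs(i / len(bad) - j / len(good))
        max_gap = max(max_gap, gap)
    return max_gap
```

For example, `ks_statistic([0.1, 0.4, 0.6, 0.9], [0, 1, 0, 1])` returns 0.5, while fully separated scores give 1.0.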

Configuration Fields

Record-Level Data Format

For individual loan/account records:

collections:
  ks_discrimination_analysis:
    metrics:
    - name:
      - model_ks_test
      data_format: record
      prob_def: model_score
      default: default_flag
      segment:
      - - product_type
      metric_type: kolmogorov_smirnov
    dataset: loan_portfolio

Summary-Level Data Format

For pre-aggregated risk-ordered data:

collections:
  summary_ks_analysis:
    metrics:
    - name:
      - risk_grade_ks
      data_format: summary
      mean_pd: avg_probability
      defaults: default_count
      volume: total_count
      segment:
      - - model_version
      metric_type: kolmogorov_smirnov
    dataset: risk_grade_summary

Required Fields by Format

Record-Level Required

  • name: Metric name(s)
  • data_format: Must be "record"
  • prob_def: Probability column name
  • default: Default indicator column name
  • dataset: Dataset reference

Summary-Level Required

  • name: Metric name(s)
  • data_format: Must be "summary"
  • mean_pd: Mean probability column name (used for risk ordering)
  • defaults: Default count column name
  • volume: Volume count column name
  • dataset: Dataset reference

Optional Fields

  • segment: List of segment definitions; each entry is a list of column names to group by, or null for no grouping

Output Columns

The metric produces the following output columns:

  • group_key: Segmentation group identifier (struct of segment values)
  • volume: Total number of observations
  • defaults: Total number of defaults
  • odr: Observed default rate (defaults / volume)
  • pd: Mean predicted probability of default
  • ks_statistic: Kolmogorov-Smirnov statistic (0.0 to 1.0)
  • ks_p_value: P-value for the KS test
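
To make the descriptive output columns concrete, here is a minimal sketch of how volume, defaults, odr, and pd could be derived per segment from record-level rows. The helper `summarize` is hypothetical; the column names are borrowed from the record-level configuration example, and the actual output uses a struct for group_key rather than the tuple used here.

```python
from collections import defaultdict


def summarize(records, segment_cols,
              prob_col="model_score", default_col="default_flag"):
    """Group rows by the segment columns and compute the descriptive
    output columns for each group (illustrative sketch only)."""
    groups = defaultdict(list)
    for row in records:
        key = tuple(row[c] for c in segment_cols)  # stands in for group_key
        groups[key].append(row)

    out = {}
    for key, rows in groups.items():
        volume = len(rows)
        defaults = sum(1 for r in rows if r[default_col])
        out[key] = {
            "volume": volume,
            "defaults": defaults,
            "odr": defaults / volume,                          # observed default rate
            "pd": sum(r[prob_col] for r in rows) / volume,     # mean predicted PD
        }
    return out
```

Passing an empty list for `segment_cols` puts every row into a single group keyed by the empty tuple, which corresponds to an unsegmented metric.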

Fan-out Examples

Multiple KS Tests

collections:
  ks_analysis:
    metrics:
    - name:
      - portfolio_ks
      - product_ks
      - region_ks
      - vintage_ks
      segment:
      - null
      - - product_type
      - - region
      - - origination_year
      data_format: record
      prob_def: risk_score
      default: default_indicator
      metric_type: kolmogorov_smirnov
    dataset: model_validation_data

This creates four KS metrics:

  1. Overall portfolio discrimination test
  2. Discrimination test by product type
  3. Discrimination test by region
  4. Discrimination test by origination vintage
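
The expansion above can be pictured with the following sketch. It assumes that each entry in name is paired with the segment entry at the same position, which is inferred from the example rather than confirmed by the source; `expand_fan_out` is a hypothetical helper, not the tool's API.

```python
def expand_fan_out(metric):
    """Expand one fan-out metric definition into one dict per name.

    Assumption: name[i] is paired with segment[i]; a null (None)
    segment means the metric is computed without grouping.
    """
    names = metric["name"]
    segments = metric.get("segment") or [None] * len(names)
    if len(names) != len(segments):
        raise ValueError("name and segment must have the same length")

    # Every other field is shared by all expanded metrics.
    shared = {k: v for k, v in metric.items() if k not in ("name", "segment")}
    return [
        {"name": name, "segment": seg or [], **shared}
        for name, seg in zip(names, segments)
    ]
```

Applied to the configuration above, this yields four definitions that share data_format, prob_def, default, and metric_type and differ only in name and segment.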

Model Comparison

collections:
  champion_ks:
    metrics:
    - name:
      - champion_model_ks
      data_format: record
      prob_def: champion_score
      default: default_flag
      metric_type: kolmogorov_smirnov
    dataset: ab_test_data
  challenger_ks:
    metrics:
    - name:
      - challenger_model_ks
      data_format: record
      prob_def: challenger_score
      default: default_flag
      metric_type: kolmogorov_smirnov
    dataset: ab_test_data

Summary-Level Analysis

collections:
  risk_grade_ks_analysis:
    metrics:
    - name:
      - overall_grade_ks
      - product_grade_ks
      segment:
      - null
      - - product_type
      data_format: summary
      mean_pd: grade_mean_pd
      defaults: grade_defaults
      volume: grade_volume
      metric_type: kolmogorov_smirnov
    dataset: risk_grade_stats

Data Requirements

Record-Level Data

  • One row per loan/account
  • Probability column: numeric values between 0.0 and 1.0
  • Default column: binary values (0/1 or boolean)
  • Sufficient data points for meaningful KS calculation (minimum ~30 observations recommended)
  • Both defaulters and non-defaulters must be present
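
A pre-flight check along these lines can catch violations of the requirements above before the metric runs. This is a sketch only: `check_record_data` and its messages are hypothetical, and the threshold of 30 simply mirrors the recommendation in the list.

```python
def check_record_data(scores, defaults, min_obs=30):
    """Return a list of problems found in record-level inputs
    (an empty list means the basic requirements are met)."""
    problems = []
    if len(scores) != len(defaults):
        problems.append("scores and defaults differ in length")
    if len(scores) < min_obs:
        problems.append(f"fewer than {min_obs} observations")
    # Probabilities must be numeric and lie in [0.0, 1.0]; NaN fails the range test.
    if any(not isinstance(s, (int, float)) or not 0.0 <= s <= 1.0 for s in scores):
        problems.append("probability values missing or outside [0.0, 1.0]")
    # Default flags must be binary (0/1 or boolean).
    if any(d not in (0, 1) for d in defaults):
        problems.append("default values are not binary")
    n_bad = sum(1 for d in defaults if d == 1)
    n_good = sum(1 for d in defaults if d == 0)
    if n_bad == 0 or n_good == 0:
        problems.append("both defaulters and non-defaulters must be present")
    return problems
```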

Summary-Level Data

  • One row per risk grade or aggregated group
  • Data should be ordered by risk (mean_pd column used for ordering)
  • Mean probabilities: numeric values between 0.0 and 1.0
  • Default counts: non-negative numbers or None (negative values are not allowed)
  • Volume counts: non-negative numbers or None (negative values are not allowed)
  • At least 2 risk grades with both defaults and non-defaults
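
For summary-level data, one plausible way to obtain the KS statistic is to order the grades by mean PD and compare the cumulative share of defaults with the cumulative share of non-defaults. The sketch below assumes that volume includes the defaults (consistent with odr being defaults / volume) and that counts are not None; `ks_from_summary` is a hypothetical name and the metric's own calculation may differ.

```python
def ks_from_summary(grades):
    """grades: iterable of (mean_pd, defaults, volume) tuples, one per risk grade.

    Returns the maximum gap between the cumulative share of defaults and
    the cumulative share of non-defaults, walking grades in mean_pd order.
    """
    rows = sorted(grades, key=lambda g: g[0])        # risk ordering by mean_pd
    total_bad = sum(d for _, d, _ in rows)
    total_good = sum(v - d for _, d, v in rows)      # non-defaults = volume - defaults
    if total_bad == 0 or total_good == 0:
        raise ValueError("need both defaults and non-defaults")

    cum_bad = cum_good = 0
    max_gap = 0.0
    for _, d, v in rows:
        cum_bad += d
        cum_good += v - d
        gap = abs(cum_bad / total_bad - cum_good / total_good)
        max_gap = max(max_gap, gap)
    return max_gap
```

With three grades `(0.01, 1, 100)`, `(0.05, 4, 100)`, and `(0.20, 15, 100)`, the largest gap occurs after the second grade: 195/280 of non-defaults versus 5/20 of defaults, a KS of about 0.446.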

KS Statistic Interpretation

  • KS > 0.4: Excellent discrimination
  • 0.3 < KS ≤ 0.4: Good discrimination
  • 0.2 < KS ≤ 0.3: Acceptable discrimination
  • 0.0 < KS ≤ 0.2: Poor discrimination
  • KS = 0.0: No discrimination
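
These bands translate directly into a small lookup. The function name `ks_band` and the returned labels are illustrative choices, not output of the metric.

```python
def ks_band(ks):
    """Map a KS statistic to the rule-of-thumb band listed above."""
    if not 0.0 <= ks <= 1.0:
        raise ValueError("KS statistic must lie in [0.0, 1.0]")
    if ks > 0.4:
        return "excellent"
    if ks > 0.3:
        return "good"
    if ks > 0.2:
        return "acceptable"
    if ks > 0.0:
        return "poor"
    return "none"
```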

P-Value Interpretation

  • p < 0.001: Highly significant difference
  • 0.001 ≤ p < 0.01: Very significant difference
  • 0.01 ≤ p < 0.05: Significant difference
  • p ≥ 0.05: No statistically significant difference at the 5% level
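
The source does not state how ks_p_value is computed. As an assumption for illustration, the sketch below uses the standard large-sample (asymptotic) Kolmogorov series for the two-sample test, then maps the result to the bands above. Both function names are hypothetical, and an exact method would give somewhat different values for small samples.

```python
import math


def ks_p_value_asymptotic(ks, n_bad, n_good, tol=1e-12, max_terms=100_000):
    """Asymptotic two-sample KS p-value:
    p = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * lam^2),
    with lam = ks * sqrt(n_bad * n_good / (n_bad + n_good))."""
    if n_bad <= 0 or n_good <= 0:
        raise ValueError("both groups must be non-empty")
    lam = ks * math.sqrt(n_bad * n_good / (n_bad + n_good))
    if lam == 0.0:
        return 1.0                      # identical distributions
    total = 0.0
    for k in range(1, max_terms + 1):
        term = math.exp(-2.0 * (k * lam) ** 2)
        total += term if k % 2 == 1 else -term
        if term < tol:                  # remaining terms are negligible
            break
    return min(1.0, max(0.0, 2.0 * total))


def significance_label(p):
    """Map a p-value to the interpretation bands listed above."""
    if p < 0.001:
        return "highly significant"
    if p < 0.01:
        return "very significant"
    if p < 0.05:
        return "significant"
    return "not significant"
```

For instance, a KS of 0.2 with 50 defaulters and 50 non-defaulters gives a p-value of roughly 0.27 under this approximation, which would not be significant at the 5% level.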

Important Notes

  1. Data Quality: Remove accounts with missing or invalid probability scores
  2. Sample Size: Larger samples provide more reliable KS estimates and p-values
  3. Population Stability: KS can vary across different populations or time periods
  4. Risk Ordering: For summary-level data, ensure groups are properly risk-ordered
  5. Statistical Significance: Consider both the magnitude of the KS statistic and its p-value; with large samples, even a small KS can be statistically significant
  6. Complementary to AUC: KS and AUC capture different aspects of discrimination; use both for a comprehensive assessment

Comparison with AUC

Both KS and AUC measure discrimination, but they summarize it differently:

  • KS: Measures the maximum separation between the two distributions at a single point, which makes it more sensitive to local differences
  • AUC: Measures overall ranking ability across all thresholds, which makes it a more robust overall measure

Use both metrics together for a complete discrimination assessment.
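
To see the two measures side by side, the following sketch computes both from the same pair of score lists. The helpers `ks` and `auc` are hypothetical, use simple quadratic-time definitions for clarity, and are not how this library computes either metric.

```python
def ks(bad, good):
    """Largest gap between the two empirical CDFs, checked at every score."""
    points = sorted(set(bad) | set(good))
    return max(
        abs(sum(b <= x for b in bad) / len(bad)
            - sum(g <= x for g in good) / len(good))
        for x in points
    )


def auc(bad, good):
    """Probability that a random defaulter scores above a random
    non-defaulter, counting ties as one half."""
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))


bad_scores = [0.4, 0.9]    # scores of defaulters
good_scores = [0.1, 0.6]   # scores of non-defaulters
print(ks(bad_scores, good_scores), auc(bad_scores, good_scores))
```

On this toy input KS is 0.5 and AUC is 0.75: KS reports the single largest gap, while AUC averages over every defaulter and non-defaulter pair.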