Accuracy Metrics
Accuracy metrics for model validation and performance assessment.
accuracy
This module provides helper functions for accuracy-related statistical metrics.
default_accuracy
default_accuracy(
    name: str,
    dataset: LazyFrame | DataFrame,
    data_format: Literal["record_level", "summary_level"],
    **kwargs: Any,
) -> pl.DataFrame
Calculate default accuracy for record-level or summary-level data.
Record-level usage (data_format="record_level"): Required parameters: prob_def, default
Summary-level usage (data_format="summary_level"): Required parameters: mean_pd, defaults, volume
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the metric. | required |
| dataset | LazyFrame \| DataFrame | Dataset to compute the default accuracy on. | required |
| data_format | Literal['record_level', 'summary_level'] | Format of the input data ("record_level" or "summary_level"). | required |
| **kwargs | Any | Additional keyword arguments specific to the data format. For record_level: prob_def (str), default (str), segment (optional). For summary_level: mean_pd (str), defaults (str), volume (str), segment (optional). | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame containing default accuracy metrics for each group. |
Examples:
Record-level usage:
result = default_accuracy(
    name="model_accuracy",
    dataset=df,
    data_format="record_level",
    prob_def="probability",
    default="default_flag"
)
Summary-level usage:
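As a rough illustration of how the two data formats relate, the sketch below collapses record-level rows into summary-level rows. It assumes that `mean_pd` is the average of `prob_def`, `defaults` is the count of default flags, and `volume` is the row count per segment; those semantics are inferred from the parameter names, and `summarise_records` is a hypothetical helper, not part of this module.

```python
from collections import defaultdict


def summarise_records(records, prob_def, default, segment):
    """Collapse record-level rows into one summary row per segment value.

    Assumption: mean_pd = mean of prob_def, defaults = sum of the default
    flag, volume = number of rows. This is an illustrative helper only.
    """
    totals = defaultdict(lambda: {"pd_sum": 0.0, "defaults": 0, "volume": 0})
    for row in records:
        bucket = totals[row[segment]]
        bucket["pd_sum"] += row[prob_def]
        bucket["defaults"] += int(row[default])
        bucket["volume"] += 1
    return [
        {
            segment: key,
            "mean_pd": b["pd_sum"] / b["volume"],
            "defaults": b["defaults"],
            "volume": b["volume"],
        }
        for key, b in sorted(totals.items())
    ]


# Made-up record-level rows using the column names from the example above.
records = [
    {"grade": "A", "probability": 0.02, "default_flag": 0},
    {"grade": "A", "probability": 0.04, "default_flag": 1},
    {"grade": "B", "probability": 0.10, "default_flag": 1},
]
summary = summarise_records(records, "probability", "default_flag", "grade")
```

Each resulting row carries the three columns that the summary-level format lists as required (`mean_pd`, `defaults`, `volume`).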
ead_accuracy
ead_accuracy(
    name: str,
    dataset: LazyFrame | DataFrame,
    data_format: Literal["record_level", "summary_level"],
    predicted_ead: str,
    actual_ead: str,
    **kwargs: Any,
) -> pl.DataFrame
Calculate EAD accuracy for record-level or summary-level data.
Record-level usage (data_format="record_level"): Required parameters: default
Summary-level usage (data_format="summary_level"): Required parameters: defaults, volume
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the metric. | required |
| dataset | LazyFrame \| DataFrame | Dataset to compute the EAD accuracy on. | required |
| data_format | Literal['record_level', 'summary_level'] | Format of the input data ("record_level" or "summary_level"). | required |
| predicted_ead | str | Column containing predicted EAD values. | required |
| actual_ead | str | Column containing actual EAD values. | required |
| **kwargs | Any | Additional keyword arguments specific to the data format. For record_level: default (str), segment (optional). For summary_level: defaults (str), volume (str), segment (optional). | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame containing EAD accuracy metrics for each group. |
Examples:
Record-level usage:
result = ead_accuracy(
    name="ead_model_accuracy",
    dataset=df,
    data_format="record_level",
    predicted_ead="predicted_ead",
    actual_ead="actual_ead",
    default="default_flag"
)
Summary-level usage:
hosmer_lemeshow
hosmer_lemeshow(
    name: str,
    dataset: LazyFrame | DataFrame,
    data_format: Literal["record_level", "summary_level"],
    **kwargs: Any,
) -> pl.DataFrame
Calculate the Hosmer-Lemeshow metric for record-level or summary-level data.
Record-level usage (data_format="record_level"): Required parameters: prob_def, default
Summary-level usage (data_format="summary_level"): Required parameters: mean_pd, defaults, volume
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the metric. | required |
| dataset | LazyFrame \| DataFrame | Dataset to compute the Hosmer-Lemeshow test on. | required |
| data_format | Literal['record_level', 'summary_level'] | Format of the input data ("record_level" or "summary_level"). | required |
| **kwargs | Any | Additional keyword arguments specific to the data format. For record_level: prob_def (str), default (str), bands (int, default=10), segment (optional). For summary_level: mean_pd (str), defaults (str), volume (str), bands (int, default=10), segment (optional). | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame containing the Hosmer-Lemeshow test result and associated metadata. |
Examples:
Record-level usage:
result = hosmer_lemeshow(
    name="hl_test",
    dataset=df,
    data_format="record_level",
    prob_def="probability",
    default="default_flag",
    bands=10
)
Summary-level usage:
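To make the summary-level inputs concrete, here is a minimal sketch of the textbook Hosmer-Lemeshow chi-square statistic computed from per-band `mean_pd`, `defaults`, and `volume` values. It assumes the standard formulation (squared difference between observed and expected defaults, divided by the binomial variance, summed over bands); whether this module computes exactly this is not stated here, and `hosmer_lemeshow_statistic` is a hypothetical name.

```python
def hosmer_lemeshow_statistic(bands):
    """Textbook Hosmer-Lemeshow statistic.

    bands: iterable of (mean_pd, defaults, volume) tuples, one per band.
    Assumes 0 < mean_pd < 1 for every band so the variance is non-zero.
    """
    stat = 0.0
    for mean_pd, defaults, volume in bands:
        expected = volume * mean_pd                     # expected defaults
        variance = volume * mean_pd * (1.0 - mean_pd)   # binomial variance
        stat += (defaults - expected) ** 2 / variance
    return stat


# Two made-up bands: (mean PD, observed defaults, number of records).
bands = [(0.1, 12, 100), (0.2, 18, 100)]
hl = hosmer_lemeshow_statistic(bands)
```

A p-value would normally be read from a chi-square distribution; the degrees-of-freedom convention varies between references, so it is left out of this sketch.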
jeffreys_test
jeffreys_test(
    name: str,
    dataset: LazyFrame | DataFrame,
    data_format: Literal["record_level", "summary_level"],
    **kwargs: Any,
) -> pl.DataFrame
Calculate the Jeffreys test metric for record-level or summary-level data.
Record-level usage (data_format="record_level"): Required parameters: prob_def, default
Summary-level usage (data_format="summary_level"): Required parameters: mean_pd, defaults, volume
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the metric. | required |
| dataset | LazyFrame \| DataFrame | Dataset to compute the Jeffreys test on. | required |
| data_format | Literal['record_level', 'summary_level'] | Format of the input data ("record_level" or "summary_level"). | required |
| **kwargs | Any | Additional keyword arguments specific to the data format. For record_level: prob_def (str), default (str), segment (optional). For summary_level: mean_pd (str), defaults (str), volume (str), segment (optional). | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame containing the Jeffreys test result and associated metadata. |
Examples:
Record-level usage:
result = jeffreys_test(
    name="jeffreys_test",
    dataset=df,
    data_format="record_level",
    prob_def="probability",
    default="default_flag"
)
Summary-level usage:
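For orientation, the following standard-library sketch shows one common formulation of the Jeffreys test on summary-level inputs: with `defaults` out of `volume` observations and a Jeffreys prior, the posterior for the default rate is Beta(defaults + 0.5, volume - defaults + 0.5), and the p-value is the posterior probability that the true rate is at most `mean_pd`. This formulation is an assumption about what the metric represents, and both function names are hypothetical.

```python
from math import exp, lgamma, log


def _beta_continued_fraction(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (Lentz's method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """CDF of a Beta(a, b) distribution evaluated at x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (lgamma(a + b) - lgamma(a) - lgamma(b)
                 + a * log(x) + b * log(1.0 - x))
    front = exp(log_front)
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


def jeffreys_p_value(mean_pd, defaults, volume):
    """Posterior probability that the true default rate is <= mean_pd."""
    return regularized_incomplete_beta(defaults + 0.5,
                                       volume - defaults + 0.5,
                                       mean_pd)


# Made-up portfolio: predicted PD of 5%, 5 defaults out of 100 records.
p_value = jeffreys_p_value(mean_pd=0.05, defaults=5, volume=100)
```

Under this formulation a small p-value suggests the predicted PD underestimates the observed default rate.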
rmse
rmse(
    name: str,
    dataset: LazyFrame | DataFrame,
    data_format: Literal["record_level", "summary_level"],
    **kwargs: Any,
) -> pl.DataFrame
Calculate Root Mean Squared Error (RMSE) for record-level or summary-level data.
Record-level usage (data_format="record_level"): Required parameters: observed, predicted
Summary-level usage (data_format="summary_level"): Required parameters: volume, sum_squared_errors
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the metric. | required |
| dataset | LazyFrame \| DataFrame | Dataset to compute the RMSE on. | required |
| data_format | Literal['record_level', 'summary_level'] | Format of the input data ("record_level" or "summary_level"). | required |
| **kwargs | Any | Additional keyword arguments specific to the data format. For record_level: observed (str), predicted (str), segment (optional). For summary_level: volume (str), sum_squared_errors (str), segment (optional). | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame containing RMSE metrics for each group. |
Examples:
Record-level usage:
result = rmse(
    name="model_rmse",
    dataset=df,
    data_format="record_level",
    observed="observed_values",
    predicted="predicted_values"
)
Summary-level usage:
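The relationship between the two formats can be sketched in plain Python. This assumes that `sum_squared_errors` is the sum of squared differences between observed and predicted values and `volume` is the number of records, so that RMSE is the square root of their ratio; the helper names below are hypothetical and independent of this module.

```python
import math


def rmse_record_level(observed, predicted):
    """RMSE computed directly from paired observations."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_summary_level(volume, sum_squared_errors):
    """RMSE computed from pre-aggregated totals (assumed semantics)."""
    return math.sqrt(sum_squared_errors / volume)


observed = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.0]

# Pre-aggregate the same data the way a summary-level row would hold it.
volume = len(observed)
sum_squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))

from_records = rmse_record_level(observed, predicted)
from_summary = rmse_summary_level(volume, sum_squared_errors)
```

Both paths give the same value, which is why a pre-aggregated row with only `volume` and `sum_squared_errors` is enough to compute the metric.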
mape
mape(
    name: str,
    dataset: LazyFrame | DataFrame,
    data_format: Literal["record_level", "summary_level"],
    **kwargs: Any,
) -> pl.DataFrame
Calculate Mean Absolute Percentage Error (MAPE) for record-level or summary-level data.
Record-level usage (data_format="record_level"): Required parameters: observed, predicted
Summary-level usage (data_format="summary_level"): Required parameters: volume, sum_absolute_percentage_errors
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the metric. | required |
| dataset | LazyFrame \| DataFrame | Dataset to compute the MAPE on. | required |
| data_format | Literal['record_level', 'summary_level'] | Format of the input data ("record_level" or "summary_level"). | required |
| **kwargs | Any | Additional keyword arguments specific to the data format. For record_level: observed (str), predicted (str), segment (optional). For summary_level: volume (str), sum_absolute_percentage_errors (str), segment (optional). | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame containing MAPE metrics for each group. |
Examples:
Record-level usage:
result = mape(
    name="model_mape",
    dataset=df,
    data_format="record_level",
    observed="observed_values",
    predicted="predicted_values"
)
Summary-level usage:
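A small sketch of how the summary-level columns could be derived and used. It assumes each absolute percentage error is |observed - predicted| / |observed|, expressed as a fraction rather than multiplied by 100, and that `sum_absolute_percentage_errors` is the sum of those values; the scaling this module actually uses is not stated here, and the helper names are hypothetical.

```python
def absolute_percentage_errors(observed, predicted):
    """Per-record |observed - predicted| / |observed| (observed must be non-zero)."""
    return [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]


def mape_summary_level(volume, sum_absolute_percentage_errors):
    """MAPE from pre-aggregated totals (assumed semantics)."""
    return sum_absolute_percentage_errors / volume


observed = [100.0, 200.0]
predicted = [110.0, 180.0]

apes = absolute_percentage_errors(observed, predicted)
mape_from_records = sum(apes) / len(apes)
mape_from_summary = mape_summary_level(len(apes), sum(apes))
```

Records with an observed value of zero would need separate handling, since the percentage error is undefined there.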
ttest
ttest(
    name: str,
    dataset: LazyFrame | DataFrame,
    data_format: Literal["record_level", "summary_level"],
    **kwargs: Any,
) -> pl.DataFrame
Calculate T-test statistics for record-level or summary-level data.
Performs a one-sample t-test to determine if the mean difference between observed and predicted values is significantly different from a null hypothesis mean.
Record-level usage (data_format="record_level"): Required parameters: observed, predicted. Optional parameters: null_hypothesis_mean (default: 0.0).
Summary-level usage (data_format="summary_level"): Required parameters: volume, sum_differences, sum_squared_differences. Optional parameters: null_hypothesis_mean (default: 0.0).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the metric. | required |
| dataset | LazyFrame \| DataFrame | Dataset to compute the T-test on. | required |
| data_format | Literal['record_level', 'summary_level'] | Format of the input data ("record_level" or "summary_level"). | required |
| **kwargs | Any | Additional keyword arguments specific to the data format. For record_level: observed (str), predicted (str), null_hypothesis_mean (float), segment (optional). For summary_level: volume (str), sum_differences (str), sum_squared_differences (str), null_hypothesis_mean (float), segment (optional). | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame containing T-test statistics for each group. |
Examples:
Record-level usage:
result = ttest(
    name="model_ttest",
    dataset=df,
    data_format="record_level",
    observed="observed_values",
    predicted="predicted_values"
)
Summary-level usage:
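The three summary-level columns are sufficient for a one-sample t statistic, as the sketch below shows. It assumes the differences are observed minus predicted, that `sum_differences` and `sum_squared_differences` are their plain and squared sums, and that the sample variance uses the usual n - 1 denominator; `t_statistic_from_summary` is a hypothetical helper, not this module's API.

```python
import math
import statistics


def t_statistic_from_summary(volume, sum_differences, sum_squared_differences,
                             null_hypothesis_mean=0.0):
    """One-sample t statistic from aggregated totals (requires volume >= 2)."""
    mean_diff = sum_differences / volume
    # Sample variance via the sum-of-squares identity, n - 1 denominator.
    variance = (sum_squared_differences - volume * mean_diff ** 2) / (volume - 1)
    standard_error = math.sqrt(variance / volume)
    return (mean_diff - null_hypothesis_mean) / standard_error


# Made-up differences (observed - predicted) for four records.
differences = [1.0, 2.0, 3.0, 4.0]

t_summary = t_statistic_from_summary(
    volume=len(differences),
    sum_differences=sum(differences),
    sum_squared_differences=sum(d * d for d in differences),
)

# Cross-check against the record-level calculation.
t_records = statistics.mean(differences) / (
    statistics.stdev(differences) / math.sqrt(len(differences))
)
```

The statistic has `volume - 1` degrees of freedom under the usual one-sample t-test assumptions.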
binomial_test
binomial_test(
    name: str,
    dataset: LazyFrame | DataFrame,
    data_format: Literal["record_level", "summary_level"],
    **kwargs: Any,
) -> pl.DataFrame
Calculate binomial test for record-level or summary-level data.
The binomial test is used to test whether an observed proportion of defaults significantly differs from an expected probability under the null hypothesis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name identifier for the metric calculation. | required |
| dataset | LazyFrame \| DataFrame | The input data as a Polars LazyFrame or DataFrame. | required |
| data_format | Literal['record_level', 'summary_level'] | Format of the input data, either "record_level" or "summary_level". | required |
| **kwargs | Any | Additional keyword arguments specific to the data format. | {} |
Record-level format kwargs
- default: Column name containing binary default indicators (0/1 or boolean).
- expected_probability: Expected probability of default under the null hypothesis (0.0-1.0).
- segment: Optional list of column names to group by for segmented analysis.
Summary-level format kwargs
- volume: Column name containing the total number of observations.
- defaults: Column name containing the number of defaults.
- expected_probability: Expected probability of default under the null hypothesis (0.0-1.0).
- segment: Optional list of column names to group by for segmented analysis.
Returns:
| Type | Description |
|---|---|
| DataFrame | A DataFrame containing binomial test results with the columns listed below. |

- group_key: Grouping information (struct of segment columns)
- volume: Total number of observations
- defaults: Number of observed defaults
- observed_probability: Observed default rate
- expected_probability: Expected default rate under the null hypothesis
- p_value: Two-tailed p-value from the binomial test
Examples:
Record-level data:
binomial_test(
    name="default_rate_test",
    dataset=data,
    data_format="record_level",
    default="default_flag",
    expected_probability=0.05
)
Summary-level data:
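To show what the summary-level inputs feed into, here is a minimal exact two-tailed binomial test in plain Python. It uses one common definition of the two-tailed p-value (sum the probabilities of all outcomes that are no more likely than the observed one); whether this module uses the same definition is an assumption, and `binomial_two_sided_p` is a hypothetical name.

```python
from math import comb


def binomial_two_sided_p(defaults, volume, expected_probability):
    """Exact two-tailed binomial p-value for small volumes.

    Sums the probability of every outcome whose likelihood under the null
    hypothesis does not exceed that of the observed number of defaults.
    """
    def pmf(k):
        return (comb(volume, k)
                * expected_probability ** k
                * (1.0 - expected_probability) ** (volume - k))

    observed = pmf(defaults)
    # Small relative tolerance so ties are not lost to rounding.
    threshold = observed * (1.0 + 1e-7)
    total = sum(pmf(k) for k in range(volume + 1) if pmf(k) <= threshold)
    return min(1.0, total)


# Made-up summary row: 0 defaults out of 10 records, expected rate 0.5.
p_value = binomial_two_sided_p(defaults=0, volume=10, expected_probability=0.5)
observed_probability = 0 / 10
```

This direct summation is only practical for small volumes; for large portfolios a numerically stable library routine would be the safer choice.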