Normality Testing Metrics

This module provides statistical tests for assessing whether data follows a normal distribution, which is a key assumption for many statistical methods and model validation techniques.

Available Functions

Shapiro-Wilk Test

The metric is exposed as a module attribute, created by wrapping the test function with collect_metric:

shapiro_wilk = collect_metric(shapiro_wilk)

Usage Examples

Record-Level Data

Test normality of individual observations:

import polars as pl
from tnp_statistic_library.metrics.normality import shapiro_wilk

# Create sample data
df = pl.DataFrame({
    "residuals": [0.1, -0.2, 0.05, 0.3, -0.1, 0.2, -0.05, 0.15],
    "model_version": ["v1", "v1", "v1", "v1", "v2", "v2", "v2", "v2"]
})

# Test normality of residuals by model version
result = shapiro_wilk(
    data=df,
    data_format="record",
    data_column="residuals",
    segment=["model_version"]
)

print(result)

Summary-Level Data

Work with pre-computed normality test statistics:

# Pre-aggregated normality test results
df_summary = pl.DataFrame({
    "volume": [100, 150, 80],
    "statistic": [0.95, 0.88, 0.92],
    "p_value": [0.08, 0.02, 0.05],
    "data_source": ["training", "validation", "test"]
})

# Aggregate normality results across data sources
result = shapiro_wilk(
    data=df_summary,
    data_format="summary",
    volume="volume",
    statistic="statistic",
    p_value="p_value",
    segment=["data_source"]
)

Data Format Requirements

Record-Level Data

For testing normality of individual observations:

  • data_column: Column containing numeric values to test for normality
  • Optional: segment columns for group-wise testing
  • Minimum: 3 observations per group (Shapiro-Wilk requirement)
  • Maximum: 5000 observations per group (scipy limitation)

Summary-Level Data

For aggregating pre-computed test statistics:

  • volume: Number of observations used in the original test
  • statistic: The W statistic from Shapiro-Wilk test (0.0-1.0)
  • p_value: The p-value from the normality test (0.0-1.0)
  • Optional: segment columns for group identification

Output Columns

All normality functions return:

  • group_key: Segmentation group identifier (struct of segment values)
  • volume: Number of observations tested
  • statistic: Test statistic value (the W statistic; values closer to 1.0 indicate more normal-like data)
  • p_value: Statistical significance of the test
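
Because group_key is a struct, it is often convenient to flatten it into ordinary columns before filtering. A minimal sketch, assuming the result DataFrame from the record-level example above:

flat = result.unnest("group_key")  # one column per segment value
print(flat)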

Interpretation Guidelines

Shapiro-Wilk Test Results

  • Null Hypothesis (H0): The data follows a normal distribution
  • Alternative Hypothesis (H1): The data does not follow a normal distribution

Decision Rules

Choose your significance level (alpha) based on your requirements:

  • p_value < alpha: Reject H0 - Evidence against normality
  • p_value >= alpha: Fail to reject H0 - Insufficient evidence against normality
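
Applied to a result DataFrame from the examples above, the decision rule is a one-line filter; a minimal sketch assuming alpha = 0.05:

alpha = 0.05
rejected = result.filter(pl.col("p_value") < alpha)   # evidence against normality
retained = result.filter(pl.col("p_value") >= alpha)  # fail to reject H0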

Common Alpha Values

  • 0.05 (5%): Standard significance level for most applications
  • 0.01 (1%): More stringent threshold for critical applications
  • 0.10 (10%): More lenient threshold for exploratory analysis

Practical Guidelines

p-value Range       Interpretation                          Action
p ≥ 0.10            Little evidence against normality       Reasonable to assume normality
0.05 ≤ p < 0.10     Weak evidence against normality         Examine the normality assumption carefully
0.01 ≤ p < 0.05     Moderate evidence against normality     Likely non-normal; consider alternatives
p < 0.01            Strong evidence against normality       Treat as non-normal; use non-parametric methods
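
The bands above translate directly into a polars expression. This is an illustrative sketch over the result DataFrame from the earlier examples, not part of the library API:

interpreted = result.with_columns(
    pl.when(pl.col("p_value") >= 0.10).then(pl.lit("assume normality"))
    .when(pl.col("p_value") >= 0.05).then(pl.lit("weak evidence against normality"))
    .when(pl.col("p_value") >= 0.01).then(pl.lit("moderate evidence against normality"))
    .otherwise(pl.lit("strong evidence against normality"))
    .alias("interpretation")
)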

Limitations and Considerations

Sample Size Constraints

  • Minimum: 3 observations required for Shapiro-Wilk test
  • Maximum: 5000 observations (scipy implementation limit)
  • Optimal: 20-500 observations for best test power
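
Segment sizes can be checked before calling the metric. A sketch using the record-level example's columns (group_by(...).len() assumes a recent polars version, where the count column is named "len"):

sizes = df.group_by("model_version").len()  # one row per segment with its count
valid_segments = sizes.filter(pl.col("len").is_between(3, 5000))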

Sensitivity Factors

  • Outliers: Shapiro-Wilk is sensitive to extreme values
  • Ties: Repeated values can affect test performance
  • Sample Size: Very large samples may reject normality for trivial deviations

When to Use Normality Tests

  1. Pre-analysis Validation: Before applying parametric statistical tests
  2. Model Residual Analysis: Testing assumptions for regression models
  3. Quality Control: Monitoring data distribution consistency
  4. Method Selection: Choosing between parametric and non-parametric approaches

Alternative Approaches

If normality is rejected, consider:

  • Visual Methods: Q-Q plots, histograms, density plots
  • Robust Statistics: Methods that don't assume normality
  • Data Transformation: Log, square root, or Box-Cox transformations (see the sketch after this list)
  • Non-parametric Tests: Wilcoxon, Mann-Whitney, Kruskal-Wallis
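
As one example of the transformation route, a strictly positive column can be log-transformed and re-tested. Here "exposure" is a hypothetical column; the sketch reuses the record-level API shown earlier, with the optional segment argument omitted:

df_pos = pl.DataFrame({"exposure": [1.2, 0.8, 2.5, 1.1, 0.9, 3.2, 1.7, 2.1]})
df_log = df_pos.with_columns(pl.col("exposure").log().alias("log_exposure"))

retest = shapiro_wilk(
    data=df_log,
    data_format="record",
    data_column="log_exposure",
)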

Best Practices

Data Preparation

  1. Remove Outliers: Consider outlier treatment before testing
  2. Sufficient Sample Size: Ensure adequate observations for reliable results
  3. Segmentation Strategy: Test normality within homogeneous groups
  4. Missing Data: Handle missing values appropriately
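
A sketch combining steps 1 and 4: drop missing values, then trim outliers with the common 1.5 × IQR rule (column names follow the record-level example; the IQR rule is a conventional choice, not a library default):

clean = df.drop_nulls("residuals")
q1 = clean["residuals"].quantile(0.25)
q3 = clean["residuals"].quantile(0.75)
iqr = q3 - q1
trimmed = clean.filter(
    pl.col("residuals").is_between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
)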

Result Interpretation

  1. Multiple Comparisons: Adjust significance levels when testing multiple groups (see the sketch after this list)
  2. Practical Significance: Consider effect size, not just statistical significance
  3. Domain Context: Apply domain knowledge to interpretation
  4. Complementary Analysis: Use with visual diagnostic tools
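
For step 1, a Bonferroni correction is the simplest adjustment: divide alpha by the number of segments tested. A sketch over the result DataFrame from the segmented examples above:

alpha = 0.05
adjusted_alpha = alpha / result.height  # result.height = number of segments tested
flagged = result.filter(pl.col("p_value") < adjusted_alpha)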

Common Applications in Credit Risk

# Example: Testing normality of model residuals by product and time period
import polars as pl
from tnp_statistic_library.metrics.normality import shapiro_wilk

# Illustrative residuals; in practice these come from your fitted model
residuals_df = pl.DataFrame({
    "residuals": [0.12, -0.08, 0.05, -0.15, 0.22, -0.03,
                  0.09, -0.11, 0.04, 0.18, -0.07, 0.02],
    "time_period": ["Q1"] * 6 + ["Q2"] * 6,
    "product_type": ["card", "card", "card", "loan", "loan", "loan"] * 2,
})

# Test residual normality by product and time
# (each segment must contain at least 3 observations)
results = shapiro_wilk(
    data=residuals_df,
    data_format="record",
    data_column="residuals",
    segment=["product_type", "time_period"],
)

# Flag segments that fail the normality assumption at alpha = 0.05
non_normal_segments = results.filter(pl.col("p_value") < 0.05)

This documentation provides comprehensive guidance for using normality testing metrics to validate statistical assumptions and ensure appropriate method selection in your analysis recipes.