Contributing

We'd love you to contribute to the TNP Statistic Library!

Issues

Questions, feature requests and bug reports are all welcome as discussions or issues.

To make it as simple as possible for us to help you, please include the output of the following call in your issue:

python -c "import tnp_statistic_library.version; print(tnp_statistic_library.version.version_info())"

Please always include the above unless you're unable to install the TNP Statistic Library or you know it's not relevant to your question or feature request.

Adding New Metrics

This guide walks you through the complete process of adding a new statistical metric to the TNP Statistic Library. The library follows a consistent pattern that makes it straightforward to add new metrics while maintaining code quality.

Overview of the Metric Architecture

The library uses a layered architecture:

  1. Internal Implementation (tnp_statistic_library/_internal/metrics/): Core metric classes and configuration
  2. Public API (tnp_statistic_library/metrics/): User-friendly helper functions
  3. Tests (tests/core/metrics/): Comprehensive test coverage
  4. Documentation (docs/): API documentation and examples

Step 1: Design Your Metric

Before coding, define:

  • Metric name: Use descriptive, lowercase names with underscores (e.g., my_custom_metric)
  • Data formats supported: Decide if your metric supports:
      • record: Individual observation data
      • summary: Pre-aggregated summary data
      • Both formats
  • Required inputs: What columns/parameters does your metric need?
  • Outputs: What statistics will your metric return?

Step 2: Implement Helper Functions

Create reusable helper functions for your calculations. These should:

  • Be pure functions that operate on Polars expressions
  • Handle edge cases (zero division, null values, etc.)
  • Be placed at the top of your metric module
  • Follow the naming convention: calculate_{metric}_expressions for shared logic

Example helper function structure:

import polars as pl

def calculate_my_metric_expressions(input_expr1: pl.Expr, input_expr2: pl.Expr) -> dict[str, pl.Expr]:
    """Calculate shared expressions for my metric.

    Args:
        input_expr1: First input expression
        input_expr2: Second input expression

    Returns:
        Dictionary mapping output column names to Polars expressions
    """
    # Handle division by zero using Polars when() expression
    result_expr = pl.when(input_expr2 != 0).then(input_expr1 / input_expr2).otherwise(None)

    return {
        "input1": input_expr1,
        "input2": input_expr2,
        "result": result_expr,
    }
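
A minimal sketch of exercising these expressions directly (the num and den column names are illustrative, not part of the library):

import polars as pl

# Aggregate the inputs, then apply the shared expressions as named columns.
exprs = calculate_my_metric_expressions(pl.col("num").sum(), pl.col("den").sum())
df = pl.DataFrame({"num": [1, 2], "den": [0, 4]})
print(df.select(**exprs))  # result = 3 / 4 = 0.75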

Step 3: Create Configuration Classes

Define one config per data format using the core registry and validation markers. Each config inherits from MetricConfigBase and is registered with @register_config.

Marker Cheat Sheet

Column fields use marker metadata (instead of Column(...)) to describe constraints. Markers can be combined inside Annotated[...].

Common markers:

  • Numeric() - numeric dtype
  • Probability() - numeric in [0, 1]
  • Indicator() - binary 0/1 or boolean
  • Positive() - values > 0 (combine with Numeric())
  • NonNegative() - values >= 0 (combine with Numeric())
  • Nullable() - allow nulls
  • Segment() - optional list of group-by columns
  • Ge(x), Le(x), Gt(x), Lt(x), In(values) - value constraints
  • DtypeOf(check) - custom dtype predicate

Example:

score: Annotated[str, Numeric(), Positive()]
pd: Annotated[str, Probability()]
default: Annotated[str, Indicator()]
segment: Annotated[list[str] | None, Segment()] = None

A complete pair of configuration classes:

import polars as pl
from typing import Annotated, Literal

from tnp_statistic_library._core.metrics._common import _formula_agg
from tnp_statistic_library._core.registry import (
    MetricConfigBase,
    Numeric,
    Positive,
    Segment,
    register_config,
)


@register_config
class RecordLevelMyMetricConfig(MetricConfigBase):
    """Configuration for record-level my metric calculation."""

    type: Literal["my_metric"] = "my_metric"
    data_format: Literal["record"] = "record"
    input_col1: Annotated[str, Numeric()]
    input_col2: Annotated[str, Numeric()]
    segment: Annotated[list[str] | None, Segment()] = None

    def expressions(self) -> dict[str, pl.Expr]:
        return calculate_my_metric_expressions(
            pl.col(self.input_col1),
            pl.col(self.input_col2),
        )

    def compute(self, lf: pl.LazyFrame, segment: list[str] | None) -> pl.LazyFrame:
        return _formula_agg(lf, segment, self.expressions())


@register_config
class SummaryLevelMyMetricConfig(MetricConfigBase):
    """Configuration for summary-level my metric calculation."""

    type: Literal["my_metric"] = "my_metric"
    data_format: Literal["summary"] = "summary"
    sum_col1: Annotated[str, Numeric(), Positive()]
    mean_col2: Annotated[str, Numeric()]
    segment: Annotated[list[str] | None, Segment()] = None

    def expressions(self) -> dict[str, pl.Expr]:
        return calculate_my_metric_expressions(
            pl.col(self.sum_col1).sum(),
            pl.col(self.mean_col2).mean(),
        )

    def compute(self, lf: pl.LazyFrame, segment: list[str] | None) -> pl.LazyFrame:
        return _formula_agg(lf, segment, self.expressions())
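
Assuming MetricConfigBase validates its fields on construction (pydantic-style), a config can also be exercised directly, although compute_metric normally builds and dispatches it for you:

# Hypothetical direct use; column names "a" and "b" are illustrative.
cfg = RecordLevelMyMetricConfig(input_col1="a", input_col2="b")
lf = pl.DataFrame({"a": [1.0, 3.0], "b": [2.0, 6.0]}).lazy()
print(cfg.compute(lf, segment=None).collect())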

Step 4: Add the Core Function

Expose a public function in tnp_statistic_library/_core/metrics/<category>.py that delegates to compute_metric:

from typing import Literal
import polars as pl

from tnp_statistic_library._core.metrics._common import compute_metric


def my_metric(
    data: pl.LazyFrame | pl.DataFrame,
    *,
    data_format: Literal["record", "summary"] = "record",
    input_col1: str | None = None,
    input_col2: str | None = None,
    sum_col1: str | None = None,
    mean_col2: str | None = None,
    segment: list[str] | None = None,
) -> pl.LazyFrame:
    return compute_metric(
        data,
        metric_type="my_metric",
        data_format=data_format,
        segment=segment,
        input_col1=input_col1,
        input_col2=input_col2,
        sum_col1=sum_col1,
        mean_col2=mean_col2,
    )
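
The core function returns a LazyFrame, so callers collect when they need results. An illustrative record-format call (column names are made up):

import polars as pl

result = my_metric(
    pl.DataFrame({"a": [1.0, 2.0], "b": [2.0, 4.0]}),
    data_format="record",
    input_col1="a",
    input_col2="b",
).collect()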

Step 5: Re-export in the Public API

In tnp_statistic_library/metrics/<category>.py, re-export the core function:

from tnp_statistic_library._core.metrics.<category> import my_metric

__all__ = ["my_metric"]

Step 6: Write Comprehensive Tests

Create tests under tests/core/metrics/ alongside the other core metric tests:

import polars as pl
import pytest
from polars.testing import assert_frame_equal
from tnp_statistic_library.errors import ValidationError
from tnp_statistic_library.metrics.summary import my_metric

class TestMyMetric:
    """Test class for MyMetric with comprehensive coverage."""

    def test_my_metric_record_without_segments(self):
        """Test with record-level data without segments."""
        result = my_metric(
            data=pl.DataFrame({
                "col1": [1, 2, 3, 4, 5],
                "col2": [2, 4, 6, 8, 10],
            }),
            data_format="record",
            input_col1="col1",
            input_col2="col2",
        ).collect()

        assert result.shape == (1, 3)  # input1, input2, result columns
        assert result["result"][0] == 0.5  # Expected result

    def test_my_metric_with_segments(self):
        """Test with segmented data."""
        result = my_metric(
            data=pl.DataFrame({
                "col1": [1, 2, 3, 4],
                "col2": [2, 4, 6, 8],
                "segment": ["A", "A", "B", "B"],
            }),
            data_format="record",
            input_col1="col1",
            input_col2="col2",
            segment=["segment"],
        ).collect()

        assert result.shape == (2, 4)
        assert set(result["segment"].to_list()) == {"A", "B"}

    def test_my_metric_edge_cases(self):
        """Test edge cases like zero division, null values, etc."""
        # Test with zero values
        result = my_metric(
            data=pl.DataFrame({
                "col1": [0, 1, 2],
                "col2": [0, 0, 1],
            }),
            data_format="record",
            input_col1="col1",
            input_col2="col2",
        ).collect()

        # Should handle division by zero gracefully
        assert result["result"].null_count() > 0

    def test_my_metric_summary(self):
        """Test with summary-level data."""
        result = my_metric(
            data=pl.DataFrame({
                "sum_col1": [10, 20],
                "mean_col2": [2.0, 4.0],
            }),
            data_format="summary",
            sum_col1="sum_col1",
            mean_col2="mean_col2",
        ).collect()

        assert result.shape == (1, 3)  # input1, input2, result columns

    def test_my_metric_validation_errors(self):
        """Test that appropriate validation errors are raised."""
        with pytest.raises(ValidationError):
            my_metric(
                data=pl.DataFrame({
                    "col1": ["a", "b", "c"],  # Non-numeric data
                    "col2": [1, 2, 3],
                }),
                data_format="record",
                input_col1="col1",
                input_col2="col2",
            ).collect()
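
Shared cases across both data formats can optionally be parametrized at module level; a sketch that reuses the imports at the top of this test file (argument names follow the hypothetical configs above):

@pytest.mark.parametrize(
    ("data_format", "kwargs"),
    [
        ("record", {"input_col1": "col1", "input_col2": "col2"}),
        ("summary", {"sum_col1": "col1", "mean_col2": "col2"}),
    ],
)
def test_my_metric_both_formats(data_format, kwargs):
    result = my_metric(
        data=pl.DataFrame({"col1": [1, 2], "col2": [2.0, 4.0]}),
        data_format=data_format,
        **kwargs,
    ).collect()
    assert "result" in result.columns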

Step 7: Add Documentation

Create documentation in the appropriate API file:

  1. API Documentation: Add to the relevant file in docs/api/ (e.g., docs/api/accuracy.md)
  2. Recipe Examples: Add practical examples in docs/recipes/
  3. Update the index: Ensure your new metric is listed in the appropriate index

Step 8: Validate Your Implementation

Run these commands to ensure your metric is properly implemented:

# Run tests for your specific metric
uv run pytest tests/core/metrics/test_{your_module}.py::TestYourMetric -v

# Run all tests to ensure no regressions
uv run pytest

# Run type checking (if available)
uv run mypy tnp_statistic_library/_core/metrics/{your_module}.py

# Run linting
uv run ruff check tnp_statistic_library/_core/metrics/{your_module}.py

# Test documentation builds
uv run mkdocs serve

You can also use the just command runner to run these commands easily:

just all

Common Patterns and Best Practices

1. Handling Edge Cases

Always handle these scenarios:

  • Division by zero using pl.when(denominator != 0).then(numerator / denominator).otherwise(None)
  • Null/missing values
  • Empty datasets
  • Insufficient sample sizes
  • Invalid input ranges
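
For instance, a guard against insufficient sample sizes can use the same pl.when() pattern (a sketch; x is an illustrative column, and Expr.count() counts non-null values):

import polars as pl

# Null out the mean when fewer than two non-null observations are present.
guarded_mean = pl.when(pl.col("x").count() >= 2).then(pl.col("x").mean()).otherwise(None)
print(pl.DataFrame({"x": [1.0, None]}).select(mean_x=guarded_mean))  # mean_x: null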

2. Column Validation

Use the marker metadata from the cheat sheet in Step 3 to validate input columns:

  • Numeric(): Ensures the column contains numeric data
  • Positive(): Ensures positive numeric values (combine with Numeric())
  • Ge(min), Le(max): Ensure values fall within a range
  • Indicator(): Ensures binary indicator values (0/1 or boolean)

3. Segmentation Support

All metrics support segmentation by default. The segment parameter allows users to group results by specified columns.

4. LazyFrame Support

All calculations use Polars LazyFrames for performance. Your compute method should return a LazyFrame that can be collected when needed.
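
Because results stay lazy until collected, callers can cheaply chain further operations; for example, using the hypothetical my_metric from the earlier steps:

import polars as pl

lf = pl.DataFrame({"a": [1.0, 2.0], "b": [0.0, 4.0]}).lazy()
out = (
    my_metric(lf, data_format="record", input_col1="a", input_col2="b")
    .filter(pl.col("result").is_not_null())  # still lazy; nothing computed yet
    .collect()
)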

5. Naming Conventions

  • Metric names: Use lowercase with underscores (e.g., my_custom_metric)
  • Configuration classes: Use descriptive names ending with Config
  • Helper functions: Use descriptive names with calculate_ prefix
  • Test classes: Use Test{MetricName} format

Example: Real Metric Implementation

Here's how the existing mean metric is implemented as a reference:

# From tnp_statistic_library/_core/metrics/summary/mean.py

import polars as pl
from typing import Annotated, Literal

from tnp_statistic_library._core.metrics._common import _formula_agg

# mean_formula (used in compute below) is this module's own helper,
# defined earlier in the same file following the Step 2 pattern.
from tnp_statistic_library._core.registry import (
    MetricConfigBase,
    Numeric,
    Segment,
    register_config,
)


@register_config
class MeanConfig(MetricConfigBase):
    """Configuration for computing the mean statistic."""

    type: Literal["mean"] = "mean"
    data_format: Literal["record"] = "record"
    variable: Annotated[str, Numeric()]
    segment: Annotated[list[str] | None, Segment()] = None

    def expressions(self) -> dict[str, pl.Expr]:
        return {
            "variable_name": pl.lit(self.variable),
            "mean_value": pl.col(self.variable).mean(),
        }

    def compute(self, lf: pl.LazyFrame, segment: list[str] | None) -> pl.LazyFrame:
        return _formula_agg(lf, segment, mean_formula(**self.expressions()))

# From tnp_statistic_library/metrics/summary.py

import polars as pl

from tnp_statistic_library._core.metrics.summary.mean import mean as _mean

def mean(
    data: pl.LazyFrame | pl.DataFrame,
    *,
    variable: str,
    segment: list[str] | None = None,
) -> pl.DataFrame:
    """Calculate the mean of a variable."""
    return _mean(data, variable=variable, segment=segment).collect()

This guide should help you add new metrics while staying consistent with the existing codebase. If you have questions or need clarification on any step, please open an issue on GitHub (recommended) or post in the Teams channel.