Contributing¶
We'd love you to contribute to the TNP Statistic Library!
Issues¶
Questions, feature requests and bug reports are all welcome as discussions or issues.
To make it as simple as possible for us to help you, please include the output of the following call in your issue:
python -c "import tnp_statistic_library.version; print(tnp_statistic_library.version.version_info())"
Please try to always include the above unless you're unable to install the TNP Statistic Library or know it's not relevant to your question or feature request.
Adding New Metrics¶
This guide walks you through the complete process of adding a new statistical metric to the TNP Statistic Library. The library follows a consistent pattern that makes it straightforward to add new metrics while maintaining code quality and consistency.
Overview of the Metric Architecture¶
The library uses a layered architecture:
- Internal Implementation (tnp_statistic_library/_core/metrics/): Core metric classes and configuration
- Public API (tnp_statistic_library/metrics/): User-friendly helper functions
- Tests (tests/core/metrics/): Comprehensive test coverage
- Documentation (docs/): API documentation and examples
Step 1: Design Your Metric¶
Before coding, define:
- Metric name: Use descriptive, lowercase names with underscores (e.g., my_custom_metric)
- Data formats supported: Decide if your metric supports (see the sketch after this list):
  - record: individual observation data
  - summary: pre-aggregated summary data
  - both formats
- Required inputs: What columns/parameters does your metric need?
- Outputs: What statistics will your metric return?
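To make the two data formats concrete, here is a minimal sketch of what each might look like (the column names are illustrative only and not part of the library):
import polars as pl

# Record-level data: one row per individual observation
record_df = pl.DataFrame({
    "default_flag": [0, 1, 0, 0],
    "segment": ["A", "A", "B", "B"],
})

# Summary-level data: one row per group of pre-aggregated statistics
summary_df = pl.DataFrame({
    "segment": ["A", "B"],
    "n_obs": [120, 80],
    "n_defaults": [6, 2],
})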
Step 2: Create Helper Functions (Recommended)¶
Create reusable helper functions for your calculations. These should:
- Be pure functions that operate on Polars expressions
- Handle edge cases (zero division, null values, etc.)
- Be placed at the top of your metric module
- Follow the naming convention: calculate_{metric}_expressions for shared logic
Example helper function structure:
import polars as pl

def calculate_my_metric_expressions(input_expr1: pl.Expr, input_expr2: pl.Expr) -> dict[str, pl.Expr]:
    """Calculate shared expressions for my metric.

    Args:
        input_expr1: First input expression
        input_expr2: Second input expression

    Returns:
        Dictionary mapping output column names to Polars expressions
    """
    # Handle division by zero using Polars when() expression
    result_expr = pl.when(input_expr2 != 0).then(input_expr1 / input_expr2).otherwise(None)
    return {
        "input1": input_expr1,
        "input2": input_expr2,
        "result": result_expr,
    }
Step 3: Create Configuration Classes¶
Define one config per data format using the core registry and validation markers.
Each config inherits from MetricConfigBase and is registered with @register_config.
Marker Cheat Sheet¶
Column fields use marker metadata (instead of Column(...)) to describe constraints.
Markers can be combined inside Annotated[...].
Common markers:
- Numeric() - numeric dtype
- Probability() - numeric in [0, 1]
- Indicator() - binary 0/1 or boolean
- Positive() - values > 0 (combine with Numeric())
- NonNegative() - values >= 0 (combine with Numeric())
- Nullable() - allow nulls
- Segment() - optional list of group-by columns
- Ge(x), Le(x), Gt(x), Lt(x), In(values) - value constraints
- DtypeOf(check) - custom dtype predicate
Example:
score: Annotated[str, Numeric(), Positive()]
pd: Annotated[str, Probability()]
default: Annotated[str, Indicator()]
segment: Annotated[list[str] | None, Segment()] = None
import polars as pl
from typing import Annotated, Literal

from tnp_statistic_library._core.metrics._common import _formula_agg
from tnp_statistic_library._core.registry import (
    MetricConfigBase,
    Numeric,
    Positive,
    Segment,
    register_config,
)

@register_config
class RecordLevelMyMetricConfig(MetricConfigBase):
    """Configuration for record-level my metric calculation."""

    type: Literal["my_metric"] = "my_metric"
    data_format: Literal["record"] = "record"
    input_col1: Annotated[str, Numeric()]
    input_col2: Annotated[str, Numeric()]
    segment: Annotated[list[str] | None, Segment()] = None

    def expressions(self) -> dict[str, pl.Expr]:
        return calculate_my_metric_expressions(
            pl.col(self.input_col1),
            pl.col(self.input_col2),
        )

    def compute(self, lf: pl.LazyFrame, segment: list[str] | None) -> pl.LazyFrame:
        return _formula_agg(lf, segment, self.expressions())

@register_config
class SummaryLevelMyMetricConfig(MetricConfigBase):
    """Configuration for summary-level my metric calculation."""

    type: Literal["my_metric"] = "my_metric"
    data_format: Literal["summary"] = "summary"
    sum_col1: Annotated[str, Numeric(), Positive()]
    mean_col2: Annotated[str, Numeric()]
    segment: Annotated[list[str] | None, Segment()] = None

    def expressions(self) -> dict[str, pl.Expr]:
        return calculate_my_metric_expressions(
            pl.col(self.sum_col1).sum(),
            pl.col(self.mean_col2).mean(),
        )

    def compute(self, lf: pl.LazyFrame, segment: list[str] | None) -> pl.LazyFrame:
        return _formula_agg(lf, segment, self.expressions())
Step 4: Add the Core Function¶
Expose a public function in tnp_statistic_library/_core/metrics/<category>.py
that delegates to compute_metric:
from typing import Literal

import polars as pl

from tnp_statistic_library._core.metrics._common import compute_metric

def my_metric(
    data: pl.LazyFrame | pl.DataFrame,
    *,
    data_format: Literal["record", "summary"] = "record",
    input_col1: str | None = None,
    input_col2: str | None = None,
    sum_col1: str | None = None,
    mean_col2: str | None = None,
    segment: list[str] | None = None,
) -> pl.LazyFrame:
    return compute_metric(
        data,
        metric_type="my_metric",
        data_format=data_format,
        segment=segment,
        input_col1=input_col1,
        input_col2=input_col2,
        sum_col1=sum_col1,
        mean_col2=mean_col2,
    )
Step 5: Re-export in the Public API¶
In tnp_statistic_library/metrics/<category>.py, re-export the core function:
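A minimal sketch of what that might look like, using summary as the category purely for illustration; mirror the existing modules in tnp_statistic_library/metrics/ for the exact pattern, including whether the public wrapper collects the result (as the mean example at the end of this guide does):
# tnp_statistic_library/metrics/summary.py (sketch -- follow the existing modules)
import polars as pl
from typing import Literal

from tnp_statistic_library._core.metrics.summary import my_metric as _my_metric

def my_metric(
    data: pl.LazyFrame | pl.DataFrame,
    *,
    data_format: Literal["record", "summary"] = "record",
    input_col1: str | None = None,
    input_col2: str | None = None,
    sum_col1: str | None = None,
    mean_col2: str | None = None,
    segment: list[str] | None = None,
) -> pl.DataFrame:
    """Public wrapper that delegates to the core function and collects the result."""
    return _my_metric(
        data,
        data_format=data_format,
        input_col1=input_col1,
        input_col2=input_col2,
        sum_col1=sum_col1,
        mean_col2=mean_col2,
        segment=segment,
    ).collect()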
Step 6: Write Comprehensive Tests¶
Create tests under tests/core/metrics/ alongside the other core metric tests:
import polars as pl
import pytest
from polars.testing import assert_frame_equal

from tnp_statistic_library.errors import ValidationError
from tnp_statistic_library.metrics.summary import my_metric

class TestMyMetric:
    """Test class for MyMetric with comprehensive coverage."""

    def test_my_metric_record_without_segments(self):
        """Test with record-level data without segments."""
        result = my_metric(
            data=pl.DataFrame({
                "col1": [1, 2, 3, 4, 5],
                "col2": [2, 4, 6, 8, 10],
            }),
            data_format="record",
            input_col1="col1",
            input_col2="col2",
        )
        assert result.shape == (1, 4)  # group_key + your outputs
        assert result["result"][0] == 0.5  # Expected result

    def test_my_metric_with_segments(self):
        """Test with segmented data."""
        result = my_metric(
            data=pl.DataFrame({
                "col1": [1, 2, 3, 4],
                "col2": [2, 4, 6, 8],
                "segment": ["A", "A", "B", "B"],
            }),
            data_format="record",
            input_col1="col1",
            input_col2="col2",
            segment=["segment"],
        )
        assert result.shape == (2, 4)
        assert set(result["segment"].to_list()) == {"A", "B"}

    def test_my_metric_edge_cases(self):
        """Test edge cases like zero division, null values, etc."""
        # Test with zero values
        result = my_metric(
            data=pl.DataFrame({
                "col1": [0, 1, 2],
                "col2": [0, 0, 1],
            }),
            data_format="record",
            input_col1="col1",
            input_col2="col2",
        )
        # Should handle division by zero gracefully
        assert result["result"].null_count() > 0

    def test_my_metric_summary(self):
        """Test with summary-level data."""
        result = my_metric(
            data=pl.DataFrame({
                "sum_col1": [10, 20],
                "mean_col2": [2.0, 4.0],
            }),
            data_format="summary",
            sum_col1="sum_col1",
            mean_col2="mean_col2",
        )
        assert result.shape == (1, 4)

    def test_my_metric_validation_errors(self):
        """Test that appropriate validation errors are raised."""
        with pytest.raises(ValidationError):
            my_metric(
                data=pl.DataFrame({
                    "col1": ["a", "b", "c"],  # Non-numeric data
                    "col2": [1, 2, 3],
                }),
                data_format="record",
                input_col1="col1",
                input_col2="col2",
            )
Step 7: Add Documentation¶
Add documentation in the appropriate places:
- API Documentation: Add to the relevant file in docs/api/ (e.g., docs/api/accuracy.md)
- Recipe Examples: Add practical examples in docs/recipes/
- Update the index: Ensure your new metric is listed in the appropriate index
Step 8: Validate Your Implementation¶
Run these commands to ensure your metric is properly implemented:
# Run tests for your specific metric
uv run pytest tests/core/metrics/test_{your_module}.py::TestYourMetric -v
# Run all tests to ensure no regressions
uv run pytest
# Run type checking (if available)
uv run mypy tnp_statistic_library/_core/metrics/{your_module}.py
# Run linting
uv run ruff check tnp_statistic_library/_core/metrics/{your_module}.py
# Test documentation builds
uv run mkdocs serve
You can also use the just command runner to run these commands easily:
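The recipe names below are illustrative only (an assumption, not taken from the repository), so check the project's justfile for the actual targets:
# hypothetical recipes -- replace with the targets defined in the justfile
just test
just lint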
Common Patterns and Best Practices¶
1. Handling Edge Cases¶
Always handle these scenarios:
- Division by zero, using pl.when(denominator != 0).then(numerator / denominator).otherwise(None) (see the sketch after this list)
- Null/missing values
- Empty datasets
- Insufficient sample sizes
- Invalid input ranges
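A minimal standalone sketch of the division-by-zero pattern (the column names are illustrative):
import polars as pl

df = pl.DataFrame({"numerator": [1.0, 2.0, None], "denominator": [2.0, 0.0, 4.0]})
safe_ratio = (
    pl.when(pl.col("denominator") != 0)
    .then(pl.col("numerator") / pl.col("denominator"))
    .otherwise(None)
    .alias("ratio")
)
df.with_columns(safe_ratio)
# ratio -> [0.5, null, null]: zero denominators and null numerators both yield null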
2. Column Validation¶
Use the appropriate markers (see the cheat sheet in Step 3) for input columns:
- Numeric(): ensures the column contains numeric data
- Positive(): ensures positive numeric values (combine with Numeric())
- Ge(min) / Le(max): ensure values fall within a range (Probability() for [0, 1])
- Indicator(): ensures binary indicator values (0/1 or boolean)
3. Segmentation Support¶
All metrics support segmentation by default. The segment parameter allows users to group results by specified columns.
4. LazyFrame Support¶
All calculations use Polars LazyFrames for performance. Your compute method should return a LazyFrame that can be collected when needed.
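A minimal sketch (assuming the core my_metric function from Step 4 lives in a summary category module; the public wrappers typically collect for you, as in the mean example below):
import polars as pl
from tnp_statistic_library._core.metrics.summary import my_metric  # assumed module path

data = pl.DataFrame({"col1": [1, 2, 3], "col2": [2, 4, 6]})
lf = my_metric(data, data_format="record", input_col1="col1", input_col2="col2")  # LazyFrame
result = lf.collect()  # materialise only when results are needed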
5. Naming Conventions¶
- Metric names: Use lowercase with underscores (e.g., my_custom_metric)
- Configuration classes: Use descriptive names ending with Config
- Helper functions: Use descriptive names with the calculate_ prefix
- Test classes: Use the Test{MetricName} format
Example: Real Metric Implementation¶
Here's how the existing mean metric is implemented as a reference:
# From tnp_statistic_library/_core/metrics/summary/mean.py
import polars as pl
from typing import Annotated, Literal

from tnp_statistic_library._core.metrics._common import _formula_agg
from tnp_statistic_library._core.metrics.summary.mean import mean as _mean, mean_formula
from tnp_statistic_library._core.registry import (
    MetricConfigBase,
    Numeric,
    Segment,
    register_config,
)

@register_config
class MeanConfig(MetricConfigBase):
    """Configuration for computing the mean statistic."""

    type: Literal["mean"] = "mean"
    data_format: Literal["record"] = "record"
    variable: Annotated[str, Numeric()]
    segment: Annotated[list[str] | None, Segment()] = None

    def expressions(self) -> dict[str, pl.Expr]:
        return {
            "variable_name": pl.lit(self.variable),
            "mean_value": pl.col(self.variable).mean(),
        }

    def compute(self, lf: pl.LazyFrame, segment: list[str] | None) -> pl.LazyFrame:
        return _formula_agg(lf, segment, mean_formula(**self.expressions()))

# From tnp_statistic_library/metrics/summary.py
def mean(
    data: pl.LazyFrame | pl.DataFrame,
    *,
    variable: str,
    segment: list[str] | None = None,
) -> pl.DataFrame:
    """Calculate the mean of a variable."""
    return _mean(data, variable=variable, segment=segment).collect()
This guide should help you add new metrics while maintaining consistency with the existing codebase. If you have questions or need clarification on any step, please open an issue on GitHub (recommended) or post in the Teams channel.