Testing¶
This guide covers testing practices and procedures for the mmm-eval project.
Testing Philosophy¶
We follow these testing principles:
- Comprehensive coverage: Aim for high test coverage across all modules
- Fast feedback: Tests should run quickly to enable rapid development
- Reliable: Tests should be deterministic and not flaky
- Maintainable: Tests should be easy to understand and modify
- Realistic: Tests should reflect real-world usage patterns
Test Structure¶
The test suite is organized as follows:
tests/
├── test_adapters/ # Framework adapter tests
├── test_configs/ # Configuration object tests
├── test_core/ # Core functionality tests
├── test_data/ # Data handling tests
└── test_validation_tests/ # Metrics calculation tests
Running Tests¶
Basic Test Execution¶
# Run all tests
poetry run pytest
# Run tests with verbose output
poetry run pytest -v
# Run tests with coverage
poetry run pytest --cov=mmm_eval
# Run tests in parallel
poetry run pytest -n auto
Running Specific Test Categories¶
# Run framework adapter tests
poetry run pytest tests/test_adapters/
# Run core functionality tests
poetry run pytest tests/test_core/
# Run metrics and validation tests
poetry run pytest tests/test_validation_tests/
# Run tests matching a pattern
poetry run pytest -k "test_accuracy"
# Run tests matching a pattern
poetry run pytest -k "test_accuracy"
Running Tests with Markers¶
# Run integration tests only
poetry run pytest -m integration
# Run slow tests only
poetry run pytest -m slow
# Skip slow tests
poetry run pytest -m "not slow"
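These commands only select tests that are actually tagged. A minimal sketch of tagging a test with the integration and slow markers used above (the test name and body are illustrative; registering the markers under [tool.pytest.ini_options] in pyproject.toml avoids unknown-marker warnings):
import pytest

@pytest.mark.integration
@pytest.mark.slow
def test_full_evaluation_end_to_end():
    """Illustrative: selected by -m integration, excluded by -m "not slow"."""
    ...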
Test Types¶
Unit Tests¶
Unit tests verify individual functions and classes in isolation. They should:
- Test one specific behavior or functionality
- Use mocks for external dependencies
- Be fast and deterministic
- Have clear, descriptive names
Example unit test:
def test_calculate_mape_returns_correct_value():
    """Test that MAPE calculation returns expected results."""
    actual = [100, 200, 300]
    predicted = [110, 220, 330]

    result = calculate_mape(actual, predicted)

    expected = 10.0  # every prediction is off by 10%, so the average error is 10%
    assert result == pytest.approx(expected, rel=1e-2)
Integration Tests¶
Integration tests verify that multiple components work together correctly. They:
- Test the interaction between different modules
- Use real data and minimal mocking
- May take longer to run
- Are marked with the @pytest.mark.integration decorator
Example integration test:
@pytest.mark.integration
def test_pymc_marketing_evaluation_workflow():
    """Test complete PyMC Marketing evaluation workflow."""
    # Setup test data
    data = load_test_data()

    # Run evaluation
    result = evaluate_framework(
        data=data,
        framework="pymc-marketing",
        config=test_config,
    )

    # Verify results
    assert result.accuracy > 0.8
    assert result.cross_validation_score > 0.7
    assert result.refresh_stability > 0.6
Test Data and Fixtures¶
Using Fixtures¶
Pytest fixtures provide reusable test data and setup:
import numpy as np
import pandas as pd
import pytest

@pytest.fixture
def sample_mmm_data():
    """Provide sample MMM data for testing."""
    rng = np.random.default_rng(42)  # fixed seed keeps the fixture deterministic
    return pd.DataFrame({
        'date': pd.date_range('2023-01-01', periods=100),
        'sales': rng.normal(1000, 100, 100),
        'tv_spend': rng.uniform(0, 1000, 100),
        'radio_spend': rng.uniform(0, 500, 100),
        'digital_spend': rng.uniform(0, 800, 100),
    })

def test_data_validation(sample_mmm_data):
    """Test data validation with sample data."""
    validator = DataValidator()
    result = validator.validate(sample_mmm_data)
    assert result.is_valid
Test Data Management¶
- Use realistic but synthetic data
- Keep test data files small and focused (see the loading sketch below)
- Document the structure and purpose of test data
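When inline generation is awkward, small synthetic files can be loaded through a fixture. A minimal sketch, assuming a hypothetical CSV checked in next to the tests (the fixtures/ directory and file name are illustrative, not part of the structure shown above):
from pathlib import Path

import pandas as pd
import pytest

# Illustrative location for small, synthetic, documented test files
FIXTURE_DIR = Path(__file__).parent / "fixtures"

@pytest.fixture
def small_mmm_csv():
    """Load a small synthetic MMM dataset from disk."""
    return pd.read_csv(FIXTURE_DIR / "small_mmm.csv", parse_dates=["date"])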
Mocking and Stubbing¶
When to Mock¶
Mock external dependencies to:
- Speed up tests
- Avoid network calls
- Control test conditions
- Test error scenarios
Mocking Examples¶
from unittest.mock import patch

def test_api_call_with_mock():
    """Test API call with mocked response."""
    with patch('requests.get') as mock_get:
        mock_get.return_value.json.return_value = {'status': 'success'}
        mock_get.return_value.status_code = 200

        result = fetch_data_from_api()

        assert result['status'] == 'success'
        mock_get.assert_called_once()
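Mocks also make the error scenarios listed above easy to exercise. A sketch, assuming fetch_data_from_api (from the example above) lets a request timeout propagate to the caller:
from unittest.mock import patch

import pytest
import requests

def test_api_call_raises_on_timeout():
    """Error scenario: the underlying HTTP request times out."""
    with patch('requests.get') as mock_get:
        # Simulate a network failure instead of hitting a real endpoint
        mock_get.side_effect = requests.exceptions.Timeout

        with pytest.raises(requests.exceptions.Timeout):
            fetch_data_from_api()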
Test Coverage¶
Coverage Goals¶
- Minimum coverage: 80% for all modules
- Target coverage: 90% for critical modules
- Critical modules: Core evaluation logic, data validation, metrics calculation
Coverage Reports¶
# Generate HTML coverage report
poetry run pytest --cov=mmm_eval --cov-report=html
# Generate XML coverage report (for CI)
poetry run pytest --cov=mmm_eval --cov-report=xml
# View coverage summary
poetry run pytest --cov=mmm_eval --cov-report=term-missing
Coverage Configuration¶
Configure coverage in pyproject.toml:
[tool.coverage.run]
source = ["mmm_eval"]
omit = [
    "*/tests/*",
    "*/test_*",
    "*/__pycache__/*"
]

[tool.coverage.report]
exclude_lines = [
    "pragma: no cover",
    "def __repr__",
    "if self.debug:",
    "if settings.DEBUG",
    "raise AssertionError",
    "raise NotImplementedError",
    "if 0:",
    "if __name__ == .__main__.:",
    "class .*\\bProtocol\\):",
    "@(abc\\.)?abstractmethod"
]
Performance Testing¶
Benchmark Tests¶
For performance-critical code, use benchmark tests (the benchmark fixture below is provided by the pytest-benchmark plugin):
import numpy as np

def test_mape_calculation_performance(benchmark):
    """Benchmark MAPE calculation performance."""
    actual = np.random.normal(1000, 100, 10000)
    predicted = np.random.normal(1000, 100, 10000)

    result = benchmark(lambda: calculate_mape(actual, predicted))
    assert result > 0
Memory Usage Tests¶
Monitor memory usage in tests:
import os

import psutil

def test_memory_usage():
    """Test that operations don't use excessive memory."""
    process = psutil.Process(os.getpid())
    initial_memory = process.memory_info().rss

    # Run memory-intensive operation
    result = process_large_dataset()

    final_memory = process.memory_info().rss
    memory_increase = final_memory - initial_memory

    # Memory increase should be reasonable (< 100 MB)
    assert memory_increase < 100 * 1024 * 1024
Continuous Integration¶
GitHub Actions¶
Tests run automatically on:
- Every pull request
- Every push to main branch
- Scheduled runs (nightly)
CI Configuration¶
The CI pipeline includes:
- Linting: Code style and quality checks
- Type checking: Static type analysis
- Unit tests: Fast feedback on basic functionality
- Integration tests: Verify component interactions
- Coverage reporting: Track test coverage trends
Pre-commit Hooks¶
Install pre-commit hooks to catch issues early:
# Install pre-commit
poetry add --group dev pre-commit
# Install hooks
pre-commit install
# Run all hooks
pre-commit run --all-files
Debugging Tests¶
Verbose Output¶
# Run with maximum verbosity
poetry run pytest -vvv
# Show local variables on failures
poetry run pytest -l
# Stop on first failure
poetry run pytest -x
Debugging with pdb¶
def test_debug_example():
    """Example of using pdb for debugging."""
    import pdb; pdb.set_trace()  # Breakpoint

    result = complex_calculation()
    assert result > 0
Test Isolation¶
Ensure tests don't interfere with each other:
@pytest.fixture(autouse=True)
def reset_global_state():
    """Reset global state before each test."""
    # Setup
    yield
    # Teardown
    cleanup_global_state()
Best Practices¶
Test Naming¶
- Use descriptive test names that explain the expected behavior
- Follow the pattern: test_[function]_[scenario]_[expected_result] (examples after this list)
- Include edge cases and error conditions
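For example (illustrative names only; the behaviours they describe are assumptions, not documented behaviour of mmm-eval):
def test_calculate_mape_with_zero_actuals_raises_value_error():
    ...

def test_data_validator_with_missing_date_column_flags_data_as_invalid():
    ...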
Test Organization¶
- Group related tests in classes
- Use fixtures for common setup
- Keep tests focused and single-purpose
Assertions¶
- Use specific assertions (
assert result == expected
) - Avoid complex logic in assertions
- Use appropriate assertion methods (
assertIn
,assertRaises
, etc.)
Test Data¶
- Use realistic test data
- Avoid hardcoded magic numbers
- Document test data assumptions
Documentation¶
- Write clear docstrings for test functions
- Explain complex test scenarios
- Document test data sources and assumptions
Common Pitfalls¶
Flaky Tests¶
Avoid flaky tests by:
- Not relying on timing or external services
- Using deterministic random seeds (see the sketch after this list)
- Properly mocking external dependencies
- Avoiding shared state between tests
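For the seeding point, one option is a fixture that hands each test its own seeded generator rather than relying on global NumPy state. A minimal sketch (the rng fixture name is illustrative):
import numpy as np
import pytest

@pytest.fixture
def rng():
    """Seeded generator so random test data is identical on every run."""
    return np.random.default_rng(seed=12345)

def test_mape_is_stable_for_random_inputs(rng):
    actual = rng.normal(1000, 100, 50)
    predicted = actual * 1.1  # every prediction is exactly 10% high
    assert calculate_mape(actual, predicted) == pytest.approx(10.0)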
Slow Tests¶
Keep tests fast by:
- Using appropriate mocks
- Minimizing I/O operations
- Using efficient test data
- Running tests in parallel when possible
Over-Mocking¶
Don't over-mock:
- Test the actual behavior, not the implementation
- Mock only external dependencies
- Use real objects when possible
Getting Help¶
If you encounter testing issues:
- Check the pytest documentation
- Review existing tests for examples
- Ask questions in project discussions
- Consult the Contributing Guide