Data Reference¶
 mmm_eval.data ¶
 Data loading and processing utilities.
Classes¶
 DataLoader(data_path: str | Path) ¶
 Simple data loader for MMM evaluation.
Takes a data path and loads the data.
Initialize data loader with data path.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| data_path | str | Path | Path to the data file (CSV, Parquet, etc.) | required | 
Raises:
| Type | Description | 
|---|---|
| FileNotFoundError | If the data file does not exist. | 
Source code in mmm_eval/data/loaders.py
 Functions¶
 load() -> pd.DataFrame ¶
 Load data from the specified path.
Returns:
| Type | Description | 
|---|---|
| DataFrame | Loaded DataFrame | 
Raises:
| Type | Description | 
|---|---|
| ValueError | If the file format is not supported. | 
Source code in mmm_eval/data/loaders.py
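The contract described above (a missing file fails at construction, an unsupported format fails at load time) can be sketched with the standard library alone. This is a minimal illustration, not the code in mmm_eval/data/loaders.py: `SketchLoader` and `demo_loader` are hypothetical names, only CSV is handled, and rows come back as a list of dictionaries rather than a DataFrame.

```python
import csv
import tempfile
from pathlib import Path


class SketchLoader:
    """Hypothetical stand-in that mirrors the documented DataLoader contract."""

    def __init__(self, data_path):
        self.data_path = Path(data_path)
        # Documented behavior: a missing file raises FileNotFoundError.
        if not self.data_path.exists():
            raise FileNotFoundError(f"Data file not found: {self.data_path}")

    def load(self):
        # Documented behavior: an unsupported format raises ValueError.
        # Assumption: this sketch supports CSV only.
        if self.data_path.suffix.lower() != ".csv":
            raise ValueError(f"Unsupported file format: {self.data_path.suffix}")
        with self.data_path.open(newline="") as handle:
            return list(csv.DictReader(handle))


def demo_loader():
    """Load a small CSV, then try a file with an unsupported extension."""
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = Path(tmp) / "data.csv"
        csv_path.write_text("date,revenue\n2024-01-01,10\n2024-01-08,12\n")
        rows = SketchLoader(csv_path).load()

        bad_path = Path(tmp) / "data.xyz"
        bad_path.write_text("not tabular")
        try:
            SketchLoader(bad_path).load()
            unsupported = None
        except ValueError as err:
            unsupported = str(err)
    return rows, unsupported
```

Splitting the two failure modes this way lets a caller distinguish a wrong path from a wrong file type before any processing starts.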
  DataPipeline(data: pd.DataFrame, framework: str, control_columns: list[str] | None, channel_columns: list[str], date_column: str, response_column: str, revenue_column: str, min_number_observations: int = DataPipelineConstants.MIN_NUMBER_OBSERVATIONS) ¶
 Data pipeline that orchestrates processing and validation.
Provides a simple interface to go from a raw DataFrame to a validated DataFrame.
Initialize data pipeline.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| data | DataFrame | DataFrame containing the data | required | 
| framework | str | Name of a supported framework | required | 
| control_columns | list[str] | None | List of control columns | required | 
| channel_columns | list[str] | List of channel columns | required | 
| date_column | str | Name of the date column | required | 
| response_column | str | Name of the response column | required | 
| revenue_column | str | Name of the revenue column | required | 
| min_number_observations | int | Minimum required number of observations | MIN_NUMBER_OBSERVATIONS | 
Source code in mmm_eval/data/pipeline.py
 Functions¶
 run() -> pd.DataFrame ¶
 Run the complete data pipeline: process → validate.
Returns: Validated and processed DataFrame.
Raises: Exceptions raised by the processing or validation steps.
Source code in mmm_eval/data/pipeline.py
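The process → validate ordering can be illustrated with plain callables. This is a sketch under stated assumptions, not the implementation in mmm_eval/data/pipeline.py: `run_pipeline` and `demo_pipeline` are hypothetical helpers, and the validator is assumed to signal failure by raising, since `run_validations` is documented as returning `None`.

```python
def run_pipeline(data, process, validate):
    """Apply process, then validate, and return the processed data."""
    processed = process(data)
    # Assumption: validation reports problems by raising an exception,
    # so its return value is ignored here.
    validate(processed)
    return processed


def demo_pipeline():
    """Record the order in which the two steps are called."""
    calls = []

    def process(rows):
        calls.append("process")
        return [dict(row, processed=True) for row in rows]

    def validate(rows):
        calls.append("validate")
        if not rows:
            raise ValueError("no rows")

    result = run_pipeline([{"revenue": 10}], process, validate)
    return calls, result
```

Validating after processing means the checks see the parsed and renamed data, so an exception from either step reaches the caller of `run()` unchanged.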
  DataProcessor(control_columns: list[str] | None, channel_columns: list[str], date_column: str = InputDataframeConstants.DATE_COL, response_column: str = InputDataframeConstants.RESPONSE_COL, revenue_column: str = InputDataframeConstants.MEDIA_CHANNEL_REVENUE_COL) ¶
 Simple data processor for MMM evaluation.
Handles data transformations like datetime casting, column renaming, etc.
Initialize data processor.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| control_columns | list[str] | None | List of control columns | required | 
| channel_columns | list[str] | List of channel columns | required | 
| date_column | str | Name of the date column to parse and rename | DATE_COL | 
| response_column | str | Name of the response column to parse and rename | RESPONSE_COL | 
| revenue_column | str | Name of the revenue column to parse and rename | MEDIA_CHANNEL_REVENUE_COL | 
Source code in mmm_eval/data/processor.py
 Functions¶
 process(df: pd.DataFrame) -> pd.DataFrame ¶
 Process the DataFrame with configured transformations.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df | DataFrame | Input DataFrame | required | 
Returns:
| Type | Description | 
|---|---|
| DataFrame | Processed DataFrame | 
Raises:
| Type | Description | 
|---|---|
| MissingRequiredColumnsError | If the required columns are not present. | 
| InvalidDateFormatError | If the date column cannot be parsed. | 
Source code in mmm_eval/data/processor.py
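A rough sketch of the transformations and failure modes listed above, written against plain dictionaries instead of a DataFrame. It is not the code in mmm_eval/data/processor.py. The exception classes are local stand-ins that reuse the documented names, `process_rows` is a hypothetical helper, and the target names `"date"`, `"response"`, and `"revenue"` plus the `%Y-%m-%d` format are assumptions (the real values of `InputDataframeConstants` are not shown here).

```python
from datetime import datetime


class MissingRequiredColumnsError(Exception):
    """Local stand-in reusing the documented exception name."""


class InvalidDateFormatError(Exception):
    """Local stand-in reusing the documented exception name."""


def process_rows(rows, date_column, response_column, revenue_column,
                 date_format="%Y-%m-%d"):
    """Check required columns, parse dates, and rename to assumed canonical keys."""
    # Assumption: these target names are illustrative only.
    renames = {
        date_column: "date",
        response_column: "response",
        revenue_column: "revenue",
    }
    processed = []
    for row in rows:
        missing = [col for col in renames if col not in row]
        if missing:
            raise MissingRequiredColumnsError(f"Missing columns: {missing}")
        try:
            parsed = datetime.strptime(row[date_column], date_format)
        except ValueError as err:
            raise InvalidDateFormatError(str(err)) from err
        # Rename the configured columns and keep every other column as-is.
        new_row = {renames.get(key, key): value for key, value in row.items()}
        new_row["date"] = parsed
        processed.append(new_row)
    return processed
```

For example, `process_rows([{"week": "2024-01-01", "sales": 5, "rev": 50.0}], "week", "sales", "rev")` yields one row keyed by `date`, `response`, and `revenue`, with the date parsed into a `datetime`.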
  DataValidator(framework: str, date_column: str, response_column: str, revenue_column: str, control_columns: list[str] | None, min_number_observations: int = DataPipelineConstants.MIN_NUMBER_OBSERVATIONS) ¶
 Validator for MMM data with configurable validation rules.
Initialize validator with validation rules.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| framework | str | a supported framework, one of  | required | 
| date_column | str | Name of the date column | required | 
| response_column | str | Name of the response column | required | 
| revenue_column | str | Name of the revenue column | required | 
| control_columns | list[str] | None | List of control columns | required | 
| min_number_observations | int | Minimum required number of observations for time series CV | MIN_NUMBER_OBSERVATIONS | 
Source code in mmm_eval/data/validation.py
 Functions¶
 run_validations(df: pd.DataFrame) -> None ¶
 Run all validations on the DataFrame.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df | DataFrame | Input DataFrame | required | 
Returns:
| Type | Description | 
|---|---|
| None | Validation result with all errors and warnings | 
Source code in mmm_eval/data/validation.py
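The sketch below shows one way a validator with a `None` return type might behave: it collects problems and raises if any are found. The checks themselves (emptiness, a minimum row count, missing values) are assumptions chosen for illustration, since the rules in mmm_eval/data/validation.py are not listed here. `run_validations_sketch` is a hypothetical helper, the exception classes are local stand-ins reusing the documented names, and the threshold passed by the caller is arbitrary.

```python
class EmptyDataFrameError(Exception):
    """Local stand-in reusing the documented exception name."""


class DataValidationError(Exception):
    """Local stand-in reusing the documented exception name."""


def run_validations_sketch(rows, required_columns, min_number_observations):
    """Raise on invalid input and return None when every check passes."""
    if not rows:
        raise EmptyDataFrameError("No rows to validate")

    errors = []
    # Assumed check: enough observations for downstream cross-validation.
    if len(rows) < min_number_observations:
        errors.append(
            f"{len(rows)} observations, need at least {min_number_observations}"
        )
    # Assumed check: required columns contain no missing values.
    for column in required_columns:
        n_missing = sum(1 for row in rows if row.get(column) is None)
        if n_missing:
            errors.append(f"{n_missing} missing values in '{column}'")

    if errors:
        raise DataValidationError("; ".join(errors))
    return None
```

Collecting all messages before raising reports every problem in one pass instead of stopping at the first.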
 Functions¶
 generate_meridian_data() ¶
 Load and process a Meridian-compatible dataset for E2E testing.
Returns DataFrame containing Meridian-compatible data with media channels, controls, and response variables
Source code in mmm_eval/data/synth_data_generator.py
  generate_pymc_data() ¶
 Generate synthetic MMM data for testing purposes.
Returns DataFrame containing synthetic MMM data with media channels, controls, and response variables
Source code in mmm_eval/data/synth_data_generator.py
Modules¶
 constants ¶
   exceptions ¶
 Custom exceptions for data validation and processing.
Classes¶
 DataValidationError ¶
  Bases: Exception
Raised when data validation fails.
 EmptyDataFrameError ¶
  Bases: Exception
Raised when DataFrame is empty.
 InvalidDateFormatError ¶
  Bases: Exception
Raised when date parsing fails.
 MissingRequiredColumnsError ¶
  Bases: Exception
Raised when required columns are missing.
 ValidationError ¶
  Bases: Exception
Base class for validation errors.
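Each class above lists `Exception` as its base, so catching `ValidationError` alone would not, on that reading, catch the other four. A caller that wants to handle all of them can catch a tuple. The classes below are local stand-ins that mirror the documented names and bases, and `describe_failure` is a hypothetical helper.

```python
class DataValidationError(Exception):
    """Raised when data validation fails."""


class EmptyDataFrameError(Exception):
    """Raised when DataFrame is empty."""


class InvalidDateFormatError(Exception):
    """Raised when date parsing fails."""


class MissingRequiredColumnsError(Exception):
    """Raised when required columns are missing."""


class ValidationError(Exception):
    """Base class for validation errors."""


# The documented classes share no base other than Exception,
# so they are grouped explicitly.
DATA_ERRORS = (
    DataValidationError,
    EmptyDataFrameError,
    InvalidDateFormatError,
    MissingRequiredColumnsError,
    ValidationError,
)


def describe_failure(step):
    """Run a zero-argument callable and summarize any data error it raises."""
    try:
        step()
    except DATA_ERRORS as err:
        return f"{type(err).__name__}: {err}"
    return "ok"
```

Unrelated exceptions such as `KeyError` still propagate, which keeps genuine bugs visible.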
 loaders ¶
 Data loading utilities for MMM evaluation.
Classes¶
 DataLoader(data_path: str | Path) ¶
 Simple data loader for MMM evaluation.
Takes a data path and loads the data.
Initialize data loader with data path.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| data_path | str | Path | Path to the data file (CSV, Parquet, etc.) | required | 
Raises:
| Type | Description | 
|---|---|
| FileNotFoundError | If the data file does not exist. | 
Source code in mmm_eval/data/loaders.py
 load() -> pd.DataFrame ¶
 Load data from the specified path.
Returns:
| Type | Description | 
|---|---|
| DataFrame | Loaded DataFrame | 
Raises:
| Type | Description | 
|---|---|
| ValueError | If the file format is not supported. | 
Source code in mmm_eval/data/loaders.py
  pipeline ¶
 Data pipeline for MMM evaluation.
Classes¶
 DataPipeline(data: pd.DataFrame, framework: str, control_columns: list[str] | None, channel_columns: list[str], date_column: str, response_column: str, revenue_column: str, min_number_observations: int = DataPipelineConstants.MIN_NUMBER_OBSERVATIONS) ¶
 Data pipeline that orchestrates processing and validation.
Provides a simple interface to go from a raw DataFrame to a validated DataFrame.
Initialize data pipeline.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| data | DataFrame | DataFrame containing the data | required | 
| framework | str | Name of a supported framework | required | 
| control_columns | list[str] | None | List of control columns | required | 
| channel_columns | list[str] | List of channel columns | required | 
| date_column | str | Name of the date column | required | 
| response_column | str | Name of the response column | required | 
| revenue_column | str | Name of the revenue column | required | 
| min_number_observations | int | Minimum required number of observations | MIN_NUMBER_OBSERVATIONS | 
Source code in mmm_eval/data/pipeline.py
 run() -> pd.DataFrame ¶
 Run the complete data pipeline: process → validate.
Returns: Validated and processed DataFrame.
Raises: Exceptions raised by the processing or validation steps.
Source code in mmm_eval/data/pipeline.py
  processor ¶
 Data processing utilities for MMM evaluation.
Classes¶
 DataProcessor(control_columns: list[str] | None, channel_columns: list[str], date_column: str = InputDataframeConstants.DATE_COL, response_column: str = InputDataframeConstants.RESPONSE_COL, revenue_column: str = InputDataframeConstants.MEDIA_CHANNEL_REVENUE_COL) ¶
 Simple data processor for MMM evaluation.
Handles data transformations like datetime casting, column renaming, etc.
Initialize data processor.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| control_columns | list[str] | None | List of control columns | required | 
| channel_columns | list[str] | List of channel columns | required | 
| date_column | str | Name of the date column to parse and rename | DATE_COL | 
| response_column | str | Name of the response column to parse and rename | RESPONSE_COL | 
| revenue_column | str | Name of the revenue column to parse and rename | MEDIA_CHANNEL_REVENUE_COL | 
Source code in mmm_eval/data/processor.py
 process(df: pd.DataFrame) -> pd.DataFrame ¶
 Process the DataFrame with configured transformations.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df | DataFrame | Input DataFrame | required | 
Returns:
| Type | Description | 
|---|---|
| DataFrame | Processed DataFrame | 
Raises:
| Type | Description | 
|---|---|
| MissingRequiredColumnsError | If the required columns are not present. | 
| InvalidDateFormatError | If the date column cannot be parsed. | 
Source code in mmm_eval/data/processor.py
  schemas ¶
   synth_data_generator ¶
 Generate synthetic data for testing.
Based on: https://www.pymc-marketing.io/en/stable/notebooks/mmm/mmm_example.html
Functions¶
 generate_meridian_data() ¶
 Load and process a Meridian-compatible dataset for E2E testing.
Returns DataFrame containing Meridian-compatible data with media channels, controls, and response variables
Source code in mmm_eval/data/synth_data_generator.py
  generate_pymc_data() ¶
 Generate synthetic MMM data for testing purposes.
Returns DataFrame containing synthetic MMM data with media channels, controls, and response variables
Source code in mmm_eval/data/synth_data_generator.py
 validation ¶
 Data validation for MMM evaluation.
Classes¶
 DataValidator(framework: str, date_column: str, response_column: str, revenue_column: str, control_columns: list[str] | None, min_number_observations: int = DataPipelineConstants.MIN_NUMBER_OBSERVATIONS) ¶
 Validator for MMM data with configurable validation rules.
Initialize validator with validation rules.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| framework | str | a supported framework, one of  | required | 
| date_column | str | Name of the date column | required | 
| response_column | str | Name of the response column | required | 
| revenue_column | str | Name of the revenue column | required | 
| control_columns | list[str] | None | List of control columns | required | 
| min_number_observations | int | Minimum required number of observations for time series CV | MIN_NUMBER_OBSERVATIONS | 
Source code in mmm_eval/data/validation.py
 run_validations(df: pd.DataFrame) -> None ¶
 Run all validations on the DataFrame.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df | DataFrame | Input DataFrame | required | 
Returns:
| Type | Description | 
|---|---|
| None | Validation result with all errors and warnings |