Configuration¶
mmm-eval uses framework-specific configurations to control model parameters, fitting settings, and data mappings. This guide explains how to create and use configurations for the PyMC-Marketing framework.
Creating Configurations¶
There are two ways to create a configuration (config):
- From a model object (preferred).
Info
See the Meridian notebook in the examples/
directory for a walkthrough of how to do this with Meridian.
from pymc_marketing.mmm import MMM, GeometricAdstock, LogisticSaturation
from mmm_eval.configs import PyMCConfig
# Create a PyMC model
model = MMM(
date_column="date_week",
channel_columns=["channel_1", "channel_2"],
adstock=GeometricAdstock(l_max=4),
saturation=LogisticSaturation(),
yearly_seasonality=2
)
# Create configuration
config = PyMCConfig.from_model_object(
model_object=model,
fit_kwargs={"target_accept": 0.9, "draws": 100, "chains": 2},
revenue_column="revenue",
response_column="quantity"
)
# Save to JSON if you want
config.save_model_object_to_json("./", "my_config")
Meridian Config Creation
Option 2 is not supported for Meridian models due to the complexity of the model objects.
{
"pymc_model_config": {
"date_column": "date_week",
"channel_columns": ["channel_1", "channel_2"],
"adstock": "GeometricAdstock(l_max=4)",
"saturation": "LogisticSaturation()",
},
"fit_config": {
"target_accept": 0.9,
"chains": 2,
},
"revenue_column": "revenue",
}
If you want to run mmm-eval from the CLI, you will need to save the config saved to a JSON file. We recommend Option 1 above to avoid any errors related to improper stringifiation of the model object in the manual approach.
Using a Configuration¶
If you have the config created from the model object, you can pass that directly to the evaluation suite (see Quick Start).
Alternately, if you have the config saved to a JSON, you can pass the filepath via the CLI.
mmm-eval --input-data-path data.csv --framework pymc-marketing --config-path config.json --output-path results/
If you have a saved config and you're in a notebook, you can load the config from the path, then run the evaluation.
new_config = PyMCConfig.load_model_config_from_json("path/to/config.json")
results = run_evaluation(new_config, ...)
PyMC-Marketing Configuration Structure¶
A PyMC-Marketing configuration file has the following structure:
{
"pymc_model_config": {
"date_column": "date_week",
"channel_columns": ["channel_1", "channel_2"],
"adstock": "GeometricAdstock(l_max=4)",
"saturation": "LogisticSaturation()",
},
"fit_config": {
// Optional
},
"revenue_column": "revenue",
}
Model Configuration (pymc_model_config)¶
The pymc_model_config
section defines the PyMC model structure and parameters:
Required Fields¶
date_column¶
The column name containing the date/time variable.
channel_columns¶
List of column names for marketing channels (media spend).
adstock¶
The adstock transformation to apply to media channels. Must be a valid member of the AdstockTransformation class.
saturation¶
The saturation transformation to apply to media channels. Must be valid member of the SaturationTransformation class.
Optional Fields¶
The set of optional inputs matches the optional inputs to the MMM class in PyMC.
Fit Configuration (fit_config)¶
The fit_config
section defines the MCMC sampling parameters. These parameters will be passed to .fit()
. Note, we do not require the X
and y
inputs as we derive those from the data you provide. Therefore, all parameters in this config are optional.
Sampling Parameters¶
draws¶
Number of posterior samples to draw.
tune¶
Number of tuning (warm-up) steps.
chains¶
Number of MCMC chains to run.
target_accept¶
Target acceptance rate for the sampler (0.0 to 1.0).
random_seed¶
Random seed for reproducibility.
progress_bar¶
Whether to display the progress bar (default: false).
return_inferencedata¶
Whether to return arviz.InferenceData (default: false).
Data Mapping Configuration¶
revenue_column¶
The column name containing revenue data for ROI calculations.
response_column¶
The column name containing the target variable (optional, defaults to revenue if not provided).
Meridian Configuration Structure¶
A Meridian configuration file has the following structure:
{
"input_data_builder_config": {
"date_column": "date_week",
"media_channels": ["tv", "digital", "print"],
"channel_spend_columns": ["tv_spend", "digital_spend", "print_spend"],
"response_column": "sales"
},
"model_spec_config": {
"prior": "PriorDistribution(...)",
"media_effects_dist": "log_normal",
"max_lag": 8
},
"sample_posterior_config": {
"n_chains": 4,
"n_keep": 1000
},
"revenue_column": "revenue"
}
Input Data Builder Configuration (input_data_builder_config)¶
The input_data_builder_config
section defines how to construct the data object for the Meridian model:
Required Fields¶
date_column¶
The column name containing the date/time variable.
media_channels¶
List of media channel names (not column names).
channel_spend_columns¶
List of column names for media channel spend variables.
{
"input_data_builder_config": {
"channel_spend_columns": ["tv_spend", "digital_spend", "print_spend"]
}
}
response_column¶
The column name containing the target variable.
Optional Fields¶
channel_impressions_columns¶
List of column names for media channel impressions variables.
{
"input_data_builder_config": {
"channel_impressions_columns": ["tv_impressions", "digital_impressions", "print_impressions"]
}
}
Mutual Exclusion
channel_impressions_columns
and channel_reach_columns
cannot both be provided.
channel_reach_columns¶
List of column names for media channel reach variables.
{
"input_data_builder_config": {
"channel_reach_columns": ["tv_reach", "digital_reach", "print_reach"]
}
}
Paired Fields
channel_reach_columns
and channel_frequency_columns
must be provided together (both or neither).
channel_frequency_columns¶
List of column names for media channel frequency variables.
{
"input_data_builder_config": {
"channel_frequency_columns": ["tv_frequency", "digital_frequency", "print_frequency"]
}
}
organic_media_columns¶
List of column names for organic media variables.
Paired Fields
organic_media_columns
and organic_media_channels
must be provided together (both or neither).
organic_media_channels¶
List of channel names for organic media variables.
non_media_treatment_columns¶
List of column names for non-media treatment variables.
{
"input_data_builder_config": {
"non_media_treatment_columns": ["price_promotion", "seasonal_event"]
}
}
control_columns¶
List of column names for control variables.
{
"input_data_builder_config": {
"control_columns": ["price", "competitor_activity", "economic_indicator"]
}
}
Model Specification Configuration (model_spec_config)¶
The model_spec_config
section defines the Meridian model structure and parameters:
Required Fields¶
prior¶
The prior distribution configuration. This is a complex object that defines the prior distributions for model parameters.
Optional Fields¶
The set of optional fields matches the optional inputs to the Meridian ModelSpec object.
Sample Posterior Configuration (sample_posterior_config)¶
The sample_posterior_config
section defines the MCMC sampling parameters for Meridian models:
Sampling Parameters¶
n_chains¶
Number of MCMC chains to run (default: 4).
n_adapt¶
Number of adaptation steps (default: 500).
n_burnin¶
Number of burn-in steps (default: 500).
n_keep¶
Number of posterior samples to keep (default: 1000).
Other optional arguments match the API of the Meridian.sample_posterior function
Example Configurations (PyMC-marketing)¶
Minimal Configuration¶
{
"pymc_model_config": {
"date_column": "date",
"channel_columns": ["tv_spend", "digital_spend"],
"adstock": "GeometricAdstock(l_max=4)",
"saturation": "LogisticSaturation()"
},
"fit_config": {
"target_accept": 0.9,
"draws": 100,
"tune": 50,
"chains": 2,
"random_seed": 42
},
"revenue_column": "revenue"
}
Advanced Configuration¶
{
"pymc_model_config": {
"date_column": "date_week",
"channel_columns": ["tv_spend", "digital_spend", "print_spend", "radio_spend"],
"control_columns": ["price", "seasonality", "holiday", "competitor_promo"],
"adstock": "WeibullAdstock(l_max=6)",
"saturation": "HillSaturation()",
"yearly_seasonality": 4,
"time_varying_intercept": true,
"time_varying_media": false
},
"fit_config": {
"target_accept": 0.95,
"draws": 2000,
"tune": 1000,
"chains": 4,
"random_seed": 123,
"progress_bar": true,
"return_inferencedata": true
},
"revenue_column": "revenue",
"response_column": "sales"
}
Example Configurations (Meridian)¶
Warning
As stated above, we strongly suggest you create the Meridian config objects programatically to ensure the config can be de-seralized properly.
Minimal Configuration¶
{
"input_data_builder_config": {
"date_column": "date_week",
"media_channels": ["tv", "digital"],
"channel_spend_columns": ["tv_spend", "digital_spend"],
"response_column": "sales"
},
"model_spec_config": {
"prior": "PriorDistribution(roi_m=LogNormal(0.2, 0.9))",
"media_effects_dist": "log_normal",
"max_lag": 8
},
"sample_posterior_config": {
"n_chains": 4,
"n_keep": 1000,
"n_adapt": 500,
"n_burnin": 500
},
"revenue_column": "revenue"
}
Advanced Configuration¶
{
"input_data_builder_config": {
"date_column": "date_week",
"media_channels": ["tv", "digital", "print", "radio"],
"channel_spend_columns": ["tv_spend", "digital_spend", "print_spend", "radio_spend"],
"channel_reach_columns": ["tv_reach", "digital_reach", "print_reach", "radio_reach"],
"channel_frequency_columns": ["tv_frequency", "digital_frequency", "print_frequency", "radio_frequency"],
"organic_media_columns": ["organic_search", "direct_traffic"],
"organic_media_channels": ["organic_search", "direct_traffic"],
"non_media_treatment_columns": ["price_promotion", "seasonal_event"],
"control_columns": ["price", "competitor_activity", "economic_indicator"],
"response_column": "sales"
},
"model_spec_config": {
"prior": "PriorDistribution(roi_m=LogNormal(0.2, 0.9), roi_rf=LogNormal(0.1, 0.5))",
"media_effects_dist": "log_normal",
"hill_before_adstock": true,
"max_lag": 12,
"unique_sigma_for_each_geo": true,
"media_prior_type": "roi",
"rf_prior_type": "roi",
"paid_media_prior_type": "roi",
"roi_calibration_period": [0.8, 1.2],
"rf_roi_calibration_period": [0.9, 1.1],
"organic_media_prior_type": "contribution",
"organic_rf_prior_type": "contribution",
"non_media_treatments_prior_type": "contribution",
"non_media_baseline_values": [1.0, 0.5],
"knots": 10,
"baseline_geo": "US",
"holdout_id": [1, 2, 3],
"control_population_scaling_id": [1, 2],
"non_media_population_scaling_id": [1, 2]
},
"sample_posterior_config": {
"n_chains": 4,
"n_adapt": 1000,
"n_burnin": 1000,
"n_keep": 2000,
"max_tree_depth": 12,
"max_energy_diff": 1000.0,
"unrolled_leapfrog_steps": 2,
"parallel_iterations": 20,
"seed": 123
},
"revenue_column": "revenue"
}
Configuration Best Practices¶
Sampling Parameters¶
- More draws: Use 1000+ draws for production models
- Multiple chains: Use 2-4 chains for reliable convergence
- Adequate tuning: Set
tune
to 50-100% of draws for complex models - Acceptance rate: Target 0.9-0.95 for optimal sampling efficiency
Performance Considerations¶
- Data size: Larger datasets require more sampling iterations
- Model complexity: More parameters increase computation time
- Hardware: More CPU cores can speed up multi-chain sampling
Configuration Validation¶
mmm-eval validates your configuration file and will raise errors for:
- Missing required fields
- Invalid field types
- Unsupported adstock or saturation functions
- Invalid parameter ranges
Next Steps¶
- Learn about Data for different data structures
- Explore Examples for configuration use cases
- Check the CLI Reference for command-line options