Marketing Mix Modeling (MMM) explanation and implementation
Marketing Mix Modeling (MMM) is a powerful technique used to measure the effectiveness of various marketing activities and understand how they contribute to sales or other key performance indicators (KPIs). In this blog post, we’ll explore MMM from a machine learning perspective, providing in-depth explanations, code examples in Python, and insightful visualizations.
What is MMM?
MMM uses statistical models to quantify the impact of different marketing channels on sales. It helps answer questions like:
- What is the return on investment (ROI) of each marketing channel?
- How do different channels interact with each other?
- What is the optimal allocation of marketing budget across channels?
Data and Methodology
For this project, we’ll use a simulated dataset containing daily marketing spend for TV, radio, and social media advertising, along with the corresponding sales figures. We’ll then build a regression model to analyze the relationship between marketing activities and sales.
Note: While this example uses simulated data, the principles and techniques apply to real-world marketing datasets.
1. Import Libraries
import pandas as pd
import numpy as np
import altair as alt
import statsmodels.formula.api as sm
2. Generate and Prepare the Data
# Create a dataframe with the following columns:
# `Date`, `TV`, `Radio`, `Social_Media`, `Sales`
df = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05",
                            "2023-01-06", "2023-01-07", "2023-01-08", "2023-01-09", "2023-01-10",
                            "2023-01-11", "2023-01-12", "2023-01-13", "2023-01-14", "2023-01-15"]),
    "TV": [1000, 1200, 1100, 1300, 1400, 1500, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100],
    "Radio": [500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200],
    "Social_Media": [250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 525, 550, 575, 600],
    "Sales": [2500, 2700, 2900, 3100, 3300, 3500, 3700, 3900, 4100, 4300, 4500, 4700, 4900, 5100, 5300]
})
# Calculate the 7-day rolling averages for the `TV`, `Radio`, and `Social_Media` columns
"TV_7d_avg"] = df["TV"].rolling(window=7, center=True).mean()
df["Radio_7d_avg"] = df["Radio"].rolling(window=7, center=True).mean()
df["Social_Media_7d_avg"] = df["Social_Media"].rolling(window=7, center=True).mean()
df[
# Calculate the 7-day rolling averages for the `Sales` column
"Sales_7d_avg"] = df["Sales"].rolling(window=7, center=True).mean() df[
3. Visualize the Data
I haven’t used the Altair plotting library before, so I will try it out in this project.
# Plot the 7-day rolling averages for `TV`, `Radio`, and `Social_Media` advertising spends over time
base = alt.Chart(df).encode(x=alt.X('Date', axis=alt.Axis(title='Date')))

# Plot the 7-day rolling averages for `TV`
tv_plot = base.mark_line(color='blue').encode(
    y=alt.Y('TV_7d_avg', axis=alt.Axis(title='TV 7-day Avg')),
    tooltip=['Date', 'TV_7d_avg']
).properties(title='7-Day Rolling Average of TV Advertising Spend')
# Plot the 7-day rolling averages for `Radio`
radio_plot = base.mark_line(color='red').encode(
    y=alt.Y('Radio_7d_avg', axis=alt.Axis(title='Radio 7-day Avg')),
    tooltip=['Date', 'Radio_7d_avg']
).properties(title='7-Day Rolling Average of Radio Advertising Spend')
# Plot the 7-day rolling averages for `Social_Media`
social_media_plot = base.mark_line(color='green').encode(
    y=alt.Y('Social_Media_7d_avg', axis=alt.Axis(title='Social Media 7-day Avg')),
    tooltip=['Date', 'Social_Media_7d_avg']
).properties(title='7-Day Rolling Average of Social Media Advertising Spend')
# Combine the plots
combined_plot = tv_plot + radio_plot + social_media_plot

combined_plot.display()
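As an optional alternative (not part of the charts above), Altair’s transform_fold can reshape the three rolling-average columns into long form so a single layer gets a color legend; this is just a sketch of that approach:
# Optional alternative: fold the three spend columns into long form so Altair
# draws one line per channel with a color legend.
folded_plot = alt.Chart(df).transform_fold(
    ['TV_7d_avg', 'Radio_7d_avg', 'Social_Media_7d_avg'],
    as_=['Channel', 'Spend_7d_avg']
).mark_line().encode(
    x=alt.X('Date:T', axis=alt.Axis(title='Date')),
    y=alt.Y('Spend_7d_avg:Q', axis=alt.Axis(title='7-day Avg Spend')),
    color='Channel:N'
).properties(title='7-Day Rolling Average Spend by Channel')

folded_plot.display()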
# Plot the 7-day rolling averages for `Sales` over time
sales_plot = base.mark_line().encode(
    y=alt.Y('Sales_7d_avg', axis=alt.Axis(title='Sales 7-day Avg')),
    tooltip=['Date', 'Sales_7d_avg']
).properties(title='7-Day Rolling Average of Sales')

sales_plot.display()
# Create a scatter plot of `TV` spend against `Sales`
scatter_tv = alt.Chart(df).mark_point().encode(
    x=alt.X('TV', axis=alt.Axis(title='TV Spend')),
    y=alt.Y('Sales', axis=alt.Axis(title='Sales')),
    tooltip=['TV', 'Sales']
).properties(title='TV Spend vs. Sales')

scatter_tv.display()
# Create a scatter plot of `Radio` spend against `Sales`
scatter_radio = alt.Chart(df).mark_point().encode(
    x=alt.X('Radio', axis=alt.Axis(title='Radio Spend')),
    y=alt.Y('Sales', axis=alt.Axis(title='Sales')),
    tooltip=['Radio', 'Sales']
).properties(title='Radio Spend vs. Sales')

scatter_radio.display()
# Create a scatter plot of `Social_Media` spend against `Sales`
scatter_social_media = alt.Chart(df).mark_point().encode(
    x=alt.X('Social_Media', axis=alt.Axis(title='Social Media Spend')),
    y=alt.Y('Sales', axis=alt.Axis(title='Sales')),
    tooltip=['Social_Media', 'Sales']
).properties(title='Social Media Spend vs. Sales')

scatter_social_media.display()
4. Build and Evaluate the Model
Fit a regression model with `Sales` as the outcome and `TV`, `Radio`, and `Social_Media` as predictors.
# Fit a regression model
model = sm.ols("Sales ~ TV + Radio + Social_Media", data=df).fit()
# Print the model summary
print(model.summary())
OLS Regression Results
==============================================================================
Dep. Variable: Sales R-squared: 1.000
Model: OLS Adj. R-squared: 1.000
Method: Least Squares F-statistic: 3.026e+29
Date: Fri, 18 Oct 2024 Prob (F-statistic): 6.08e-173
Time: 07:57:22 Log-Likelihood: 372.97
No. Observations: 15 AIC: -739.9
Df Residuals: 12 BIC: -737.8
Df Model: 2
Covariance Type: nonrobust
================================================================================
coef std err t P>|t| [0.025 0.975]
--------------------------------------------------------------------------------
Intercept 500.0000 6.19e-12 8.07e+13 0.000 500.000 500.000
TV -1.72e-15 1.33e-14 -0.129 0.900 -3.08e-14 2.74e-14
Radio 3.2000 1.57e-14 2.04e+14 0.000 3.200 3.200
Social_Media 1.6000 7.83e-15 2.04e+14 0.000 1.600 1.600
==============================================================================
Omnibus: 2.048 Durbin-Watson: 0.014
Prob(Omnibus): 0.359 Jarque-Bera (JB): 1.554
Skew: 0.649 Prob(JB): 0.460
Kurtosis: 2.104 Cond. No. 1.00e+18
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The smallest eigenvalue is 5.05e-29. This might indicate that there are
strong multicollinearity problems or that the design matrix is singular.
/Users/fabianlanderos/miniforge3/envs/quarto/lib/python3.12/site-packages/scipy/stats/_stats_py.py:1971: UserWarning: kurtosistest only valid for n>=20 ... continuing anyway, n=15
k, _ = kurtosistest(a, axis)
# plot OLS model
df['Sales_pred'] = model.predict(df)
base = alt.Chart(df).encode(x=alt.X('Date', axis=alt.Axis(title='Date')))

sales_plot = base.mark_line(color='black').encode(
    y=alt.Y('Sales', axis=alt.Axis(title='Sales')),
    tooltip=['Date', 'Sales']
).properties(title='Actual Sales')

sales_pred_plot = base.mark_line(color='red').encode(
    y=alt.Y('Sales_pred', axis=alt.Axis(title='Sales')),
    tooltip=['Date', 'Sales_pred']
).properties(title='Predicted Sales')

combined_plot = sales_plot + sales_pred_plot
combined_plot.display()
# Plot the residuals
df['Residuals'] = model.resid

residuals_plot = base.mark_line(color='green').encode(
    y=alt.Y('Residuals', axis=alt.Axis(title='Residuals')),
    tooltip=['Date', 'Residuals']
).properties(title='Model Residuals')

residuals_plot.display()
5. Interpreting the Results
The regression model output provides valuable insights into the effectiveness of each marketing channel. Here’s how to interpret the key components:
- Coefficients: These indicate the estimated impact of each channel on sales. For example, a coefficient of 1.6 for social media suggests that, on average, a one-unit increase in social media spend is associated with a 1.6 unit increase in sales, holding other factors constant.
- R-squared: This measures the proportion of variance in sales explained by the model.
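To make the coefficient interpretation concrete, here is a small hypothetical check: two made-up spend scenarios that differ only by 10 units of social media spend should differ in predicted sales by roughly 1.6 × 10 = 16 units.
# Hypothetical spend scenarios (made-up numbers): only Social_Media changes, by 10
scenario = pd.DataFrame({
    "TV": [1500, 1500],
    "Radio": [800, 800],
    "Social_Media": [400, 410],
})
preds = model.predict(scenario)
print(preds[1] - preds[0])  # roughly 1.6 * 10 = 16 additional sales units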
Let’s dive deeper into the `sm.ols` model and the associated metrics.
Understanding `sm.ols`
`sm.ols` is a function from the `statsmodels` library in Python used for performing Ordinary Least Squares (OLS) regression. OLS is a common statistical method for estimating the relationship between a dependent variable (in our case, `Sales`) and one or more independent variables (our marketing channels: `TV`, `Radio`, and `Social_Media`).
The core idea of OLS is to find the line that best fits the data by minimizing the sum of the squared differences between the observed sales values and the values predicted by the line. This line represents the linear relationship between marketing spend and sales.
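As a rough sketch of that idea (not from the original post), we can build the design matrix by hand and ask NumPy for the least-squares solution; the resulting coefficients should closely match model.params.
# OLS by hand: intercept column plus the three spend columns, solved for the
# coefficients that minimize the sum of squared residuals.
X = np.column_stack([
    np.ones(len(df)),  # intercept
    df["TV"],
    df["Radio"],
    df["Social_Media"],
])
y = df["Sales"].to_numpy()
betas, *_ = np.linalg.lstsq(X, y, rcond=None)
print(betas)  # should be close to model.params (Intercept, TV, Radio, Social_Media)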
Key Metrics from the Model Summary
The `model.summary()` output provides a wealth of information to assess the model’s fit and interpret the results. Here are some of the key metrics (a short code sketch after this list shows how to read each of them programmatically):
- R-squared and Adjusted R-squared:
  - `R-squared` represents the proportion of variance in the dependent variable (`Sales`) that is explained by the independent variables. It ranges from 0 to 1, with higher values indicating a better fit.
  - `Adjusted R-squared` is a modified version of `R-squared` that adjusts for the number of predictors in the model. It penalizes the inclusion of unnecessary variables that don’t significantly improve the model’s explanatory power.
- Coefficients:
  - These are estimates of the effect of each independent variable on the dependent variable, holding other variables constant.
  - Each coefficient has an associated standard error, which measures the uncertainty in the estimate.
  - The `t`-statistic and `P>|t|` values help determine the statistical significance of each coefficient. A low `P>|t|` value (typically below 0.05) suggests that the predictor has a statistically significant impact on sales.
- F-statistic and Prob (F-statistic):
  - The `F`-statistic tests the overall significance of the regression model. It assesses whether at least one of the independent variables is significantly related to the dependent variable.
  - The `Prob (F-statistic)` is the p-value associated with the `F`-statistic. A low p-value indicates that the model is statistically significant.
- AIC and BIC:
  - `AIC` (Akaike Information Criterion) and `BIC` (Bayesian Information Criterion) are measures of model fit that penalize models with more parameters. Lower values of AIC and BIC indicate a better balance between model fit and complexity.
- Omnibus, Durbin-Watson, Jarque-Bera:
  - These are diagnostic tests to check the assumptions of the linear regression model.
  - `Omnibus` and `Jarque-Bera` test the normality of the residuals.
  - `Durbin-Watson` tests for autocorrelation in the residuals.
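All of these metrics can also be read directly from the fitted results object instead of parsing the printed summary; the attributes below are standard statsmodels OLS results attributes.
# Headline metrics straight from the fitted results object
print(model.rsquared, model.rsquared_adj)  # R-squared, Adjusted R-squared
print(model.params)                        # coefficients
print(model.bse)                           # standard errors
print(model.tvalues)                       # t-statistics
print(model.pvalues)                       # P>|t|
print(model.fvalue, model.f_pvalue)        # F-statistic, Prob (F-statistic)
print(model.aic, model.bic)                # AIC, BIC

# Residual diagnostics
from statsmodels.stats.stattools import durbin_watson, jarque_bera
print(durbin_watson(model.resid))          # autocorrelation in residuals
print(jarque_bera(model.resid))            # (JB statistic, p-value, skew, kurtosis)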
Interpreting the Results in the Context of MMM
In the context of MMM, these metrics help us understand the effectiveness of our marketing mix. For instance:
- Significant coefficients for `TV`, `Radio`, or `Social_Media` indicate that these channels have a measurable impact on sales.
- The magnitude of the coefficients helps compare the relative effectiveness of different channels.
- `R-squared` tells us how well our model captures the overall variation in sales.
- Diagnostic tests help ensure the reliability of our results.
By carefully analyzing these metrics, marketers can make informed decisions about budget allocation, campaign optimization, and overall marketing strategy.
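As a closing sketch (a simplified, hypothetical decomposition rather than a full attribution model), the fitted coefficients can be turned into rough per-channel contributions by multiplying each coefficient by that channel’s spend, with the intercept treated as a base term.
# Simplified contribution decomposition: coefficient * spend per channel,
# plus the intercept as a "base" term. This is a first-pass view only.
coefs = model.params
contributions = pd.DataFrame({
    "Base": coefs["Intercept"] * np.ones(len(df)),
    "TV": coefs["TV"] * df["TV"],
    "Radio": coefs["Radio"] * df["Radio"],
    "Social_Media": coefs["Social_Media"] * df["Social_Media"],
})
print(contributions.sum())                      # total modeled contribution per term
print(contributions.sum() / df["Sales"].sum())  # share of total sales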