{ "cells": [ { "cell_type": "markdown", "id": "d9de8716", "metadata": { "colab_type": "text", "id": "view-in-github" }, "source": [ "# Gamma-Gamma Model\n", "\n", "In this notebook we show how to fit a Gamma-Gamma model in PyMC-Marketing and compare the results with the [`lifetimes`](https://github.com/CamDavidsonPilon/lifetimes) package (no longer maintained; its last meaningful update was in July 2020). The model is presented in the paper: Fader, P. S., & Hardie, B. G. (2013). [The Gamma-Gamma model of monetary value](http://www.brucehardie.com/notes/025/gamma_gamma.pdf). February, 2, 1-9." ] }, { "cell_type": "markdown", "id": "a579696d", "metadata": {}, "source": [ "## Prepare Notebook" ] }, { "cell_type": "code", "execution_count": null, "id": "813aa3e6", "metadata": {}, "outputs": [], "source": [ "import arviz as az\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "from lifetimes import GammaGammaFitter\n", "\n", "from pymc_marketing import clv\n", "\n", "# Plotting configuration\n", "az.style.use(\"arviz-darkgrid\")\n", "plt.rcParams[\"figure.figsize\"] = [10, 6]\n", "plt.rcParams[\"figure.dpi\"] = 100\n", "plt.rcParams[\"figure.facecolor\"] = \"white\"\n", "\n", "%load_ext autoreload\n", "%autoreload 2\n", "%config InlineBackend.figure_format = \"retina\"" ] }, { "cell_type": "markdown", "id": "b4e9df33", "metadata": {}, "source": [ "## Load Data\n", "\n", "We start by loading the `CDNOW` dataset." ] }, { "cell_type": "code", "execution_count": 2, "id": "4039ce96", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
 | frequency | recency | T | monetary_value
---|---|---|---|---
0 | 2 | 30.43 | 38.86 | 22.35
1 | 1 | 1.71 | 38.86 | 11.77
2 | 0 | 0.00 | 38.86 | 0.00
3 | 0 | 0.00 | 38.86 | 0.00
4 | 0 | 0.00 | 38.86 | 0.00
 | frequency | recency | T | monetary_value
---|---|---|---|---
0 | 2 | 30.43 | 38.86 | 22.35
1 | 1 | 1.71 | 38.86 | 11.77
5 | 7 | 29.43 | 38.86 | 73.74
6 | 1 | 5.00 | 38.86 | 11.77
8 | 2 | 35.71 | 38.86 | 25.55
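The two tables above differ only in which rows they keep: the rows with `frequency` equal to 0 (indices 2, 3, 4) are gone from the second one, while the original index labels are preserved. That is what a filter on returning customers would produce, and it fits the Gamma-Gamma model, which is fitted on customers with at least one repeat purchase. A minimal sketch of such a filter follows. The frame `raw` is rebuilt by hand from the five rows shown in the first table, and the names `raw` and `returning` are illustrative assumptions rather than the notebook's own variables.

```python
import pandas as pd

# Hand-copied from the first five rows displayed above (not the full dataset).
raw = pd.DataFrame(
    {
        "frequency": [2, 1, 0, 0, 0],
        "recency": [30.43, 1.71, 0.00, 0.00, 0.00],
        "T": [38.86, 38.86, 38.86, 38.86, 38.86],
        "monetary_value": [22.35, 11.77, 0.00, 0.00, 0.00],
    }
)

# Keep only customers with at least one repeat purchase.
# The original index labels are kept, so gaps appear where rows were dropped.
returning = raw.query("frequency > 0")

print(returning)
```

On these five rows only customers 0 and 1 survive, which matches the first two rows of the filtered table.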
 | monetary_value | frequency
---|---|---
monetary_value | 1.000000 | 0.113884
frequency | 0.113884 | 1.000000
 | coef | se(coef) | lower 95% bound | upper 95% bound
---|---|---|---|---
p | 6.248802 | 1.189687 | 3.917016 | 8.580589
q | 3.744588 | 0.290166 | 3.175864 | 4.313313
v | 15.447748 | 4.159994 | 7.294160 | 23.601336
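To make the fitted parameters concrete, the sketch below plugs the point estimates from the table above into the closed-form expressions of the Gamma-Gamma model from the Fader and Hardie note: the population mean spend is p·v / (q − 1), and a customer's expected mean spend is a weighted average of that population mean and the customer's observed mean transaction value. The function names are hypothetical helpers written for this illustration and do not belong to `lifetimes` or PyMC-Marketing.

```python
def population_mean_spend(p: float, q: float, v: float) -> float:
    """Population mean transaction value, p * v / (q - 1); requires q > 1."""
    if q <= 1:
        raise ValueError("The population mean is only finite for q > 1.")
    return p * v / (q - 1)


def expected_customer_spend(
    p: float, q: float, v: float, frequency: float, mean_value: float
) -> float:
    """Expected mean spend given a customer's frequency and observed mean value."""
    # Weight on the population mean shrinks as the customer's frequency grows.
    population_weight = (q - 1) / (p * frequency + q - 1)
    return (
        population_weight * population_mean_spend(p, q, v)
        + (1 - population_weight) * mean_value
    )


# Point estimates copied from the coefficient table above.
p, q, v = 6.248802, 3.744588, 15.447748

pop_mean = population_mean_spend(p, q, v)
# Customer 0 in the filtered data: frequency 2, monetary_value 22.35.
customer_0 = expected_customer_spend(p, q, v, frequency=2, mean_value=22.35)

print(f"population mean spend: {pop_mean:.2f}")
print(f"customer 0 expected spend: {customer_0:.2f}")
```

A customer with frequency 0 puts all the weight on the population mean, so every such customer receives the same estimate.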
chain | draw | p | q | v
---|---|---|---|---
0 | 0 | 6.248787 | 3.744591 | 15.447813
<xarray.Dataset> Size: 208kB
Dimensions: (chain: 4, draw: 2000)
Coordinates:
  * chain (chain) int64 32B 0 1 2 3
  * draw (draw) int64 16kB 0 1 2 3 4 5 6 ... 1994 1995 1996 1997 1998 1999
Data variables:
    p (chain, draw) float64 64kB 5.152 6.6 6.22 ... 6.451 5.998 5.995
    q (chain, draw) float64 64kB 3.742 3.767 3.896 ... 3.793 4.104 4.133
    v (chain, draw) float64 64kB 19.88 14.62 16.04 ... 15.13 17.79 18.07
Attributes:
    created_at: 2024-04-05T07:27:09.778837
    arviz_version: 0.17.1
    inference_library: pymc
    inference_library_version: 5.11.0
    sampling_time: 36.096750020980835
    tuning_steps: 1000
<xarray.Dataset> Size: 992kB
Dimensions: (chain: 4, draw: 2000)
Coordinates:
  * chain (chain) int64 32B 0 1 2 3
  * draw (draw) int64 16kB 0 1 2 3 4 ... 1996 1997 1998 1999
Data variables: (12/17)
    energy_error (chain, draw) float64 64kB -0.004048 ... 0.01401
    acceptance_rate (chain, draw) float64 64kB 1.0 0.9969 ... 0.9936
    step_size_bar (chain, draw) float64 64kB 0.07411 ... 0.08088
    perf_counter_start (chain, draw) float64 64kB 1.84e+04 ... 1.841e+04
    smallest_eigval (chain, draw) float64 64kB nan nan nan ... nan nan
    process_time_diff (chain, draw) float64 64kB 0.000842 ... 0.003041
    ... ...
    n_steps (chain, draw) float64 64kB 3.0 51.0 ... 79.0 15.0
    reached_max_treedepth (chain, draw) bool 8kB False False ... False False
    tree_depth (chain, draw) int64 64kB 2 6 5 4 5 6 ... 1 2 4 5 7 4
    lp (chain, draw) float64 64kB -4.053e+03 ... -4.051e+03
    largest_eigval (chain, draw) float64 64kB nan nan nan ... nan nan
    step_size (chain, draw) float64 64kB 0.05936 0.05936 ... 0.0734
Attributes:
    created_at: 2024-04-05T07:27:09.804289
    arviz_version: 0.17.1
    inference_library: pymc
    inference_library_version: 5.11.0
    sampling_time: 36.096750020980835
    tuning_steps: 1000
<xarray.Dataset> Size: 30kB
Dimensions: (index: 946)
Coordinates:
  * index (index) int64 8kB 0 1 5 6 8 ... 2348 2349 2353 2355
Data variables:
    customer_id (index) int64 8kB 0 1 5 6 8 ... 2348 2349 2353 2355
    mean_transaction_value (index) float64 8kB 22.35 11.77 ... 44.93 33.32
    frequency (index) int64 8kB 2 1 7 1 2 5 10 1 ... 1 2 7 1 2 5 4
 | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat
---|---|---|---|---|---|---|---|---|---
p | 6.351 | 1.333 | 4.142 | 8.819 | 0.034 | 0.024 | 1625.0 | 1973.0 | 1.0
q | 3.796 | 0.298 | 3.249 | 4.359 | 0.007 | 0.005 | 1850.0 | 2160.0 | 1.0
v | 16.311 | 4.428 | 8.575 | 24.606 | 0.111 | 0.079 | 1565.0 | 1947.0 | 1.0
 | mean | sd | hdi_3% | hdi_97%
---|---|---|---|---
x[0] | 24.733 | 0.528 | 23.779 | 25.706
x[1] | 19.064 | 1.351 | 16.490 | 21.478
x[2] | 35.188 | 0.931 | 33.546 | 37.002
x[3] | 35.188 | 0.931 | 33.546 | 37.002
x[4] | 35.188 | 0.931 | 33.546 | 37.002
x[5] | 71.350 | 0.628 | 70.233 | 72.528
x[6] | 19.064 | 1.351 | 16.490 | 21.478
x[7] | 35.188 | 0.931 | 33.546 | 37.002
x[8] | 27.337 | 0.405 | 26.606 | 28.091
x[9] | 35.188 | 0.931 | 33.546 | 37.002
 | mean | sd | hdi_3% | hdi_97%
---|---|---|---|---
x | 35.263 | 0.634 | 34.145 | 36.489