{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Measures of Income Mobility \n",
    "\n",
    "**Author: Wei Kang <weikang9009@gmail.com>, Serge Rey <sjsrey@gmail.com>**\n",
    "\n",
    "Income mobility could be viewed as a reranking pheonomenon where regions switch income positions while it could also be considered to be happening as long as regions move away from the previous income levels. The former is named absolute mobility and the latter relative mobility.\n",
    "\n",
    "This notebook introduces how to estimate income mobility measures from longitudinal income data using methods in `giddy`. Currently, five summary mobility estimators are implemented in `giddy.mobility`. All of them are Markov-based, meaning that they are closely related to the discrete Markov Chains methods introduced in [Markov Based Methods notebook](MarkovBasedMethods.ipynb). More specifically, each of them is derived from a transition probability matrix $P$. Whether the final estimate is absolute or reletive mobility depends on how the original continuous income data are discretized.\n",
    "\n",
    "The five Markov-based summary measures of mobility (Formby et al., 2004) are listed below:\n",
    "\n",
    "| Num| Measures        |      Symbol      | \n",
    "|-------------| :-------------: |:-------------:|\n",
    "|1| $M_P(P) = \\frac{m-\\sum_{i=1}^m p_{ii}}{m-1}$ | P |\n",
    "|2| $M_D(P) = 1- \\left|det(P)\\right|$   |D   |   \n",
    "|3| $M_{L2}(P) = 1-\\left|\\lambda_2 \\right|$| L2| \n",
    "|4| $M_{B1}(P) = \\frac{m-m \\sum_{i=1}^m \\pi_i P_{ii}}{m-1}$  |   B1      |   \n",
    "|5| $M_{B2}(P) = \\frac{1}{m-1} \\sum_{i=1}^m \\sum_{j=1}^m \\pi_i P_{ij} \\left|i-j \\right|$| B2| \n",
    "\n",
    "$\\pi$ is the inital income distribution. For any transition probability matrix with a quasi-maximal diagonal, all of these mobility measures take values on $[0,1]$. $0$ means immobility and $1$ perfect mobility. If the transition probability matrix takes the form of the identity matrix, every region is stuck in its current state implying complete immobility. On the contrary, when each row of $P$ is identical, current state is irrelevant to the probability of moving away to any class. Thus, the transition matrix with identical rows is considered perfect mobile. The larger the mobility estimate, the more mobile the regional income system is. However, it should be noted that these measures try to reveal mobility pattern from different aspects and are thus not comparable to each other. Actually the mean and variance of these measures are different.\n",
    "\n",
    "We implemented all the above five summary mobility measures in a single method `markov_mobility`. A parameter `measure` could be specified to select which measure to calculate. By default, the mobility measure 'P' will be estimated.\n",
    "\n",
    "```python\n",
    "def markov_mobility(p, measure=\"P\", ini=None)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\u001b[0;31mSignature:\u001b[0m \u001b[0mmobility\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmarkov_mobility\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mp\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmeasure\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'P'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mini\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
       "\u001b[0;31mDocstring:\u001b[0m\n",
       "Markov-based mobility index.\n",
       "\n",
       "Parameters\n",
       "----------\n",
       "p       : array\n",
       "          (k, k), Markov transition probability matrix.\n",
       "measure : string\n",
       "          If measure= \"P\",\n",
       "          :math:`M_{P} = \\frac{m-\\sum_{i=1}^m P_{ii}}{m-1}`;\n",
       "          if measure = \"D\",\n",
       "          :math:`M_{D} = 1 - |\\det(P)|`,\n",
       "          where :math:`\\det(P)` is the determinant of :math:`P`;\n",
       "          if measure = \"L2\",\n",
       "          :math:`M_{L2} = 1  - |\\lambda_2|`,\n",
       "          where :math:`\\lambda_2` is the second largest eigenvalue of\n",
       "          :math:`P`;\n",
       "          if measure = \"B1\",\n",
       "          :math:`M_{B1} = \\frac{m-m \\sum_{i=1}^m \\pi_i P_{ii}}{m-1}`,\n",
       "          where :math:`\\pi` is the initial income distribution;\n",
       "          if measure == \"B2\",\n",
       "          :math:`M_{B2} = \\frac{1}{m-1} \\sum_{i=1}^m \\sum_{\n",
       "          j=1}^m \\pi_i P_{ij} |i-j|`,\n",
       "          where :math:`\\pi` is the initial income distribution.\n",
       "ini     : array\n",
       "          (k,), initial distribution. Need to be specified if\n",
       "          measure = \"B1\" or \"B2\". If not,\n",
       "          the initial distribution would be treated as a uniform\n",
       "          distribution.\n",
       "\n",
       "Returns\n",
       "-------\n",
       "mobi    : float\n",
       "          Mobility value.\n",
       "\n",
       "Notes\n",
       "-----\n",
       "The mobility indices are based on :cite:`Formby:2004fk`.\n",
       "\n",
       "Examples\n",
       "--------\n",
       ">>> import numpy as np\n",
       ">>> import libpysal\n",
       ">>> import mapclassify as mc\n",
       ">>> from giddy.markov import Markov\n",
       ">>> from giddy.mobility import markov_mobility\n",
       ">>> f = libpysal.io.open(libpysal.examples.get_path(\"usjoin.csv\"))\n",
       ">>> pci = np.array([f.by_col[str(y)] for y in range(1929,2010)])\n",
       ">>> q5 = np.array([mc.Quantiles(y).yb for y in pci]).transpose()\n",
       ">>> m = Markov(q5)\n",
       "The Markov Chain is irreducible and is composed by:\n",
       "1 Recurrent class (indices):\n",
       "[0 1 2 3 4]\n",
       "0 Transient classes.\n",
       "The Markov Chain has 0 absorbing states.\n",
       ">>> m.p\n",
       "array([[0.91011236, 0.0886392 , 0.00124844, 0.        , 0.        ],\n",
       "       [0.09972299, 0.78531856, 0.11080332, 0.00415512, 0.        ],\n",
       "       [0.        , 0.10125   , 0.78875   , 0.1075    , 0.0025    ],\n",
       "       [0.        , 0.00417827, 0.11977716, 0.79805014, 0.07799443],\n",
       "       [0.        , 0.        , 0.00125156, 0.07133917, 0.92740926]])\n",
       "\n",
       "(1) Estimate Shorrock1 mobility index:\n",
       "\n",
       ">>> mobi_1 = markov_mobility(m.p, measure=\"P\")\n",
       ">>> print(\"{:.5f}\".format(mobi_1))\n",
       "0.19759\n",
       "\n",
       "(2) Estimate Shorrock2 mobility index:\n",
       "\n",
       ">>> mobi_2 = markov_mobility(m.p, measure=\"D\")\n",
       ">>> print(\"{:.5f}\".format(mobi_2))\n",
       "0.60685\n",
       "\n",
       "(3) Estimate Sommers and Conlisk mobility index:\n",
       "\n",
       ">>> mobi_3 = markov_mobility(m.p, measure=\"L2\")\n",
       ">>> print(\"{:.5f}\".format(mobi_3))\n",
       "0.03978\n",
       "\n",
       "(4) Estimate Bartholomew1 mobility index (note that the initial\n",
       "distribution should be given):\n",
       "\n",
       ">>> ini = np.array([0.1,0.2,0.2,0.4,0.1])\n",
       ">>> mobi_4 = markov_mobility(m.p, measure = \"B1\", ini=ini)\n",
       ">>> print(\"{:.5f}\".format(mobi_4))\n",
       "0.22777\n",
       "\n",
       "(5) Estimate Bartholomew2 mobility index (note that the initial\n",
       "distribution should be given):\n",
       "\n",
       ">>> ini = np.array([0.1,0.2,0.2,0.4,0.1])\n",
       ">>> mobi_5 = markov_mobility(m.p, measure = \"B2\", ini=ini)\n",
       ">>> print(\"{:.5f}\".format(mobi_5))\n",
       "0.04637\n",
       "\u001b[0;31mFile:\u001b[0m      ~/giddy/giddy/mobility.py\n",
       "\u001b[0;31mType:\u001b[0m      function"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import warnings\n",
    "\n",
    "with warnings.catch_warnings():\n",
    "    warnings.simplefilter(\"ignore\")\n",
    "    # ignore NumbaDeprecationWarning: gh-pysal/libpysal#560\n",
    "    from giddy import markov, mobility\n",
    "\n",
    "?mobility.markov_mobility"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### US income mobility example\n",
    "Similar to [Markov Based Methods notebook](MarkovBasedMethods.ipynb), we will demonstrate the usage of the mobility methods by an application to data on per capita incomes observed annually from 1929 to 2009 for the lower 48 US states."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "with warnings.catch_warnings():\n",
    "    warnings.simplefilter(\"ignore\")\n",
    "    # ignore NumbaDeprecationWarning: gh-pysal/libpysal#560\n",
    "    import libpysal\n",
    "\n",
    "import numpy as np\n",
    "import mapclassify as mc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The Markov Chain is irreducible and is composed by:\n",
      "1 Recurrent class (indices):\n",
      "[0 1 2 3 4]\n",
      "0 Transient classes.\n",
      "The Markov Chain has 0 absorbing states.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "array([[0.91011236, 0.0886392 , 0.00124844, 0.        , 0.        ],\n",
       "       [0.09972299, 0.78531856, 0.11080332, 0.00415512, 0.        ],\n",
       "       [0.        , 0.10125   , 0.78875   , 0.1075    , 0.0025    ],\n",
       "       [0.        , 0.00417827, 0.11977716, 0.79805014, 0.07799443],\n",
       "       [0.        , 0.        , 0.00125156, 0.07133917, 0.92740926]])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "income_path = libpysal.examples.get_path(\"usjoin.csv\")\n",
    "f = libpysal.io.open(income_path)\n",
    "# each column represents an state's income time series 1929-2010\n",
    "pci = np.array([f.by_col[str(y)] for y in range(1929, 2010)])\n",
    "# each row represents an state's income time series 1929-2010\n",
    "q5 = np.array([mc.Quantiles(y).yb for y in pci]).transpose()\n",
    "m = markov.Markov(q5)\n",
    "m.p"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After acquiring the estimate of transition probability matrix, we could call the method `markov_mobility` to estimate any of the five Markov-based summary mobility indice.\n",
    "\n",
    "### 1. Shorrock1's mobility measure\n",
    "\n",
    "\\begin{equation}\n",
    "M_{P} = \\frac{m-\\sum_{i=1}^m P_{ii}}{m-1}\n",
    "\\end{equation}\n",
    "\n",
    "```python\n",
    "measure = \"P\"\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.19758992000997844"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mobility.markov_mobility(m.p, measure=\"P\")"
   ]
  },
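  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a hand check of the value above, the formula can be evaluated with $m = 5$ and the diagonal of the transition matrix printed earlier. The entries are the displayed values, rounded to eight decimals, so only the leading digits are expected to agree:\n",
    "\n",
    "\\begin{equation}\n",
    "M_{P} = \\frac{5 - (0.91011236 + 0.78531856 + 0.78875 + 0.79805014 + 0.92740926)}{5-1} = \\frac{5 - 4.20964032}{4} \\approx 0.19759\n",
    "\\end{equation}"
   ]
  },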
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Shorroks2's mobility measure\n",
    "\n",
    "\\begin{equation}\n",
    "M_{D} = 1 - |\\det(P)|\n",
    "\\end{equation}\n",
    "\n",
    "```python\n",
    "measure = \"D\"\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.606848546236956"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mobility.markov_mobility(m.p, measure=\"D\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. Sommers and Conlisk's mobility measure\n",
    "\n",
    "\\begin{equation}\n",
    "M_{L2} = 1  - |\\lambda_2|\n",
    "\\end{equation}\n",
    "\n",
    "```python\n",
    "measure = \"L2\"\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.03978200230815976"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mobility.markov_mobility(m.p, measure=\"L2\")"
   ]
  },
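  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two spectral measures can be read backwards to recover properties of $P$. The sketch below uses only the rounded outputs shown in this notebook. Rearranging $M_{L2}$ gives the modulus of the second eigenvalue, and because the determinant of a matrix is the product of its eigenvalues, rearranging $M_{D}$ gives the product of all eigenvalue moduli:\n",
    "\n",
    "\\begin{equation}\n",
    "|\\lambda_2| = 1 - M_{L2} \\approx 1 - 0.03978 = 0.96022\n",
    "\\end{equation}\n",
    "\n",
    "\\begin{equation}\n",
    "|\\det(P)| = \\prod_{i=1}^{m} |\\lambda_i| = 1 - M_{D} \\approx 1 - 0.60685 = 0.39315\n",
    "\\end{equation}\n",
    "\n",
    "$M_{L2}$ depends on a single eigenvalue while $M_{D}$ aggregates all of them, which is one reason the two values differ so much for the same $P$."
   ]
  },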
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4. Bartholomew1's mobility measure\n",
    "\n",
    "\\begin{equation}\n",
    "M_{B1} = \\frac{m-m \\sum_{i=1}^m \\pi_i P_{ii}}{m-1}\n",
    "\\end{equation}\n",
    "\n",
    "$\\pi$: the inital income distribution\n",
    "\n",
    "```python\n",
    "measure = \"B1\"\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.2277675878319787"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pi = np.array([0.1, 0.2, 0.2, 0.4, 0.1])\n",
    "mobility.markov_mobility(m.p, measure=\"B1\", ini=pi)"
   ]
  },
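  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The value above can be reproduced by hand from $\\pi = (0.1, 0.2, 0.2, 0.4, 0.1)$ and the diagonal of the transition matrix printed earlier. The entries are the displayed, rounded values, so agreement is expected only in the leading digits:\n",
    "\n",
    "\\begin{equation}\n",
    "\\sum_{i=1}^{5} \\pi_i P_{ii} = 0.1(0.91011236) + 0.2(0.78531856) + 0.2(0.78875) + 0.4(0.79805014) + 0.1(0.92740926) \\approx 0.81778593\n",
    "\\end{equation}\n",
    "\n",
    "\\begin{equation}\n",
    "M_{B1} \\approx \\frac{5 - 5 \\times 0.81778593}{5-1} = \\frac{0.91107035}{4} \\approx 0.22777\n",
    "\\end{equation}\n",
    "\n",
    "With a uniform initial distribution, $\\pi_i = 1/m$, the term $m \\sum_i \\pi_i P_{ii}$ equals $\\sum_i P_{ii}$, so $M_{B1}$ coincides with $M_{P}$."
   ]
  },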
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5. Bartholomew2's mobility measure\n",
    "\n",
    "\\begin{equation}\n",
    "M_{B2} = \\frac{1}{m-1} \\sum_{i=1}^m \\sum_{j=1}^m \\pi_i P_{ij} |i-j|\n",
    "\\end{equation}\n",
    "\n",
    "$\\pi$: the inital income distribution\n",
    "\n",
    "```python\n",
    "measure = \"B1\"\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.04636660119478926"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pi = np.array([0.1, 0.2, 0.2, 0.4, 0.1])\n",
    "mobility.markov_mobility(m.p, measure=\"B2\", ini=pi)"
   ]
  },
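  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A hand check of the value above, using the rounded transition matrix printed earlier. The symbol $d_i$ is introduced here for illustration only; it is the expected absolute class jump for a region starting in class $i$:\n",
    "\n",
    "\\begin{equation}\n",
    "d_i = \\sum_{j=1}^{5} P_{ij} |i-j|, \\qquad d \\approx (0.09113608,\\ 0.21883655,\\ 0.21375,\\ 0.20612813,\\ 0.07384229)\n",
    "\\end{equation}\n",
    "\n",
    "\\begin{equation}\n",
    "M_{B2} = \\frac{1}{5-1} \\sum_{i=1}^{5} \\pi_i d_i \\approx \\frac{0.00911361 + 0.04376731 + 0.04275 + 0.08245125 + 0.00738423}{4} = \\frac{0.18546640}{4} \\approx 0.04637\n",
    "\\end{equation}\n",
    "\n",
    "Unlike $M_{B1}$, this measure weights each transition by the number of classes crossed, so the rare two-class jumps count double."
   ]
  },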
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Next steps\n",
    "\n",
    "* Markov-based partial mobility measures\n",
    "* Other mobility measures:\n",
    "    * Inequality reduction mobility measures (Trede, 1999)\n",
    "* Statistical inference for mobility measures\n",
    "\n",
    "## References\n",
    "\n",
    "* Formby, J. P., W. J. Smith, and B. Zheng. 2004. “[Mobility Measurement, Transition Matrices and Statistical Inference](http://www.sciencedirect.com/science/article/pii/S0304407603002112).” Journal of Econometrics 120 (1). Elsevier: 181–205.\n",
    "* Trede, Mark. 1999. “[Statistical Inference for Measures of Income Mobility / Statistische Inferenz Zur Messung Der Einkommensmobilität](https://www.jstor.org/stable/23812388).” Jahrbücher Für Nationalökonomie Und Statistik / Journal of Economics and Statistics 218 (3/4). Lucius & Lucius Verlagsgesellschaft mbH: 473–90."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:py311_giddy]",
   "language": "python",
   "name": "conda-env-py311_giddy-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}