{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Unobserved Heterogeneity and Finite Mixture Models\n", "\n",
 "Unobserved heterogeneity is a concern in every econometric application. Keane and Wolpin (1997) face the problem that individuals at the age of sixteen report varying years of schooling. Neglecting the issue of measurement error, it is unlikely that the differences in initial schooling are caused by exogenous factors. Instead, the schooling decision is affected by a variety of endogenous factors such as parental investment, school and teacher quality, intrinsic motivation, and ability. Without correction, estimation methods fail to recover the true parameters.\n", "\n",
 "One solution would be to extend the model and incorporate the whole human capital investment process starting from the age at which schooling is zero. Although such a model would be extremely interesting, it is almost infeasible to handle that many factors in terms of modeling, computation, and data.\n", "\n",
 "Another solution is to employ individual fixed effects. Then, the state space comprises a dimension with as many unique values as there are individuals in the sample. Thus, the decision rules have to be computed separately for every individual over the whole state space, which is computationally infeasible.\n", "\n",
 "Keane and Wolpin (1997) resort to modeling unobserved heterogeneity with a finite mixture. A mixture model can be used to capture the presence of subpopulations (types) in the general population without requiring the observed data to identify an individual's group affiliation. In contrast to fixed effects, the number of subpopulations is much lower than the number of individuals. There is also no fixed and unique assignment to one subpopulation; instead, the relation is defined by a probability mass function.\n", "\n",
 "Each type has a preference for a particular choice, which is modeled by a constant in the utility functions. For working alternatives, $w$, the constant enters the log wage equation whereas for non-working alternatives, $n$, it enters the nonpecuniary reward. Note that **respy** allows for type-specific effects in every utility component. Keane and Wolpin (1997) call it an endowment with the symbol $e_{ak}$ for type $k$ and alternative $a$.\n", "\n",
 "$$\\begin{align}\n", " \\log(W(s_t, a_t)) = x^w\\beta^w + e_{ak} + \\epsilon_{at}\\\\\n", " N^n(s_t, a_t) = x^n\\beta^n + e_{ak} + \\epsilon_{at}\n", "\\end{align}$$\n", "\n",
 "To estimate the model parameters with maximum likelihood, the likelihood contribution of one individual, conditional on being of type $k$, is defined as the joint probability of choices and wages accumulated over time.\n", "\n",
 "$$\n", " P(\\{a_t\\}^T_{t=0} \\mid s^-_t, e_{ak}, W_t) =\n", " \\prod^T_{t = 0} p(a_t \\mid s^-_t, e_{ak}, W_t)\n", "$$\n", "\n",
 "We can weight the contribution of type $k$ with the probability of being that type to get the unconditional likelihood contribution of an individual.\n", "\n",
 "$$\n", " P(\\{a_t, W_t\\}^T_{t=0}) = \\sum^K_{k=1} \\pi_k\n", " P(\\{a_t\\}^T_{t=0} \\mid s^-_t, e_{ak}, W_t)\n", "$$\n", "\n",
 "To avoid misspecification of the likelihood, $\\pi_k$ must be a function of all individual characteristics which are determined before individuals enter the model horizon and are not the result of exogenous factors. The type-specific probability $\\pi_k = f(x^\\pi \\beta^\\pi_k)$ is calculated with the softmax function based on a vector of covariates $x^\\pi$ and a matrix of coefficients $\\beta^\\pi$ with one coefficient per type-covariate combination, where $\\beta^\\pi_k$ denotes the column of coefficients for type $k$.\n", "\n",
 "$$\n", " \\pi_k = f(x^\\pi \\beta^\\pi_k) =\n", " \\frac{\\exp{\\{x^\\pi \\beta^\\pi_k\\}}}{\\sum^K_{j=1} \\exp \\{x^\\pi \\beta^\\pi_j\\}}\n", "$$\n",
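 "\n",
 "For intuition, the following sketch evaluates the two formulas directly with NumPy. All numbers are purely illustrative and are not taken from any **respy** model: we assume three types and a single individual with hypothetical type-conditional likelihood contributions and hypothetical values of $x^\\pi \\beta^\\pi_k$.\n", "\n",
 "```python\n",
 "import numpy as np\n",
 "\n",
 "# Hypothetical values of x^pi beta^pi_k for k = 0, 1, 2 (type 0 is normalized to zero).\n",
 "coefficients = np.array([0.0, 0.5, -0.3])\n",
 "\n",
 "# The softmax function turns the coefficients into type probabilities pi_k that sum to one.\n",
 "probabilities = np.exp(coefficients) / np.exp(coefficients).sum()\n",
 "\n",
 "# Hypothetical type-conditional likelihood contributions P({a_t} | s_t, e_ak, W_t).\n",
 "conditional_contributions = np.array([0.02, 0.10, 0.05])\n",
 "\n",
 "# The unconditional contribution is the pi_k-weighted sum over the three types.\n",
 "unconditional_contribution = probabilities @ conditional_contributions\n",
 "```\n",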
 "\n",
 "To implement a finite mixture, we have to include $e_{ak}$ and $\\beta^\\pi$ in the parameters. As an example, we start with the basic Robinson Crusoe Economy. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import io\n", "import pandas as pd\n", "import respy as rp" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " value\n", "category name \n", "delta delta 0.95\n", "wage_fishing exp_fishing 0.30\n", "nonpec_fishing constant -0.20\n", "nonpec_hammock constant 2.00\n", "shocks_sdcorr sd_fishing 0.50\n", " sd_hammock 0.50\n", " corr_hammock_fishing 0.00" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "params, options = rp.get_example_model(\"robinson_crusoe_basic\", with_data=False)\n", "params" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We extend the model by allowing for different periods of experience in fishing at $t = 0$. Robinsons start with zero, one, or two periods of experience in fishing because of different tastes for fishing. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "initial_exp_fishing = \"\"\"\n", "category,name,value\n", "initial_exp_fishing_0,probability,0.33\n", "initial_exp_fishing_1,probability,0.33\n", "initial_exp_fishing_2,probability,0.34\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " value\n", "category name \n", "initial_exp_fishing_0 probability 0.33\n", "initial_exp_fishing_1 probability 0.33\n", "initial_exp_fishing_2 probability 0.34" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "initial_exp_fishing = pd.read_csv(io.StringIO(initial_exp_fishing), index_col=[\"category\", \"name\"])\n", "initial_exp_fishing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the next step, we add type-specific endowment effects $e_{ak}$. We assume that there exist three types and that the additional utility is increasing from the first to the third type. For computational simplicity, the benefit of the first type is normalized to zero such that all other types are measured relative to the first." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "endowments = \"\"\"\n", "category,name,value\n", "wage_fishing,type_1,0.2\n", "wage_fishing,type_2,0.4\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " value\n", "category name \n", "wage_fishing type_1 0.2\n", " type_2 0.4" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "endowments = pd.read_csv(io.StringIO(endowments), index_col=[\"category\", \"name\"])\n", "endowments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We assume no effect for choosing the hammock.\n", "\n",
 "At last, we need to specify the probability mass function which relates individuals to types. We simply assume that initial experience is positively correlated with a stronger taste for fishing. For a comprehensive overview on how to specify distributions with multinomial coefficients, see the guide on the [initial conditions](how_to_initial_conditions.ipynb). Note that the distribution is only specified for types 1 and 2; the coefficients for type 0 are left out for a parsimonious representation. In contrast to the initial experiences above, you cannot specify the distribution with simple probabilities because the assignment of types cannot be completely random. The following example is designed to specify a certain distribution and recover the pattern in the data. In reality, the distribution of unobservables is unknown.\n", "\n",
 "First, we define that Robinsons without prior experience are of type 0. Thus, we make the coefficients for types 1 and 2 extremely small. Robinsons with one period of prior experience are of type 1 with probability 0.66 and of type 2 with probability 0.33. For two periods of experience in fishing, the share of type 1 individuals is 0.33 and of type 2 is 0.66. The coefficients for types 1 and 2 are simply the logs of these probabilities.\n", "\n",
 "At last, we add a sufficiently large integer to all coefficients. The coefficient of type 0 is implicitly set to zero, so without a shift the distribution would also assign type 0 to individuals with one or two periods of experience in fishing. Shifting the coefficients of types 1 and 2 by a large positive value prevents this. At the same time, because both types are shifted by the same amount, the ratio of their shares under the softmax function is preserved. The sketch after the next cell illustrates the effect of the shift." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "type_probabilities = \"\"\"\n", "category,name,value\n", "type_1,initial_exp_fishing_0,-100\n", "type_1,initial_exp_fishing_1,-0.4055\n", "type_1,initial_exp_fishing_2,-1.0986\n", "type_2,initial_exp_fishing_0,-100\n", "type_2,initial_exp_fishing_1,-1.0986\n", "type_2,initial_exp_fishing_2,-0.4055\n", "\"\"\"" ] },
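{ "cell_type": "markdown", "metadata": {}, "source": [ "Before applying the shift in the next cell, here is a quick check of what it does to the implied type shares. This is only an illustration that evaluates the softmax formula by hand with NumPy; it is not part of the **respy** specification. We use the coefficients for one period of initial experience from the cell above, together with the implicit zero for type 0.\n", "\n",
 "```python\n",
 "import numpy as np\n",
 "\n",
 "def type_shares(coefficients):\n",
 "    # Plain softmax: exponentiate and normalize to shares that sum to one.\n",
 "    exponentials = np.exp(coefficients)\n",
 "    return exponentials / exponentials.sum()\n",
 "\n",
 "# Coefficients for exp_fishing == 1: type 0 (implicit), type 1, type 2.\n",
 "unshifted = np.array([0.0, -0.4055, -1.0986])\n",
 "\n",
 "# Without the shift, type 0 receives roughly half of the probability mass.\n",
 "print(type_shares(unshifted))  # approx. [0.5, 0.33, 0.17]\n",
 "\n",
 "# Shifting only the coefficients of types 1 and 2 by 10 drives the share of\n",
 "# type 0 towards zero while the ratio of type 1 to type 2 stays at 2 to 1.\n",
 "shifted = unshifted + np.array([0.0, 10.0, 10.0])\n",
 "print(type_shares(shifted))  # approx. [0.00005, 0.666, 0.333]\n",
 "```" ] },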
{ "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " value\n", "category name \n", "type_1 initial_exp_fishing_0 -90.0000\n", " initial_exp_fishing_1 9.5945\n", " initial_exp_fishing_2 8.9014\n", "type_2 initial_exp_fishing_0 -90.0000\n", " initial_exp_fishing_1 8.9014\n", " initial_exp_fishing_2 9.5945" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type_probabilities = pd.read_csv(io.StringIO(type_probabilities), index_col=[\"category\", \"name\"])\n", "type_probabilities += 10\n", "type_probabilities" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The covariates used for the probabilities are defined below." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'initial_exp_fishing_0': 'exp_fishing == 0',\n", " 'initial_exp_fishing_1': 'exp_fishing == 1',\n", " 'initial_exp_fishing_2': 'exp_fishing == 2'}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type_covariates = {\n", " \"initial_exp_fishing_0\": \"exp_fishing == 0\",\n", " \"initial_exp_fishing_1\": \"exp_fishing == 1\",\n", " \"initial_exp_fishing_2\": \"exp_fishing == 2\",\n", "}\n", "type_covariates" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the next step, we put all pieces together to get the complete model specification." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " value\n", "category name \n", "delta delta 0.9500\n", "wage_fishing exp_fishing 0.3000\n", "nonpec_fishing constant -0.2000\n", "nonpec_hammock constant 2.0000\n", "shocks_sdcorr sd_fishing 0.5000\n", " sd_hammock 0.5000\n", " corr_hammock_fishing 0.0000\n", "initial_exp_fishing_0 probability 0.3300\n", "initial_exp_fishing_1 probability 0.3300\n", "initial_exp_fishing_2 probability 0.3400\n", "wage_fishing type_1 0.2000\n", " type_2 0.4000\n", "type_1 initial_exp_fishing_0 -90.0000\n", " initial_exp_fishing_1 9.5945\n", " initial_exp_fishing_2 8.9014\n", "type_2 initial_exp_fishing_0 -90.0000\n", " initial_exp_fishing_1 8.9014\n", " initial_exp_fishing_2 9.5945" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "params = params.append([initial_exp_fishing, endowments, type_probabilities])\n", "params" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'solution_draws': 100,\n", " 'solution_seed': 456,\n", " 'n_periods': 5,\n", " 'simulation_agents': 10000,\n", " 'simulation_seed': 132,\n", " 'estimation_draws': 100,\n", " 'estimation_seed': 100,\n", " 'estimation_tau': 0.001,\n", " 'interpolation_points': -1,\n", " 'covariates': {'constant': '1',\n", " 'initial_exp_fishing_0': 'exp_fishing == 0',\n", " 'initial_exp_fishing_1': 'exp_fishing == 1',\n", " 'initial_exp_fishing_2': 'exp_fishing == 2'}}" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "options[\"covariates\"] = {**options[\"covariates\"], **type_covariates}\n", "options[\"simulation_agents\"] = 10_000\n", "options" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us simulate a dataset to see whether the distribution of types can be recovered from the data." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "simulate = rp.get_simulate_func(params, options)\n", "df = simulate(params)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Type 0 1 2\n", "Experience_Fishing \n", "0 1.000000 0.000000 0.000000\n", "1 0.000000 0.665548 0.334452\n", "2 0.000296 0.330278 0.669426" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.query(\"Period == 0\").groupby(\"Experience_Fishing\").Type.value_counts(normalize=True).unstack().fillna(0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We also know that types 1 and 2 experience a higher utility from choosing fishing. Here are the choice probabilities for each type." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Choice fishing hammock\n", "Type \n", "0 0.426571 0.573429\n", "1 0.992602 0.007398\n", "2 0.998036 0.001964" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.groupby(\"Type\").Choice.value_counts(normalize=True).unstack()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" } }, "nbformat": 4, "nbformat_minor": 4 }