
Observables#

In the tutorial on params, options, and simulation, we simulated a population of identical individuals: The difference in their behavior was solely due to different random shocks to the reward associated with a choice. In more realistic models, individuals can differ with respect to multiple characteristics, which need to be sampled at the start of the simulation. These characteristics can be:

Experience. Individuals can start with nonzero years of experience for some choice.
Lagged choices. The previous (lagged) choice in the first period can be restricted to a subset of all choices in the model.
Observables. An observed characteristic, which does not change over the time horizon of the model, differs across individuals in the population.

Taken together, the assumptions on these characteristics are called the initial conditions of a model. An initial condition is also called a seed value and determines the value of a variable in the first period of a dynamic system.

In this tutorial we will learn how to enrich our baseline Robinson Crusoe economy with observables: The simulated Robinsons will differ with respect to the conditions they experience on the island. These conditions enter the reward for a choice directly and can therefore lead to different conditional choice probabilities.

Similarly, in more realistic models, observables such as demographic characteristics or measures of ability need to be controlled for, as they may influence the agents’ behavior.

[1]:

%matplotlib inline

import pandas as pd
import respy as rp
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.graphics.mosaicplot import mosaic

# Plot style
sns.set_style("white")
sns.set_context("notebook", font_scale=1.5)

The model: a simple Robinson Crusoe economy, revisited#

We revisit the basic Robinson Crusoe economy. We add one observable characteristic to the baseline model, "Fishing_Grounds": Now Robinson can end up, with a certain probability, on the side of the island which has "poor" or "rich" fishing grounds. Experiencing rich fishing grounds affects the non-pecuniary reward for fishing:

\[N^f = \alpha^f + \zeta^f \unicode{x1D7D9}_{\{FG = "rich"\}}\]

The indicator function \(\unicode{x1D7D9}_{\{condition\}}\) takes value 1 when the condition is true and value 0 otherwise: Therefore, if Robinson finds himself in rich fishing grounds, his total non-pecuniary rewards from fishing will be equal to \(\alpha^f + \zeta^f\).
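
As a quick illustration of the formula above, the sketch below evaluates the non-pecuniary reward for both levels, using the parameter values this tutorial sets further down (\(\alpha^f = -0.2\) and \(\zeta^f = 0.3\)). The function name `nonpec_fishing` is hypothetical and not part of respy.

```python
def nonpec_fishing(fishing_grounds, alpha_f=-0.2, zeta_f=0.3):
    """Non-pecuniary reward for fishing: alpha^f + zeta^f * 1{FG == "rich"}.

    Hypothetical helper for illustration only; respy computes this internally.
    """
    indicator = 1 if fishing_grounds == "rich" else 0
    return alpha_f + zeta_f * indicator


for level in ("rich", "poor"):
    # Rounding hides floating-point noise in -0.2 + 0.3.
    print(level, round(nonpec_fishing(level), 2))
```

The two values, 0.1 for rich and -0.2 for poor fishing grounds, are the ones that appear in the Nonpecuniary_Reward_Fishing column of the simulated data later in this tutorial.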

Tutorials: Find out more about the basic Robinson Crusoe economy in params, options, and simulation.

Specification: `params` and `options`#

To introduce observables we need to modify both params and options. The observable needs to be identified by the keyword observable_*_*, while we can use labels to identify its levels (in this case, "rich" and "poor"). Everything after the last underscore is considered to be the level’s label.
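
The naming rule can be sketched in a few lines of plain Python. This is only an illustration of the convention described above, not respy's own parser; the helper `split_observable_key` is a hypothetical name.

```python
def split_observable_key(key):
    """Split 'observable_<name>_<level>' into (name, level).

    Everything after the last underscore is treated as the level label.
    Hypothetical helper; respy's internal parsing may differ in detail.
    """
    prefix = "observable_"
    if not key.startswith(prefix):
        raise ValueError(f"{key!r} does not start with {prefix!r}")
    name, sep, level = key[len(prefix):].rpartition("_")
    if not sep or not name or not level:
        raise ValueError(f"{key!r} needs both a name and a level")
    return name, level


print(split_observable_key("observable_fishing_grounds_rich"))
```

Because only the last underscore separates the level, the observable's name may itself contain underscores (here `fishing_grounds`), while a level label may not.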

First, we load the specifications of the basic model:

[2]:

params, options = rp.get_example_model("robinson_crusoe_basic", with_data=False)

Then, we add three additional rows to params, to specify:

The probability with which Robinson will find himself in rich and in poor fishing grounds;
The value of \(\zeta^f\), which here is set to be positive and constant.

respy allows for complex probability distributions of observables, which may, for instance, depend on other covariates. Throughout this tutorial, however, we assume that the observables’ probability distributions do not depend on any other information, and we add them to the model via a probability mass function: Each Robinson is randomly assigned to one side of the island according to the float specified under value in the row whose name is probability.

Note that the probabilities of all levels of an observable should sum to one. If that is not the case, respy will emit a warning and normalize the probabilities.
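
To make the normalization step concrete, here is a minimal sketch of what "warn and normalize" could look like. The function `normalize_probabilities`, its tolerance, and the warning text are assumptions made for illustration; they do not reproduce respy's actual code or message.

```python
import warnings


def normalize_probabilities(probs, tol=1e-9):
    """Rescale level probabilities so that they sum to one.

    Hypothetical sketch: warns when rescaling is needed, as the text describes.
    """
    total = sum(probs.values())
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    if abs(total - 1.0) > tol:
        warnings.warn(f"Probabilities sum to {total}, normalizing.")
        return {level: p / total for level, p in probs.items()}
    return dict(probs)


# Deliberately mis-specified input: the two levels sum to 1.2.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    normalized = normalize_probabilities({"rich": 0.6, "poor": 0.6})

print(normalized, len(caught))
```

Relying on this fallback hides typos in params, so it is safer to specify probabilities that already sum to one, as the next cell does.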

[3]:

params.loc[("observable_fishing_grounds_rich", "probability"), "value"] = 0.5
params.loc[("observable_fishing_grounds_poor", "probability"), "value"] = 0.5
params.loc[("nonpec_fishing", "rich_fishing_grounds"), "value"] = 0.3

[4]:

params

[4]:

		value
category	name
delta	delta	0.95
wage_fishing	exp_fishing	0.30
nonpec_fishing	constant	-0.20
nonpec_hammock	constant	2.00
shocks_sdcorr	sd_fishing	0.50
	sd_hammock	0.50
	corr_hammock_fishing	0.00
observable_fishing_grounds_rich	probability	0.50
observable_fishing_grounds_poor	probability	0.50
nonpec_fishing	rich_fishing_grounds	0.30

We also need to overwrite the covariates section of options to include which level of the observable is associated with a higher nonpecuniary reward for fishing:

[5]:

options["covariates"] = {
    "constant": "1",
    "rich_fishing_grounds": "fishing_grounds == 'rich'",
}

To how-to guide: Find out how to specify more complex distributions of observables in the how-to guide on Initial conditions.

Simulation#

We will now sample and simulate 1000 Robinsons, who will differ with respect to their "Fishing_Grounds" value. We will then let the decision rule from the solution of the model guide them for 5 periods, during which the "Fishing_Grounds" value assigned at the start of the simulation cannot change.
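
The sampling logic described above can be sketched without respy: each individual draws one level from the probability mass function before the first period and keeps it in every period. The function `sample_observable` and its row layout are hypothetical and only mimic the idea, not respy's implementation.

```python
import random


def sample_observable(pmf, n_individuals, n_periods, seed=0):
    """Draw one time-invariant level per individual and repeat it per period.

    Hypothetical sketch returning (identifier, period, level) tuples.
    """
    rng = random.Random(seed)
    levels = list(pmf)
    weights = [pmf[level] for level in levels]
    rows = []
    for identifier in range(n_individuals):
        # The draw happens once, before period 0.
        level = rng.choices(levels, weights=weights, k=1)[0]
        for period in range(n_periods):
            rows.append((identifier, period, level))
    return rows


rows = sample_observable({"rich": 0.5, "poor": 0.5}, n_individuals=1000, n_periods=5)
share_rich = sum(level == "rich" for _, period, level in rows if period == 0) / 1000
print(len(rows), share_rich)
```

With 1000 individuals the share assigned to rich fishing grounds should be close to, but generally not exactly, 0.5.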

[6]:

simulate = rp.get_simulate_func(params, options)
df = simulate(params)

Note that the new characteristic is displayed in a column of the resulting dataset:

[7]:

df.head(20)

[7]:

		Experience_Fishing	Fishing_Grounds	Shock_Reward_Fishing	Meas_Error_Wage_Fishing	Shock_Reward_Hammock	Meas_Error_Wage_Hammock	Choice	Wage	Discount_Rate	Present_Bias	Nonpecuniary_Reward_Fishing	Wage_Fishing	Flow_Utility_Fishing	Value_Function_Fishing	Continuation_Value_Fishing	Nonpecuniary_Reward_Hammock	Wage_Hammock	Flow_Utility_Hammock	Value_Function_Hammock	Continuation_Value_Hammock
Identifier	Period
0	0	0	rich	1.431303	1	0.515252	1	fishing	1.431303	0.95	1	0.1	1.431303	1.531303	10.784925	9.740654	2	NaN	2.515252	10.132237	8.017878
	1	1	rich	0.383519	1	0.529793	1	hammock	NaN	0.95	1	0.1	0.517697	0.617697	8.723622	8.532553	2	NaN	2.529793	8.988758	6.798911
	2	1	rich	0.950278	1	-0.189833	1	fishing	1.282740	0.95	1	0.1	1.282740	1.382740	6.354070	5.232979	2	NaN	1.810167	6.025253	4.436933
	3	2	rich	0.582585	1	-0.585088	1	fishing	1.061539	0.95	1	0.1	1.061539	1.161539	4.093131	3.085887	2	NaN	1.414912	3.822200	2.533988
	4	3	rich	1.680125	1	-0.108781	1	fishing	4.132441	0.95	1	0.1	4.132441	4.232441	4.232441	0.000000	2	NaN	1.891219	1.891219	0.000000
1	0	0	rich	1.419559	1	1.121115	1	fishing	1.419559	0.95	1	0.1	1.419559	1.519559	10.773181	9.740654	2	NaN	3.121115	10.738100	8.017878
	1	1	rich	2.408754	1	0.133023	1	fishing	3.251478	0.95	1	0.1	3.251478	3.351478	11.457404	8.532553	2	NaN	2.133023	8.591988	6.798911
	2	2	rich	0.655700	1	0.650588	1	fishing	1.194763	0.95	1	0.1	1.194763	1.294763	7.633632	6.672494	2	NaN	2.650588	7.621918	5.232979
	3	3	rich	0.464923	1	-0.308845	1	fishing	1.143526	0.95	1	0.1	1.143526	1.243526	5.014991	3.969963	2	NaN	1.691155	4.622748	3.085887
	4	4	rich	2.757647	1	-0.133189	1	fishing	9.155711	0.95	1	0.1	9.155711	9.255711	9.255711	0.000000	2	NaN	1.866811	1.866811	0.000000
2	0	0	rich	1.116904	1	-1.094805	1	fishing	1.116904	0.95	1	0.1	1.116904	1.216904	10.470526	9.740654	2	NaN	0.905195	8.522180	8.017878
	1	1	rich	0.896039	1	0.452955	1	fishing	1.209527	0.95	1	0.1	1.209527	1.309527	9.415452	8.532553	2	NaN	2.452955	8.911920	6.798911
	2	2	rich	0.461766	1	0.762777	1	hammock	NaN	0.95	1	0.1	0.841392	0.941392	7.280262	6.672494	2	NaN	2.762777	7.734107	5.232979
	3	2	rich	1.350840	1	0.571080	1	fishing	2.461392	0.95	1	0.1	2.461392	2.561392	5.492984	3.085887	2	NaN	2.571080	4.978368	2.533988
	4	3	rich	0.776213	1	0.410387	1	hammock	NaN	0.95	1	0.1	1.909176	2.009176	2.009176	0.000000	2	NaN	2.410387	2.410387	0.000000
3	0	0	poor	1.106631	1	-0.060911	1	fishing	1.106631	0.95	1	-0.2	1.106631	0.906631	9.296601	8.831548	2	NaN	1.939089	9.258196	7.704322
	1	1	poor	0.383690	1	-0.377365	1	fishing	0.517928	0.95	1	-0.2	0.517928	0.317928	7.736349	7.808865	2	NaN	1.622635	7.695893	6.392903
	2	2	poor	1.798205	1	-0.600881	1	fishing	3.276543	0.95	1	-0.2	3.276543	3.076543	8.943197	6.175425	2	NaN	1.399119	6.045155	4.890564
	3	3	poor	1.734778	1	-0.466337	1	fishing	4.266866	0.95	1	-0.2	4.266866	4.066866	7.597816	3.716790	2	NaN	1.533663	4.281438	2.892395
	4	4	poor	0.861123	1	-0.354589	1	fishing	2.859029	0.95	1	-0.2	2.859029	2.659029	2.659029	0.000000	2	NaN	1.645411	1.645411	0.000000

Robinson’s behavior is affected by the observable we introduced: The figure below shows that rich fishing grounds lead to higher engagement in fishing.

[8]:

fig, ax = plt.subplots(1, 2, figsize=(14, 5))

for i, observable in enumerate(["rich", "poor"]):
    df.query("Fishing_Grounds == @observable").groupby("Period").Choice.value_counts(
        normalize=True,
    ).unstack().plot.bar(width=0.4, stacked=True, rot=0, legend=False, ax=ax[i])
    ax[i].set_title("Fishing grounds: " + observable, pad=10)
    ax[i].xaxis.label.set_visible(False)

plt.legend(loc="lower center", bbox_to_anchor=(-0.15, -0.3), ncol=2)
plt.suptitle("Robinson's choices by period", y=1.05)

plt.show()

[Figure: Robinson's choices by period, with one panel each for rich and poor fishing grounds]

Multiple observables#

On top of "Fishing_Grounds" we now add a second observable, "Cicadas", which also has two evenly distributed levels: "many" or "few". Ending up on a side of the island where many cicadas live affects, this time negatively, the non-pecuniary reward for relaxing on the hammock:

\[N^h = \alpha^h + \zeta^h \unicode{x1D7D9}_{\{C = "many"\}}\]

where \(\zeta^h < 0\). The intuition is simple: Robinson finds it less pleasant to spend time on his hammock when he is surrounded by many noisy cicadas.
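
As a worked example with the values used in this tutorial (\(\alpha^h = 2.00\) from the baseline params and \(\zeta^h = -0.15\), set in the next cell), the two possible rewards are:

\[N^h = \begin{cases} 2.00 - 0.15 = 1.85 & \text{if } C = \text{"many"} \\ 2.00 & \text{if } C = \text{"few"} \end{cases}\]

The value 1.85 is the one reported in the Nonpecuniary_Reward_Hammock column of the simulated data for a Robinson surrounded by many cicadas.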

We again modify params and options to include this new characteristic:

[9]:

params.loc[("observable_cicadas_few", "probability"), "value"] = 0.5
params.loc[("observable_cicadas_many", "probability"), "value"] = 0.5
params.loc[("nonpec_hammock", "many_cicadas"), "value"] = -0.15

[10]:

options["covariates"] = {
    "constant": "1",
    "rich_fishing_grounds": "fishing_grounds == 'rich'",
    "many_cicadas": "cicadas == 'many'",
}

When inspecting a simulated dataset, we can see that the observable "Cicadas" now has its own column:

[11]:

simulate = rp.get_simulate_func(params, options)
df_eq = simulate(params)

[12]:

df_eq.head()

[12]:

		Experience_Fishing	Cicadas	Fishing_Grounds	Shock_Reward_Fishing	Meas_Error_Wage_Fishing	Shock_Reward_Hammock	Meas_Error_Wage_Hammock	Choice	Wage	Discount_Rate	...	Nonpecuniary_Reward_Fishing	Wage_Fishing	Flow_Utility_Fishing	Value_Function_Fishing	Continuation_Value_Fishing	Nonpecuniary_Reward_Hammock	Wage_Hammock	Flow_Utility_Hammock	Value_Function_Hammock	Continuation_Value_Hammock
Identifier	Period
0	0	0	many	poor	1.431303	1	0.515252	1	fishing	1.431303	0.95	...	-0.2	1.431303	1.231303	9.504631	8.708766	1.85	NaN	2.365252	9.274431	7.272819
	1	1	many	poor	0.383519	1	0.529793	1	hammock	NaN	0.95	...	-0.2	0.517697	0.317697	7.664871	7.733868	1.85	NaN	2.379793	8.217822	6.145294
	2	1	many	poor	0.950278	1	-0.189833	1	fishing	1.282740	0.95	...	-0.2	1.282740	1.082740	5.605257	4.760544	1.85	NaN	1.660167	5.500165	4.042104
	3	2	many	poor	0.582585	1	-0.585088	1	fishing	1.061539	0.95	...	-0.2	1.061539	0.861539	3.555348	2.835589	1.85	NaN	1.264912	3.464384	2.315234
	4	3	many	poor	1.680125	1	-0.108781	1	fishing	4.132441	0.95	...	-0.2	4.132441	3.932441	3.932441	0.000000	1.85	NaN	1.741219	1.741219	0.000000

5 rows × 21 columns

Note that Cicadas and Fishing_Grounds are independent, as we did not specify any additional constraint on their probability distribution.

We can decrease Robinson’s probability of experiencing many cicadas to show how the observables’ distribution changes.
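
Because the two observables are independent, the expected share of each combination is the product of the marginal probabilities. The sketch below computes these expected shares for the even case and for the 0.35/0.65 split set in the next cell; `joint_distribution` is a hypothetical helper, and simulated shares will deviate slightly from these values because of sampling noise.

```python
from itertools import product


def joint_distribution(fishing_grounds, cicadas):
    """Joint probabilities of two independent observables (product of marginals)."""
    return {
        (fg, c): p_fg * p_c
        for (fg, p_fg), (c, p_c) in product(fishing_grounds.items(), cicadas.items())
    }


fishing_grounds = {"poor": 0.5, "rich": 0.5}
even = joint_distribution(fishing_grounds, {"few": 0.5, "many": 0.5})
skewed = joint_distribution(fishing_grounds, {"few": 0.65, "many": 0.35})

for cell in even:
    print(cell, even[cell], skewed[cell])
```

In the even case every cell has probability 0.25. In the skewed case the cells with few cicadas have probability 0.325 and those with many cicadas 0.175, regardless of the fishing grounds.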

[13]:

params.loc[("observable_cicadas_many", "probability"), "value"] = 0.35
params.loc[("observable_cicadas_few", "probability"), "value"] = 0.65

[14]:

simulate = rp.get_simulate_func(params, options)
df_diff = simulate(params)

[15]:

fig, ax = plt.subplots(1, 2, figsize=(14, 5))

colors = ["#ff7f0e", "#70a8d0", "#ffb369", "#428dc1"]
observables = [
    ("poor", "few"),
    ("poor", "many"),
    ("rich", "few"),
    ("rich", "many"),
]
titles = ["Evenly distributed observables", "Many cicadas less likely"]

for i, df in enumerate([df_eq, df_diff]):

    crosstab = pd.crosstab(df["Fishing_Grounds"], df["Cicadas"], normalize="all")

    properties_dict = {}
    for observable, color in zip(observables, colors):
        properties = {observable: [color, "{:.1%}".format(crosstab.loc[observable])]}
        properties_dict.update(properties)

    mosaic(
        df,
        ["Fishing_Grounds", "Cicadas"],
        ax=ax[i],
        properties=lambda key: {"color": properties_dict[key][0],},
        labelizer=lambda key: properties_dict[key][1],
        gap=0.01,
    )

    ax[i].set_title(titles[i], pad=10)

ax[0].set_xlabel("Fishing Grounds", x=1.08)
ax[0].set_ylabel("Cicadas")

plt.suptitle("Distribution of observables", y=1.05)
plt.show()

[Figure: Distribution of observables, with one panel for evenly distributed observables and one for the case where many cicadas are less likely]

Moreover, we can investigate how the within-sample behavior of Robinson changes according to the fishing grounds and the number of cicadas that he experiences:

[16]:

fig, ax = plt.subplots(2, 2, figsize=(14, 10))

ax = ax.flatten()

plt.subplots_adjust(hspace=0.25)

for i, observable in enumerate(observables):
    (
        df_eq.query("Fishing_Grounds == @observable[0] and Cicadas == @observable[1]")
        .groupby("Period")
        .Choice.value_counts(normalize=True)
        .unstack()
        .plot.bar(width=0.4, stacked=True, rot=0, ax=ax[i], legend=False)
    )
    ax[i].xaxis.label.set_visible(False)
    ax[i].set_title(
        observable[0] + " fishing grounds, " + observable[1] + " cicadas", pad=10
    )

plt.legend(loc="right", bbox_to_anchor=(0.3, -0.2), ncol=2)
plt.suptitle("Robinson's choices by period")

plt.show()

[Figure: Robinson's choices by period, with one panel per combination of fishing grounds and cicadas]

The figure shows that different realizations of observables lead to different incentives for Robinson: His engagement in fishing decreases with poor fishing grounds or few cicadas, while it increases with rich fishing grounds and many cicadas.
