
Covariates

Covariates are central objects in respy. They can be used to implement complex structures for payoffs, exogenous processes, and choices. Since Eckstein-Keane-Wolpin models deal with dynamic human capital accumulation, the most basic version of any model includes the variables experience, period, and lagged_choice. These variables can always be used to define payoffs, determine transition probabilities of exogenous processes, and restrict choices. However, many economic applications require a richer set of variables that affect the decision problem of individuals. Covariates are the tool for imposing this additional structure on a model.

This guide provides a short overview of covariates and how they can be used to specify a model in respy.

[1]:
import pandas as pd
import respy as rp

Covariates are defined in the respy options using a nested dictionary. We will look at an example model here and demonstrate how to define covariates. The example model used is a basic Robinson Crusoe model. See the guide below for more information about this example model.

Tutorials: Find out more about the basic Robinson Crusoe economy in params, options, and simulation.
[2]:
params, options = rp.get_example_model("robinson_crusoe_basic", with_data=False)
options
[2]:
{'solution_draws': 100,
 'solution_seed': 456,
 'n_periods': 5,
 'simulation_agents': 1000,
 'simulation_seed': 132,
 'estimation_draws': 100,
 'estimation_seed': 100,
 'estimation_tau': 0.001,
 'interpolation_points': -1,
 'covariates': {'constant': '1'}}

In this very basic model, the only defined covariate is constant, which is assigned the constant value of 1. This covariate is then used in params to specify the payoff.

[3]:
params
[3]:
                                     value
category       name
delta          delta                  0.95
wage_fishing   exp_fishing            0.30
nonpec_fishing constant              -0.20
nonpec_hammock constant               2.00
shocks_sdcorr  sd_fishing             0.50
               sd_hammock             0.50
               corr_hammock_fishing   0.00

Covariates defining payoffs

The first part of this guide covers the use of covariates in complex payoff structures. As the params show, the non-pecuniary reward for choosing to relax in the hammock is 1 * 2 = 2: the covariate constant, which equals 1, times its return of 2. In general, the payoff contribution of a covariate is the value of the covariate times its return, which is given in the value column. This simple example makes clear that covariates must evaluate to numbers or to booleans, with booleans treated as 0 and 1. Let us now define two more complex covariates. Note that each new covariate is specified in terms of variables and covariates that are already defined.
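
To make the arithmetic concrete, the following minimal sketch multiplies covariate values by their returns and sums them within a payoff category. It uses plain pandas rather than respy internals; the `state` Series and the helper `payoff_index` are hypothetical names introduced only for this illustration.

```python
import pandas as pd

# Returns copied from the basic example's params: (category, name) -> value.
returns = pd.Series(
    {
        ("nonpec_fishing", "constant"): -0.2,
        ("nonpec_hammock", "constant"): 2.0,
    }
)

# Hypothetical covariate values for a single state.
state = pd.Series({"constant": 1})


def payoff_index(category):
    """Sum covariate value times return over all rows of one category."""
    coefficients = returns.loc[category]
    values = state[coefficients.index].astype(float)
    return float((values * coefficients).sum())


nonpec_hammock = payoff_index("nonpec_hammock")  # 1 * 2.0
nonpec_fishing = payoff_index("nonpec_fishing")  # 1 * -0.2

# A boolean covariate enters the product as 1 (True) or 0 (False).
bonus_if_true = True * 0.1
bonus_if_false = False * 0.1
```

A category with several rows would simply add one product per row, which is how the additional covariates defined below extend the fishing wage.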

[4]:
# Robinson gets a bonus for fishing once he has fished at least three times.
options["covariates"]["experienced_fisher"] = "exp_fishing > 2"
[5]:
# Now we can use this covariate to specify the payoff
params.loc[("wage_fishing", "experienced_fisher"), "value"] = 0.1
[6]:
# Now we can use the already existing covariate and define another one on top:
options["covariates"][
    "experienced_fisher_last_period"
] = "experienced_fisher & (lagged_choice_1 == 'fishing')"
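
As a rough illustration of why the order of definitions matters, the sketch below evaluates the covariate strings one after another on a small hypothetical table of states using `pandas.DataFrame.eval`. The use of `eval` and the toy `states` frame are assumptions made for this example, not a description of how respy processes covariates internally.

```python
import pandas as pd

# Hypothetical states; the column names mirror the variables used above.
states = pd.DataFrame(
    {
        "exp_fishing": [0, 3, 3, 4],
        "lagged_choice_1": ["hammock", "hammock", "fishing", "fishing"],
    }
)

# The same definitions as in options["covariates"], in definition order.
covariates = {
    "constant": "1",
    "experienced_fisher": "exp_fishing > 2",
    "experienced_fisher_last_period": (
        "experienced_fisher & (lagged_choice_1 == 'fishing')"
    ),
}

for name, definition in covariates.items():
    # Each definition may refer to columns created by earlier definitions.
    states[name] = states.eval(definition, engine="python")

# Booleans enter payoffs as 0 and 1.
as_numbers = states[list(covariates)].astype(int)
```

In this toy table, the second state has enough experience but a lagged choice of hammock, so only the first of the two new covariates is switched on for it.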
[7]:
# Note that when using lagged_choice you either have to specify the value in period 0,
# or respy assumes an equiprobable distribution over choices. Here we specify that all
# individuals chose hammock in period -1.
params.loc[("lagged_choice_1_hammock", "constant"), "value"] = 1
[8]:
# Now we can use this covariate to specify the payoff
params.loc[("wage_fishing", "experienced_fisher_last_period"), "value"] = 0.15
[9]:
# Now sort the params DataFrame to group all added payoffs
params.sort_values(by="category")
[9]:
                                                        value
category                name
delta                   delta                            0.95
lagged_choice_1_hammock constant                         1.00
nonpec_fishing          constant                        -0.20
nonpec_hammock          constant                         2.00
shocks_sdcorr           sd_fishing                       0.50
                        sd_hammock                       0.50
                        corr_hammock_fishing             0.00
wage_fishing            exp_fishing                      0.30
                        experienced_fisher               0.10
                        experienced_fisher_last_period   0.15
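
The pandas mechanics behind these cells can be sketched on a hand-built excerpt of params: assigning to a missing (category, name) label appends a row, and sorting by the index level groups it with its category. The excerpt below is rebuilt by hand for illustration, and `kind="mergesort"` is an added choice to keep the original order within each category.

```python
import pandas as pd

# Hand-built excerpt of the example params (values copied from above).
index = pd.MultiIndex.from_tuples(
    [
        ("delta", "delta"),
        ("wage_fishing", "exp_fishing"),
        ("nonpec_fishing", "constant"),
    ],
    names=["category", "name"],
)
params = pd.DataFrame({"value": [0.95, 0.30, -0.20]}, index=index)

# Assigning to a label that does not exist yet appends a new row.
params.loc[("wage_fishing", "experienced_fisher"), "value"] = 0.1
last_label_before_sorting = params.index[-1]

# Sorting by the index level "category" moves the new row next to the
# other wage_fishing rows; mergesort is stable, so ties keep their order.
sorted_params = params.sort_values(by="category", kind="mergesort")
```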
[10]:
simulate = rp.get_simulate_func(params, options)
df = simulate(params)
df
[10]:
                   Experience_Fishing  Lagged_Choice_1  Shock_Reward_Fishing  Meas_Error_Wage_Fishing  Shock_Reward_Hammock
Identifier Period
0          0                        0          hammock             -0.035035                        1              0.040965
           1                        1          fishing              0.074254                        1              1.506491
           2                        2          fishing             -0.354560                        1              1.185316
           3                        3          fishing             -0.109397                        1             -0.785877
           4                        4          fishing             -1.063705                        1              1.245234
...        ...                    ...              ...                   ...                      ...                   ...
999        0                        0          hammock              0.584099                        1              1.611990
           1                        1          fishing             -0.391274                        1              0.371305
           2                        2          fishing              0.394125                        1             -1.448981
           3                        3          fishing              0.531008                        1             -0.312350
           4                        4          fishing              1.367302                        1              1.117095

                   Meas_Error_Wage_Hammock  Dense_Key  Core_Index   Choice      Wage  ...
Identifier Period
0          0                             1          0           1  fishing  0.982635  ...
           1                             1          1           1  fishing  1.400917  ...
           2                             1          2           2  fishing  1.526107  ...
           3                             1          3           3  fishing  2.990083  ...
           4                             1          4           3  hammock       NaN  ...
...        ...                         ...        ...         ...      ...       ...  ...
999        0                             1          0           1  fishing  1.339169  ...
           1                             1          1           1  fishing  1.110003  ...
           2                             1          2           2  fishing  2.219013  ...
           3                             1          3           3  fishing  4.118561  ...
           4                             1          4           3  fishing  8.445646  ...

                   Nonpecuniary_Reward_Fishing  Wage_Fishing  Flow_Utility_Fishing  Value_Function_Fishing  Continuation_Value_Fishing
Identifier Period
0          0                              -0.2      0.982635              0.782635               10.464014                   10.190926
           1                              -0.2      1.400917              1.200917               10.068861                    9.334677
           2                              -0.2      1.526107              1.326107                8.785926                    7.852441
           3                              -0.2      2.990083              2.790083                7.282000                    4.728334
           4                              -0.2      2.504647              2.304647                2.304647                    0.000000
...        ...                             ...           ...                   ...                     ...                         ...
999        0                              -0.2      1.339169              1.139169               10.820549                   10.190926
           1                              -0.2      1.110003              0.910003                9.777947                    9.334677
           2                              -0.2      2.219013              2.019013                9.478831                    7.852441
           3                              -0.2      4.118561              3.918561                8.410478                    4.728334
           4                              -0.2      8.445646              8.245646                8.245646                    0.000000

                   Nonpecuniary_Reward_Hammock  Wage_Hammock  Flow_Utility_Hammock  Value_Function_Hammock  Continuation_Value_Hammock
Identifier Period
0          0                                 2           NaN              2.020483                9.380417                    7.747300
           1                                 2           NaN              2.753245                9.016741                    6.593153
           2                                 2           NaN              2.592658                7.637116                    5.309956
           3                                 2           NaN              1.607061                4.574511                    3.123631
           4                                 2           NaN              2.622617                2.622617                    0.000000
...        ...                             ...           ...                   ...                     ...                         ...
999        0                                 2           NaN              2.805995               10.165930                    7.747300
           1                                 2           NaN              2.185652                8.449147                    6.593153
           2                                 2           NaN              1.275510                6.319968                    5.309956
           3                                 2           NaN              1.843825                4.811274                    3.123631
           4                                 2           NaN              2.558547                2.558547                    0.000000

5000 rows × 22 columns
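
As a consistency check on the displayed rows, the sketch below recomputes Wage_Fishing for individual 0 from the covariates and the returns in params. It assumes that the log wage equals the covariate-weighted index plus sd_fishing times Shock_Reward_Fishing; this functional form is inferred from the numbers shown above, not taken from the respy documentation, and the hand-copied `rows` frame with lowercase column names is only an excerpt made for this example.

```python
import numpy as np
import pandas as pd

# Excerpt of the simulated data for Identifier 0 (values copied from above).
rows = pd.DataFrame(
    {
        "exp_fishing": [0, 1, 2, 3, 4],
        "lagged_choice_1": [
            "hammock", "fishing", "fishing", "fishing", "fishing",
        ],
        "shock_fishing": [-0.035035, 0.074254, -0.354560, -0.109397, -1.063705],
        "wage_fishing": [0.982635, 1.400917, 1.526107, 2.990083, 2.504647],
    }
)

# Covariates as defined in options["covariates"].
experienced = rows["exp_fishing"] > 2
experienced_last = experienced & (rows["lagged_choice_1"] == "fishing")

# Covariate values times the returns in the wage_fishing category.
index = 0.30 * rows["exp_fishing"] + 0.10 * experienced + 0.15 * experienced_last

# Assumption: log wage = index + sd_fishing * shock, with sd_fishing = 0.5.
recomputed = np.exp(index + 0.5 * rows["shock_fishing"])
max_abs_error = float((recomputed - rows["wage_fishing"]).abs().max())
```

Under this assumption the recomputed wages agree with the displayed ones up to the rounding of the printed shocks. Both bonuses switch on from period 3 onward, when experience reaches three and the lagged choice is fishing.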

Covariates defining exogenous process probabilities

Covariates can also be used to define complex probability structures in respy. For more information, see “Processes with increasing characteristics” in the tutorial for exogenous processes.

Tutorials: Find out more about exogenous processes.