respy.interpolate

This module contains the code for approximate solutions to the discrete choice dynamic programming (DCDP) model.

Module Contents

Functions

kw_94_interpolation(state_space, ...)

Calculate the approximate solution proposed by [1].

_get_seeds_to_create_not_interpolate_indicator(...)

Get seeds for each dense index to mask the states which are not interpolated.

_split_interpolation_points_evenly(...)

Split the number of interpolated states evenly across dense dimensions.

_get_not_interpolated_indicator(interpolation_points, ...)

Get an indicator for states which will not be interpolated.

_compute_expected_shocks(...)

Compute an array with the expected value of the shocks.

_compute_rhs_variables(wages, nonpec, ...)

Compute right-hand side variables of the linear model.

_compute_lhs_variable(wages, nonpec, ...)

Calculate left-hand side variable for all states which are not interpolated.

_predict_with_linear_model(endogenous, exogenous, ...)

Predict the expected value function for interpolated states with a linear model.

ols(y, x)

Calculate the coefficients of a linear model with OLS using a pseudo-inverse.

respy.interpolate.kw_94_interpolation(state_space, period_draws_emax_risk, period, optim_paras, options)

Calculate the approximate solution proposed by [1].

The full solution calculates the expected value function with Monte-Carlo simulation for each state in the state space, using a pre-defined number of draws. Both the number of states and the number of draws have a large impact on runtime.

[1] propose an interpolation method to alleviate the computation burden. The general idea is to calculate the expected value function with Monte-Carlo simulation only for a much smaller number of states and predict the remaining expected value functions with a linear model. The linear model is

\[EVF - MaxeVF = \pi_0 + \sum^{n-1}_{i=0} \pi_{1i} (MaxeVF - eVF_i) + \sum^{n-1}_{j=0} \pi_{2j} \sqrt{MaxeVF - eVF_j}\]

where \(EVF\) are the expected value functions generated by the Monte-Carlo simulation, \(eVF_i\) are the value functions generated with the expected value of the shocks, and \(MaxeVF\) is their maximum over all \(i\).
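As a worked illustration, consider \(n = 2\) alternatives. Writing out both sums gives a model with five coefficients:

\[EVF - MaxeVF = \pi_0 + \pi_{10} (MaxeVF - eVF_0) + \pi_{11} (MaxeVF - eVF_1) + \pi_{20} \sqrt{MaxeVF - eVF_0} + \pi_{21} \sqrt{MaxeVF - eVF_1}\]

In general, the model has \(2n + 1\) regressors including the constant. For the alternative that attains the maximum, both the difference and its square root are zero.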

The expected value of the shocks is zero for non-working alternatives. For working alternatives, the shocks are log-normally distributed, so their expected value is not zero. Instead, it is \(E(X) = \exp\{\mu + \frac{\sigma^2}{2}\}\) with \(\mu = 0\).
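The closed-form mean of a log-normal shock can be checked numerically. The following is a minimal sketch, not code from respy: the standard deviation `sigma = 0.5`, the seed, and the number of draws are hypothetical values chosen for illustration.

```python
import numpy as np

sigma = 0.5  # hypothetical standard deviation of the underlying normal shock
rng = np.random.default_rng(0)

# Simulate log-normal shocks exp(eps) with eps ~ N(0, sigma^2) and average them.
simulated_mean = np.exp(rng.normal(0.0, sigma, size=1_000_000)).mean()

# Closed-form expectation exp(mu + sigma^2 / 2) with mu = 0.
analytical_mean = np.exp(sigma**2 / 2)

print(simulated_mean, analytical_mean)
```

With enough draws the two numbers should agree closely, which is why the expected shock of a working alternative cannot simply be set to zero.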

After experimenting with various functional forms for the interpolating function \(g()\), the authors include simple differences and the square roots of the simple differences in the equation.

The function consists of the following steps.

  1. Create an indicator for whether the expected value function of the state is calculated with Monte-Carlo simulation or interpolation.

  2. Compute the expected value of the shocks.

  3. Compute the right-hand side variables of the linear model.

  4. Compute the left-hand side variable of the linear model by Monte-Carlo simulation on a subset of states.

  5. Fit the linear model with ordinary least squares on the subset without interpolation and predict the expected value functions for all other states.

References

[1] Keane, M. P. and Wolpin, K. I. (1994). The Solution and Estimation of Discrete Choice Dynamic Programming Models by Simulation and Interpolation: Monte Carlo Evidence. The Review of Economics and Statistics, 76(4): 648-672.

respy.interpolate._get_seeds_to_create_not_interpolate_indicator(dense_keys, options)

Get seeds for each dense index to mask the states which are not interpolated.

respy.interpolate._split_interpolation_points_evenly(dense_key_to_n_states, period, options)

Split the number of interpolated states evenly across dense dimensions.

We want to distribute the interpolation points evenly across dense indices in the state space. To do so, we draw dense indices at random until we reach the total number of interpolation points and count how often each index is drawn. The probability of drawing a dense index is its share of the total number of states in the period.

Parameters:
dense_key_to_n_states : dict

Dictionary whose keys are dense indices in the period and whose values are the number of states.

period : int

The current period. Used to print a more informative warning.

options : dict

Model options.

Warning

UserWarning

If the number of interpolation points is below 1% for one dense_index.
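The sampling scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not respy's implementation: the function name `split_points_sketch`, its `seed` argument, and the toy dictionary are hypothetical.

```python
import numpy as np


def split_points_sketch(dense_key_to_n_states, n_interpolation_points, seed=0):
    """Distribute interpolation points across dense keys by sampling.

    Hypothetical helper: each dense key is drawn with probability equal to
    its share of all states in the period, and the draws per key are counted.
    """
    keys = list(dense_key_to_n_states)
    n_states = np.array([dense_key_to_n_states[k] for k in keys], dtype=float)
    shares = n_states / n_states.sum()

    rng = np.random.default_rng(seed)
    drawn = rng.choice(len(keys), size=n_interpolation_points, p=shares)
    counts = np.bincount(drawn, minlength=len(keys))

    return {key: int(count) for key, count in zip(keys, counts)}


points = split_points_sketch({0: 600, 1: 300, 2: 100}, n_interpolation_points=200)
print(points)
```

The counts always sum to the requested total, and dense keys with more states receive more points on average. A key with very few states can end up with a very small count, which is the situation the warning above refers to.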

respy.interpolate._get_not_interpolated_indicator(interpolation_points, n_states, seed)

Get an indicator for states which will not be interpolated.

Parameters:
interpolation_points : int

Number of states used as interpolation points, i.e., states whose expected value function is simulated and not interpolated.

n_states : int

Total number of states in period.

seed : int

Seed to set randomness.

Returns:
not_interpolated : numpy.ndarray

Array of shape (n_states,) indicating states which will not be interpolated.
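A minimal sketch of such an indicator is shown below. It assumes that `interpolation_points` counts the states that are simulated and therefore flagged as not interpolated; the function name `not_interpolated_indicator_sketch` and the use of `numpy.random.default_rng` are choices made for this illustration, not taken from respy.

```python
import numpy as np


def not_interpolated_indicator_sketch(interpolation_points, n_states, seed):
    """Flag `interpolation_points` randomly chosen states as not interpolated."""
    rng = np.random.default_rng(seed)

    # Sample distinct state indices so that no state is selected twice.
    indices = rng.choice(n_states, size=interpolation_points, replace=False)

    not_interpolated = np.zeros(n_states, dtype=bool)
    not_interpolated[indices] = True
    return not_interpolated


indicator = not_interpolated_indicator_sketch(
    interpolation_points=3, n_states=10, seed=42
)
print(indicator)
```

Fixing the seed makes the selection reproducible, so the same states are simulated each time the model is solved with the same options.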

respy.interpolate._compute_expected_shocks(dense_key_to_choice_set_in_period, optim_paras)

Compute an array with the expected value of the shocks.

respy.interpolate._compute_rhs_variables(wages, nonpec, continuation_values, draws, delta)

Compute right-hand side variables of the linear model.

Construct the exogenous variables for all states, including the ones where simulation will take place. All of this information is used either to fit the prediction model or in the prediction step.

Parameters:
wages : numpy.ndarray

Array with shape (n_states_in_period, n_choices).

nonpec : numpy.ndarray

Array with shape (n_states_in_period, n_choices).

continuation_values : numpy.ndarray

Array with shape (n_states_in_period, n_choices).

draws : numpy.ndarray

Array with shape (n_choices,).

delta : float

Discount factor.

Returns:
exogenous : numpy.ndarray

Array with shape (n_states_in_period, n_choices * 2 + 1) where the last column contains the constant.

max_value_functions : numpy.ndarray

Array with shape (n_states_in_period,) containing the maximum over all value functions computed with the expected value of shocks.
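The documented shapes can be reproduced with a short sketch. The functional form `wages * draws + nonpec + delta * continuation_values` for the value functions is an assumption made for this illustration, as are the function name `rhs_variables_sketch` and the toy inputs.

```python
import numpy as np


def rhs_variables_sketch(wages, nonpec, continuation_values, draws, delta):
    """Build regressors from value functions evaluated at the expected shocks."""
    # Assumed functional form of the alternative-specific value functions.
    value_functions = wages * draws + nonpec + delta * continuation_values
    max_value_functions = value_functions.max(axis=1)

    # Differences to the maximum are non-negative; clip guards against rounding.
    differences = np.clip(max_value_functions[:, None] - value_functions, 0, None)

    # Columns: n_choices differences, n_choices square roots, one constant.
    exogenous = np.column_stack(
        (differences, np.sqrt(differences), np.ones(len(differences)))
    )
    return exogenous, max_value_functions


rng = np.random.default_rng(0)
n_states, n_choices = 4, 3
wages = rng.uniform(1.0, 2.0, size=(n_states, n_choices))
nonpec = rng.normal(size=(n_states, n_choices))
continuation_values = rng.normal(size=(n_states, n_choices))
# Hypothetical expected shocks: two working alternatives, one non-working.
expected_shocks = np.array([np.exp(0.125), np.exp(0.125), 0.0])

exogenous, max_value_functions = rhs_variables_sketch(
    wages, nonpec, continuation_values, expected_shocks, delta=0.95
)
print(exogenous.shape, max_value_functions.shape)
```

Each row contains at least one zero difference, namely for the alternative that attains the maximum.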

respy.interpolate._compute_lhs_variable(wages, nonpec, continuation_values, max_value_functions, not_interpolated, draws, delta)

Calculate left-hand side variable for all states which are not interpolated.

The function computes the full solution for a subset of states. The dependent variable is then the expected value function minus the maximum of the value functions computed with the expected shocks.

Parameters:
wages : numpy.ndarray

Array with shape (n_states_in_period, n_choices).

nonpec : numpy.ndarray

Array with shape (n_states_in_period, n_choices).

continuation_values : numpy.ndarray

Array with shape (n_states_in_period, n_choices).

max_value_functions : numpy.ndarray

Array with shape (n_states_in_period,) containing the maximum over all value functions computed with the expected value of shocks.

not_interpolated : numpy.ndarray

Array with shape (n_states_in_period,) containing indicators for simulated continuation_values.

draws : numpy.ndarray

Array with shape (n_draws, n_choices) containing draws.

delta : float

Discount factor.
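The Monte-Carlo step can be sketched with NumPy broadcasting. As before, the functional form of the value functions, the function name `lhs_variable_sketch`, and all toy inputs are assumptions for illustration and not respy's implementation.

```python
import numpy as np


def lhs_variable_sketch(
    wages, nonpec, continuation_values, max_value_functions,
    not_interpolated, draws, delta,
):
    """Simulate EVF - MaxeVF for the states flagged in `not_interpolated`."""
    # Keep simulated states only and add a draws axis: (n_sim, 1, n_choices).
    w = wages[not_interpolated][:, None, :]
    n = nonpec[not_interpolated][:, None, :]
    c = continuation_values[not_interpolated][:, None, :]

    # Assumed functional form; broadcasting gives (n_sim, n_draws, n_choices).
    value_functions = w * draws[None, :, :] + n + delta * c

    # Monte-Carlo estimate of the expected maximum over alternatives.
    expected_value_functions = value_functions.max(axis=2).mean(axis=1)
    return expected_value_functions - max_value_functions[not_interpolated]


rng = np.random.default_rng(1)
n_states, n_choices, n_draws = 5, 3, 200
wages = rng.uniform(1.0, 2.0, size=(n_states, n_choices))
nonpec = rng.normal(size=(n_states, n_choices))
continuation_values = rng.normal(size=(n_states, n_choices))
not_interpolated = np.array([True, False, True, True, False])
delta = 0.95

# Maximum over value functions at hypothetical expected shocks.
expected_shocks = np.array([1.1, 1.1, 0.0])
max_value_functions = (
    wages * expected_shocks + nonpec + delta * continuation_values
).max(axis=1)

draws = rng.normal(loc=expected_shocks, scale=0.3, size=(n_draws, n_choices))
endogenous = lhs_variable_sketch(
    wages, nonpec, continuation_values, max_value_functions,
    not_interpolated, draws, delta,
)
print(endogenous.shape)
```

The result has one entry per simulated state. If every draw equals the expected shocks, the simulated expected maximum coincides with the maximum at the expected shocks and the dependent variable is zero.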

respy.interpolate._predict_with_linear_model(endogenous, exogenous, max_value_functions, not_interpolated)

Predict the expected value function for interpolated states with a linear model.

The linear model is fitted with ordinary least squares. Then, the function predicts the expected value function for all interpolated states and uses the computed expected value functions for the remaining states.

Parameters:
endogenous : numpy.ndarray

Array with shape (num_simulated_states_in_period,) containing the expected value functions minus the maximum value function for the states used to interpolate the rest.

exogenous : numpy.ndarray

Array with shape (n_states_in_period, n_choices * 2 + 1) containing exogenous variables.

max_value_functions : numpy.ndarray

Array with shape (n_states_in_period,) containing the maximum over all value functions computed with the expected value of shocks.

not_interpolated : numpy.ndarray

Array with shape (n_states_in_period,) containing an indicator for states which are not interpolated and used to estimate the coefficients for the interpolation.
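The fit-and-predict step might look like the sketch below. It uses `numpy.linalg.lstsq` for the least-squares fit, which is a substitution made for this illustration; the function name `predict_sketch` and the toy data are hypothetical as well.

```python
import numpy as np


def predict_sketch(endogenous, exogenous, max_value_functions, not_interpolated):
    """Return expected value functions for all states in the period."""
    # Fit the model on the simulated states only (least squares via lstsq).
    beta, *_ = np.linalg.lstsq(exogenous[not_interpolated], endogenous, rcond=None)

    # Predict EVF - MaxeVF for every state, then add MaxeVF back.
    expected_value_functions = exogenous @ beta + max_value_functions

    # Simulated states keep their simulated values.
    expected_value_functions[not_interpolated] = (
        endogenous + max_value_functions[not_interpolated]
    )
    return expected_value_functions


# Toy data generated from an exactly linear model, so predictions are exact.
rng = np.random.default_rng(2)
n_states, n_columns = 12, 5
exogenous = np.column_stack(
    (rng.uniform(size=(n_states, n_columns - 1)), np.ones(n_states))
)
true_beta = np.array([0.5, -0.2, 0.1, 0.3, 1.0])
max_value_functions = rng.normal(size=n_states)
not_interpolated = np.arange(n_states) < 8

endogenous = (exogenous @ true_beta)[not_interpolated]
expected_value_functions = predict_sketch(
    endogenous, exogenous, max_value_functions, not_interpolated
)
print(expected_value_functions.shape)
```

Because the toy dependent variable is exactly linear in the regressors, the interpolated states are recovered without error here. With simulated data the predictions are only approximations.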

respy.interpolate.ols(y, x)

Calculate the coefficients of a linear model with OLS using a pseudo-inverse.

Parameters:
x : numpy.ndarray

Array with shape (n_observations, n_independent_variables) containing the independent variables.

y : numpy.ndarray

Array with shape (n_observations,) containing the dependent variable.

Returns:
beta : numpy.ndarray

Array with shape (n_independent_variables,) containing the coefficients of the linear model.
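One way to compute such coefficients is to apply the Moore-Penrose pseudo-inverse of `x` to `y`. The sketch below illustrates the idea and is not necessarily how respy implements it; the name `ols_sketch` and the toy data are hypothetical.

```python
import numpy as np


def ols_sketch(y, x):
    """Least-squares coefficients computed with the Moore-Penrose pseudo-inverse."""
    return np.linalg.pinv(x) @ y


# Toy data lying exactly on the line y = 1 + 2 * t.
x = np.column_stack((np.ones(5), np.arange(5.0)))
y = 1.0 + 2.0 * np.arange(5.0)

beta = ols_sketch(y, x)
print(beta)
```

A pseudo-inverse still returns a solution (the minimum-norm one) when columns of `x` are collinear, a case in which inverting `x.T @ x` directly would fail.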