respy.conditional_draws

Everything related to conditional draws for the maximum likelihood estimation.

Module Contents

Functions

create_draws_and_log_prob_wages(log_wage_observed, ...)

Evaluate likelihood of observed wages and create conditional draws.

update_mean_and_evaluate_likelihood(log_wage_observed, ...)

Update mean and evaluate likelihood.

update_cholcov_with_measurement_error(shocks_cholesky, ...)

Perform a Kalman covariance update for all possible cases.

update_cholcov(shocks_cholesky, n_wages)

Calculate Cholesky factors of the conditional covariance matrices for all possible cases.

calculate_conditional_draws(base_draws, updated_mean, ...)

Calculate the conditional draws from base draws, updated means and updated Cholesky factors.

make_cholesky_unique(chol)

Make a lower triangular Cholesky factor unique.

respy.conditional_draws.create_draws_and_log_prob_wages(log_wage_observed, wages_systematic, base_draws, choices, shocks_cholesky, n_wages, meas_sds, has_meas_error)

Evaluate likelihood of observed wages and create conditional draws.

Let n_obs be the number of period-individual combinations, i.e. the number of rows of the empirical dataset.

Parameters:
log_wage_observed : numpy.ndarray

Array with shape (n_obs * n_types,) containing observed log wages.

wages_systematic : numpy.ndarray

Array with shape (n_obs * n_types, n_choices) containing systematic wages. Can contain numpy.nan or any number for non-wage choices. The columns for non-wage choices only need to exist so that indexing does not raise errors.

base_draws : numpy.ndarray

Array with shape (n_draws, n_choices) with standard normal random variables.

choices : numpy.ndarray

Array with shape (n_obs * n_types,) containing observed choices. It is used to select columns of the systematic wages, so choices have to be coded starting at zero.

shocks_cholesky : numpy.ndarray

Array with shape (n_choices, n_choices) with the lower triangular Cholesky factor of the covariance matrix of the shocks.

n_wages : int

Number of wage sectors.

meas_sds : numpy.ndarray

Array with shape (n_choices,) containing standard deviations of the measurement errors of observed reward components. It is 0 for choices where no reward component is observed.

has_meas_error : bool

Indicates whether the model has measurement error.

Returns:
draws : numpy.ndarray

Array with shape (n_obs * n_types, n_draws, n_choices) containing shocks drawn from a multivariate normal distribution conditional on the observed wages.

log_prob_wages : numpy.ndarray

Array with shape (n_obs * n_types,) containing the unconditional log likelihood of the observed wages, correcting for measurement error if necessary.

respy.conditional_draws.update_mean_and_evaluate_likelihood(log_wage_observed, log_wage_systematic, cov, choice, meas_sds, updated_mean, loglike)

Update mean and evaluate likelihood.

Calculate the conditional mean of shocks after observing one shock and evaluate the likelihood of the observed shock.

The mean is updated using the “Sequences of Conditional Distributions” approach explained in [1]. Consider a sequence of correlated normal random variables in which the mean of each variable is adjusted conditional on the preceding ones:

\[\begin{split}X_1 &\sim \mathcal{N}(0, \sigma_{11}) \\ X_2 &\sim \mathcal{N}( \sigma_{12} \frac{X_1}{\sigma_{11}}, \sigma_{22} - \frac{\sigma^2_{12}}{\sigma_{11}} ) \\ \dots\end{split}\]
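As a worked illustration of the formula above, the following minimal sketch computes the conditional mean and variance of \(X_2\) after observing \(X_1\). The function name `condition_on_first` and the example covariance matrix are hypothetical and not part of respy.

```python
import numpy as np


def condition_on_first(cov, x1):
    """Mean and variance of X_2 given X_1 = x1 for zero-mean bivariate normals.

    Hypothetical helper for illustration only; it is not a respy function.
    """
    sigma_11, sigma_12, sigma_22 = cov[0, 0], cov[0, 1], cov[1, 1]
    cond_mean = sigma_12 * x1 / sigma_11
    cond_var = sigma_22 - sigma_12**2 / sigma_11
    return cond_mean, cond_var


# Example covariance matrix (made up for this illustration).
cov = np.array([[4.0, 1.0], [1.0, 2.0]])
cond_mean, cond_var = condition_on_first(cov, x1=2.0)
```

The conditional variance does not depend on the observed value, which is why the covariance updates can be computed once for every possible observed shock.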

For the probability of the observed wage, recognize that wages are log-normally distributed. Thus, the following formula applies [2] [3]:

\[f_W(w_{it}) = \frac{1}{w_{it}} \cdot \frac{1}{\sigma \sqrt{2 \pi}} \exp \left( - \frac{(\ln(w_{it}) - \ln(w(s^-_t, a_t)))^2}{2 \sigma^2} \right)\]

where \(i\) is the individual, \(t\) is the period, \(f_W\) is the probability density function of the wage, \(w_{it}\) is the observed wage, \(\sigma\) is the standard deviation of the wage shock, \(s^-_t\) is the state without the shocks, \(a_t\) is the choice and \(w(s^-_t, a_t)\) is the non-stochastic wage implied by the model for choice \(a_t\).
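The density above is usually evaluated in logs. The sketch below is one way to write it, assuming no measurement error; `log_wage_density` and its arguments are hypothetical names chosen for this example, not the respy implementation.

```python
import numpy as np


def log_wage_density(wage_observed, wage_systematic, sd):
    """Log of the log-normal wage density f_W (illustrative sketch only)."""
    log_dev = np.log(wage_observed) - np.log(wage_systematic)
    return (
        -np.log(wage_observed)  # Jacobian term 1 / w_it
        - np.log(sd)
        - 0.5 * np.log(2 * np.pi)
        - log_dev**2 / (2 * sd**2)
    )


# Evaluate at made-up values and compare with the density written out directly.
w, w_sys, sd = 12.0, 10.0, 0.5
direct = (
    1 / w / (sd * np.sqrt(2 * np.pi))
    * np.exp(-((np.log(w) - np.log(w_sys)) ** 2) / (2 * sd**2))
)
log_density = log_wage_density(w, w_sys, sd)
```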

Parameters:
log_wage_observed : float

Log of the observed wage of the individual. Can be np.nan if no wage was observed for a working alternative or the individual chose a non-working alternative.

log_wage_systematic : float

Log of the wage implied by the model for the choice. This term is computed by the wage equation without the choice-specific shock.

cov : numpy.ndarray

Unconditional covariance matrix of the shocks.

choice : int

The observed choice.

meas_sds : numpy.ndarray

Array with shape (n_choices,) containing standard deviations of the measurement errors.

Returns:
updated_mean : numpy.ndarray

Conditional mean of shocks, given the observed shock. Contains the observed shock in the corresponding position even in the degenerate case of no measurement error. Has length n_choices.

loglike : float

Log likelihood of the observed shock. 0 if no shock was observed.

References

[1]

Gentle, J. E. (2009). Computational statistics (Vol. 308). New York: Springer.

[2]

Johnson, Norman L.; Kotz, Samuel; Balakrishnan, N. (1994), “14: Lognormal Distributions”, Continuous univariate distributions. Vol. 1, Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics (2nd ed.)

[3]

Keane, M. P., Wolpin, K. I., & Todd, P. (2011). Handbook of Labor Economics, volume 4, chapter The Structural Estimation of Behavioral Models: Discrete Choice Dynamic Programming Methods and Applications.

respy.conditional_draws.update_cholcov_with_measurement_error(shocks_cholesky, meas_sds, n_wages)

Perform a Kalman covariance update for all possible cases.

We use a square-root implementation of the Kalman filter to avoid taking any Cholesky decompositions which could fail due to numerical error.
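A minimal sketch of a square-root covariance update for a single observed shock is given below. It follows the textbook array form of the Kalman update, in which a QR decomposition of a pre-array yields the updated factor directly. The function name `sqrt_kalman_cov_update` and the example values are hypothetical, and the actual respy code may be organized differently.

```python
import numpy as np


def sqrt_kalman_cov_update(chol, meas_sd, observed_index):
    """Update a lower triangular Cholesky factor after observing one shock.

    Illustrative sketch only; not the respy implementation.
    """
    n = chol.shape[0]
    h = np.zeros(n)
    h[observed_index] = 1.0  # the measurement picks out a single shock

    # Pre-array: first column relates to the measurement, the rest to the state.
    pre = np.zeros((n + 1, n + 1))
    pre[0, 0] = meas_sd
    pre[1:, 0] = chol.T @ h
    pre[1:, 1:] = chol.T

    # Only the triangular factor is needed; its lower-right block is the
    # transpose of the updated Cholesky factor.
    r = np.linalg.qr(pre, mode="r")
    updated = r[1:, 1:].T

    # QR does not fix the signs of the diagonal, so normalize them.
    signs = np.sign(np.diag(updated))
    signs[signs == 0] = 1.0
    return updated * signs


# Example covariance matrix (made up for this illustration).
cov = np.array([[2.0, 0.5, 0.3], [0.5, 1.0, 0.2], [0.3, 0.2, 1.5]])
chol = np.linalg.cholesky(cov)
updated = sqrt_kalman_cov_update(chol, meas_sd=0.4, observed_index=1)
```

The product `updated @ updated.T` should agree with the usual covariance update `cov - cov[:, k] cov[k, :] / (cov[k, k] + meas_sd ** 2)`, without that matrix ever being formed or factorized.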

Parameters:
shocks_cholesky : numpy.ndarray

Cholesky factor of the covariance matrix before updating. Has dimension (n_choices, n_choices).

meas_sds : numpy.ndarray

The standard deviations of the measurement errors. Has length n_wages.

n_wages : int

Number of wage sectors.

Returns:
updated_chols : numpy.ndarray

Array with shape (n_wages + 1, n_choices, n_choices) containing the Cholesky factors of the updated covariance matrices for each possible observed shock. The last element corresponds to not observing any shock.

References

[1]

Robert Grover Brown. Introduction to Random Signals and Applied Kalman Filtering. Wiley and sons, 2012.

respy.conditional_draws.update_cholcov(shocks_cholesky, n_wages)

Calculate Cholesky factors of the conditional covariance matrices for all possible cases.

Parameters:
shocks_cholesky : numpy.ndarray

Cholesky factor of the covariance matrix before updating. Has dimension (n_choices, n_choices).

n_wages : int

Number of wage sectors.

Returns:
updated_chols : numpy.ndarray

Array with shape (n_wages + 1, n_choices, n_choices) containing the Cholesky factors of the updated covariance matrices for each possible observed shock. The last element corresponds to not observing any shock.

respy.conditional_draws.calculate_conditional_draws(base_draws, updated_mean, updated_chols, chol_index, max_log_float, conditional_draw)

Calculate the conditional draws from base draws, updated means and updated Cholesky factors.

We need to pass max_log_float to the function because the global variable MAX_LOG_FLOAT cannot be used directly within a guvectorize function.

Parameters:
base_draws : np.ndarray

IID standard normal draws.

updated_mean : np.ndarray

Conditional mean, given the observed shock. Contains the observed shock in the corresponding position.

updated_chols : np.ndarray

Cholesky factor of the conditional covariance, given the observed shock. If there is no measurement error, it contains a zero column and row at the position of the observed shock.

chol_index : float

Index of the relevant updated Cholesky factor.

max_log_float : float

Value at which numbers soon to be exponentiated are clipped.

Returns:
conditional_draw : np.ndarray

Draws from the conditional distribution of the shocks.
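The core of this computation is an affine transformation of the base draws. The sketch below shows only that step, under the assumption that the relevant Cholesky factor has already been selected; it omits the clipping controlled by max_log_float and any later exponentiation. The name `conditional_draws_sketch` and all example values are hypothetical.

```python
import numpy as np


def conditional_draws_sketch(base_draws, updated_mean, updated_chol):
    """Shift and scale standard normal draws (illustrative sketch only)."""
    # Each row of base_draws is one draw z; the result is mean + chol @ z.
    return updated_mean + base_draws @ updated_chol.T


rng = np.random.default_rng(0)
base_draws = rng.standard_normal((5, 3))

# The first shock was observed without measurement error: its value sits in
# the mean, and the Cholesky factor has a zero row and column at position 0.
updated_mean = np.array([0.3, 0.0, 0.0])
updated_chol = np.array(
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.5, 2.0]]
)
draws = conditional_draws_sketch(base_draws, updated_mean, updated_chol)
```

Because of the zero row, every draw reproduces the observed shock in the first position, while the remaining shocks vary around their conditional means.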

respy.conditional_draws.make_cholesky_unique(chol)

Make a lower triangular Cholesky factor unique.

Cholesky factors are only unique with the additional requirement that all diagonal elements are positive. np.linalg.cholesky enforces this automatically. Since we calculate Cholesky factors by QR decompositions, we have to enforce it manually.

This is admissible because:

chol sign_switcher sign_switcher.T chol.T = chol chol.T
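The identity holds because sign_switcher is a diagonal matrix with entries of plus or minus one, so sign_switcher sign_switcher.T is the identity matrix. A minimal sketch of the idea follows; `make_cholesky_unique_sketch` is a hypothetical name, and the treatment of zero diagonal entries is an assumption made for this example.

```python
import numpy as np


def make_cholesky_unique_sketch(chol):
    """Flip column signs so that the diagonal is non-negative (sketch only)."""
    signs = np.sign(np.diag(chol))
    signs[signs == 0] = 1.0  # assumption: leave zero diagonal entries alone
    # Broadcasting scales column j by signs[j], i.e. chol @ diag(signs).
    return chol * signs


# Example factor with a negative diagonal entry (made up for illustration).
chol = np.array([[-2.0, 0.0], [1.0, 3.0]])
unique = make_cholesky_unique_sketch(chol)
```

Both factors reproduce the same covariance matrix; only the sign convention of the columns differs.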