respy.likelihood
Everything related to the estimation with maximum likelihood.
get_crit_func(params, options, df, return_scalar=True, return_comparison_plot_data=False)
Get the criterion function.

log_like(params, df, base_draws_est, solve, type_covariates, options, return_scalar, return_comparison_plot_data)
Criterion function for the likelihood maximization.

_internal_log_like_obs(state_space, df, base_draws_est, type_covariates, optim_paras, options)
Calculate the likelihood contribution of each individual in the sample.

_compute_wage_and_choice_log_likelihood_contributions(df, base_draws_est, wages, nonpecs, continuation_values, choice_set, optim_paras, options)
Compute wage and choice log likelihood contributions.

_compute_log_type_probabilities(df, optim_paras, options)
Compute the log type probabilities.

_compute_x_beta_for_type_probabilities(df, optim_paras, options)
Compute the vector dot product of type covariates and type coefficients.

_logsumexp(x)
Compute logsumexp of x.

_simulate_log_probability_of_individuals_observed_choice(wages, nonpec, continuation_values, draws, delta, choice, tau, smoothed_log_probability)
Simulate the probability of observing the agent’s choice.

_process_estimation_data(df, state_space, optim_paras, options)
Process estimation data.

_update_optim_paras_with_initial_experience_levels(optim_paras, df)
Adjust the initial experience levels in optim_paras from the data.

_create_comparison_plot_data(df, log_type_probabilities, optim_paras)
Create DataFrame for estimagic’s comparison plot.

_map_choice_codes_to_indices_of_valid_choice_set(choices, choice_set)
Map choice codes to the indices of the valid choice set.
Return a version of the likelihood functions in respy where all arguments except the parameter vector are fixed with functools.partial(). Thus the function can be directly passed into an optimizer or a function for taking numerical derivatives.
pandas.DataFrame
DataFrame containing model parameters.
dict
Dictionary containing model options.
The model is fit to this dataset.
False
Indicator for whether the mean log likelihood should be returned or the log likelihood contributions.
Indicator for whether a pandas.DataFrame with various contributions for the visualization with estimagic should be returned.
log_like()
Criterion function where all arguments except the parameter vector are set.
AssertionError
If the data does not have the expected format.
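The idea of fixing every argument except the parameter vector can be illustrated with a small, self-contained sketch. The function `toy_log_like` and its data are hypothetical stand-ins, not respy's `log_like`; only `functools.partial` itself is taken from the description above.

```python
import functools

import numpy as np


def toy_log_like(params, data, return_scalar):
    """Hypothetical criterion: normal log likelihood with unit variance."""
    contributions = -0.5 * np.log(2 * np.pi) - 0.5 * (data - params[0]) ** 2
    return contributions.mean() if return_scalar else contributions


data = np.array([0.5, 1.0, 1.5])

# Fix every argument except the parameter vector.
criterion = functools.partial(toy_log_like, data=data, return_scalar=True)

# The resulting callable takes only the parameters, so it could be handed to an
# optimizer or to a routine for numerical derivatives.
value = criterion(np.array([1.0]))
```

Because the data and flags are bound once, repeated evaluations during an optimization only need to supply a new parameter vector.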
This function calculates the likelihood contributions of the sample.
pandas.Series
Parameter Series
The DataFrame contains choices, log wages, and the indices of the states for the different types.
numpy.ndarray
Set of draws to calculate the probability of observed wages.
solve()
Function which solves the model with new parameters.
Contains model options.
The function calculates the likelihood contributions for all observations in the data, that is, for all individual-period-type combinations.
Then, likelihoods are accumulated within each individual and type over all periods. Finally, the result is multiplied by the type-specific shares, which yields the contribution to the likelihood for each individual.
StateSpace
Class of state space.
Array with shape (n_periods, n_draws, n_choices) containing i.i.d. draws from standard normal distributions.
None
If the model includes types, this is a pandas.DataFrame containing the covariates to compute the type probabilities.
Dictionary with quantities that were extracted from the parameter vector.
Options of the model.
Array with shape (n_individuals,) containing contributions of individuals in the empirical data.
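The accumulation described for this function (sum over periods within each individual and type, then weight by type shares) might be sketched in logs as follows. All arrays are made-up illustrations, and the use of `np.add.reduceat` is an assumption of this sketch rather than a statement about the actual implementation.

```python
import numpy as np

# Hypothetical per-observation log likelihood contributions with shape
# (n_observations, n_types). Rows 0-1 belong to the first individual and
# rows 2-4 to the second.
log_like_obs = np.log(
    np.array([[0.5, 0.4], [0.5, 0.5], [0.2, 0.1], [0.5, 0.5], [1.0, 0.4]])
)
idx_first_obs = np.array([0, 2])  # first observation of each individual

# Hypothetical log type probabilities with shape (n_individuals, n_types).
log_type_probabilities = np.log(np.array([[0.5, 0.5], [0.25, 0.75]]))

# Sum the log contributions over periods within each individual and type.
per_individual_type = np.add.reduceat(log_like_obs, idx_first_obs, axis=0)

# Weight by the type shares and sum over types, staying in logs (logsumexp).
weighted = per_individual_type + log_type_probabilities
maximum = weighted.max(axis=1, keepdims=True)
log_like_individual = (
    maximum + np.log(np.exp(weighted - maximum).sum(axis=1, keepdims=True))
).ravel()
```

Working in logs means the multiplication by type shares becomes an addition, and the sum over types is the only step that needs a logsumexp.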
Contains log wages, choices and
For each individual, compute as many vector dot products as there are types. The scalars are later passed to a softmax function to compute the type probabilities, that is, the probability of each individual being each type.
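A minimal sketch of this two-step computation, with made-up covariates and coefficients, could look as follows. Treating the first type as a reference type with zero coefficients is an assumption of the example, as are all variable names.

```python
import numpy as np

# Hypothetical covariates with shape (n_individuals, n_type_covariates):
# a constant and one regressor.
type_covariates = np.array([[1.0, 0.0], [1.0, 2.0]])

# Hypothetical coefficients with shape (n_types, n_type_covariates). The first
# type is a reference type with zero coefficients (assumption of this sketch).
type_coefficients = np.array([[0.0, 0.0], [0.0, 1.0]])

# One dot product per individual and type, shape (n_individuals, n_types).
x_beta = type_covariates @ type_coefficients.T

# A log softmax across types turns the scalars into log type probabilities.
maximum = x_beta.max(axis=1, keepdims=True)
log_norm = maximum + np.log(np.exp(x_beta - maximum).sum(axis=1, keepdims=True))
log_type_probabilities = x_beta - log_norm
```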
The function does the same as the following code, but faster.
log_sum_exp = np.max(x) + np.log(np.sum(np.exp(x - np.max(x))))
The subtraction of the maximum prevents overflows and mitigates the impact of underflows.
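The effect of subtracting the maximum can be seen in a short example. The helper below is a plain NumPy transcription of the formula above, not the library's optimized function.

```python
import numpy as np


def logsumexp(x):
    """Plain NumPy version of the formula above."""
    maximum = np.max(x)
    return maximum + np.log(np.sum(np.exp(x - maximum)))


x = np.array([1000.0, 1000.0])

# The naive formula overflows because exp(1000) is not representable as float64.
with np.errstate(over="ignore"):
    naive = np.log(np.sum(np.exp(x)))

# The shifted version only evaluates exp(0) and returns 1000 + log(2).
stable = logsumexp(x)
```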
The probability is simulated by iterating over a distribution of unobservables. First, the utility of each choice is computed. Then, the probability of observing the choice of the agent given the maximum utility from all choices is computed.
The naive implementation calculates the log probability for choice i with the softmax function.
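Written out, the log softmax for choice i among J choices reads as follows. The notation (x_j for the value of choice j, J for the number of choices) is assumed here for illustration.

```latex
\log\bigl(\operatorname{softmax}(x)_i\bigr)
    = \log\left( \frac{e^{x_i}}{\sum_{j=1}^{J} e^{x_j}} \right)
    = x_i - \log \sum_{j=1}^{J} e^{x_j}
```

The second form shows why a logsumexp appears: the log of the denominator is exactly the logsumexp of x.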
The following function is numerically more robust. The derivation with the two consecutive logsumexp functions is included in #278.
Array with shape (n_choices,).
Array with shape (n_choices,)
Array with shape (n_draws, n_choices)
float
Discount rate.
int
Choice of the agent.
Smoothing parameter for choice probabilities.
Simulated smoothed log probability of the choice.
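The simulation with two consecutive logsumexp steps might be sketched as below. This is an illustration under stated assumptions, not the actual implementation: in particular, the form of the choice value (wage times draw, plus non-pecuniary reward, plus discounted continuation value) and the function names are assumptions of this sketch.

```python
import numpy as np


def logsumexp(x):
    maximum = np.max(x)
    return maximum + np.log(np.sum(np.exp(x - maximum)))


def simulate_log_probability(
    wages, nonpec, continuation_values, draws, delta, choice, tau
):
    """Sketch of a smoothed, simulated log probability of the observed choice.

    Assumption: value = wage * draw + nonpec + delta * continuation value.
    """
    n_draws = draws.shape[0]
    log_probabilities = np.empty(n_draws)
    for i in range(n_draws):
        values = wages * draws[i] + nonpec + delta * continuation_values
        smoothed = values / tau
        # First logsumexp: log softmax of the observed choice for this draw.
        log_probabilities[i] = smoothed[choice] - logsumexp(smoothed)
    # Second logsumexp: log of the average probability over all draws.
    return logsumexp(log_probabilities) - np.log(n_draws)


# Two identical choices are equally likely, so the result is log(0.5).
result = simulate_log_probability(
    wages=np.ones(2),
    nonpec=np.zeros(2),
    continuation_values=np.zeros(2),
    draws=np.ones((3, 2)),
    delta=0.95,
    choice=0,
    tau=1.0,
)
```

Averaging in logs avoids exponentiating very small per-draw probabilities before they are combined.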
Produce all objects needed by _internal_log_like_obs() that depend on the data.
Some objects have to be repeated for each type. This is a convenient format for the estimation, where every observation is weighted by type probabilities.
The DataFrame which contains the data used for estimation. The DataFrame contains individual identifiers, periods, experiences, lagged choices, choices in current period, the wage and other observed data.
Indexer for the core state space.
Array with shape (n_observations, n_types) where information is only repeated over the second axis.
Array with shape (n_individuals,) containing indices for the first observations of each individual.
Array with shape (n_observations, n_types) containing indices for states which correspond to observations.
Array with shape (n_observations, n_types) containing clipped log wages.
Array with shape (n_individuals, n_type_covariates) containing covariates to predict probabilities for each type.
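Two of the arrays listed above can be illustrated with a small sketch: the indices of each individual's first observation and log wages repeated over a type axis. The column names and data are hypothetical, and the clipping of log wages is omitted.

```python
import numpy as np
import pandas as pd

# Hypothetical estimation data, sorted by individual and period.
df = pd.DataFrame(
    {
        "Identifier": [0, 0, 1, 1, 1],
        "Period": [0, 1, 0, 1, 2],
        "Wage": [10.0, 12.0, 8.0, 9.0, 11.0],
    }
)
n_types = 2

# Indices of the first observation of each individual, shape (n_individuals,).
is_first_obs = ~df["Identifier"].duplicated()
idx_indiv_first_obs = np.flatnonzero(is_first_obs.to_numpy())

# Log wages with shape (n_observations, n_types); the information is only
# repeated over the second axis.
log_wages = np.log(df["Wage"].to_numpy())
log_wages_by_type = np.repeat(log_wages[:, np.newaxis], n_types, axis=1)
```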
Choice codes number all choices from 0 to n_choices - 1. In some dense indices, not all choices are available and, thus, arrays like wages have only as many columns as there are available choices. Therefore, we need to number the available choices from 0 to n_available_choices - 1 and replace the old choice codes with the new ones.
Examples
>>> wages = np.arange(4).reshape(2, 2)
>>> choices = np.array([0, 2])
>>> choice_set = (True, False, True)

>>> np.choose(choices, wages)
Traceback (most recent call last):
...
ValueError: invalid entry in choice array

>>> new_choices = _map_choice_codes_to_indices_of_valid_choice_set(
...     choices, choice_set
... )
>>> np.choose(new_choices, wages)
array([0, 3])
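One way such a mapping could be built is with a cumulative sum over the choice set. This is only a sketch of the renumbering idea; the function name is hypothetical and the actual implementation may differ.

```python
import numpy as np


def map_choice_codes(choices, choice_set):
    """Sketch: renumber choice codes so that only available choices count."""
    # Position of each available choice among the available ones. Entries for
    # unavailable choices carry no meaning and should never be looked up.
    mapping = np.cumsum(choice_set) - 1
    return mapping[choices]


new_choices = map_choice_codes(np.array([0, 2]), (True, False, True))
```

With the choice set from the example, code 0 stays 0 and code 2 becomes 1, because the unavailable choice in between is skipped.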