:py:mod:`respy.pre_processing.model_processing`
===============================================

.. py:module:: respy.pre_processing.model_processing

.. autoapi-nested-parse::

   Process model specification files or objects.


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   respy.pre_processing.model_processing.process_params_and_options
   respy.pre_processing.model_processing._read_options
   respy.pre_processing.model_processing._create_internal_seeds_from_user_seeds
   respy.pre_processing.model_processing._read_params
   respy.pre_processing.model_processing._parse_parameters
   respy.pre_processing.model_processing._parse_present_bias_parameter
   respy.pre_processing.model_processing._parse_exogenous_processes
   respy.pre_processing.model_processing._parse_observables
   respy.pre_processing.model_processing._parse_choices
   respy.pre_processing.model_processing._parse_choice_parameters
   respy.pre_processing.model_processing._parse_initial_and_max_experience
   respy.pre_processing.model_processing._parse_shocks
   respy.pre_processing.model_processing._parse_measurement_errors
   respy.pre_processing.model_processing._parse_types
   respy.pre_processing.model_processing._infer_number_of_types
   respy.pre_processing.model_processing._infer_choices_with_experience
   respy.pre_processing.model_processing._infer_choices_with_prefix
   respy.pre_processing.model_processing._parse_lagged_choices
   respy.pre_processing.model_processing._parse_probabilities_or_logit_coefficients
   respy.pre_processing.model_processing._parse_observable_or_exog_process_names
   respy.pre_processing.model_processing._sync_optim_paras_and_options
   respy.pre_processing.model_processing._add_type_covariates
   respy.pre_processing.model_processing._add_default_is_inadmissible
   respy.pre_processing.model_processing._convert_labels_in_formulas_to_codes
   respy.pre_processing.model_processing._replace_in_single_or_double_quotes
   respy.pre_processing.model_processing._replace_choices_and_observables_in_formula
   respy.pre_processing.model_processing._convert_labels_in_filters_to_codes
   respy.pre_processing.model_processing._parse_cache_directory


.. py:function:: process_params_and_options(params, options)

   Process `params` and `options`.

   This function is the interface for parsing the model specification given by the
   user.
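   A minimal usage sketch (the ``kw_94_one`` example model and the
   ``get_example_model`` helper are assumptions used only for illustration):

   .. code-block:: python

      import respy as rp
      from respy.pre_processing.model_processing import process_params_and_options

      # Load an example parameter vector and options dictionary.
      params, options = rp.get_example_model("kw_94_one", with_data=False)

      # The function returns the internal model representation: a dictionary of
      # parsed parameters (optim_paras) and the post-processed options.
      optim_paras, options = process_params_and_options(params, options)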

.. py:function:: _read_options(dict_or_path)

   Read the options which can either be a dictionary or a path.


.. py:function:: _create_internal_seeds_from_user_seeds(options)

   Create internal seeds from user input.

   Instead of reusing the same seed, we use sequences of seeds incrementing by one.
   This ensures that we do not accidentally draw the same randomness twice.

   As naive sequences started at the seeds given by the user might overlap, the user
   seeds are used to generate seeds within certain ranges. The seed for the

   - solution is between 1,000,000 and 2,000,000.
   - simulation is between 4,000,000 and 5,000,000.
   - likelihood estimation is between 7,000,000 and 8,000,000.

   Furthermore, we need two sequences of seeds. The first sequence is for building
   :func:`~respy.simulate.simulate` or :func:`~respy.likelihood.log_like` where
   `"startup"` seeds are used to generate the draws. The second sequence starts at
   ``seed_start + SEED_STARTUP_ITERATION_GAP`` and has to be reset to the initial
   value at the beginning of every iteration.

   See :ref:`randomness-and-reproducibility` for more information.

   .. rubric:: Examples

   >>> options = {"solution_seed": 1, "simulation_seed": 2, "estimation_seed": 3}
   >>> options = _create_internal_seeds_from_user_seeds(options)
   >>> options["solution_seed_startup"], options["solution_seed_iteration"]
   (count(1128037), count(2128037))
   >>> options["simulation_seed_startup"], options["simulation_seed_iteration"]
   (count(4875688), count(5875688))
   >>> options["estimation_seed_startup"], options["estimation_seed_iteration"]
   (count(7071530), count(8071530))


.. py:function:: _read_params(df_or_series)

   Read the parameters which can either be a path, a Series, or a DataFrame.


.. py:function:: _parse_parameters(params, options)

   Parse the parameter vector into a dictionary of model quantities.


.. py:function:: _parse_present_bias_parameter(optim_paras, params)

   Parse the present-bias parameter which is 1 by default.

   .. rubric:: Examples

   An example where the present-bias parameter is specified:

   >>> tuples = [("beta", "beta")]
   >>> index = pd.MultiIndex.from_tuples(tuples, names=["category", "name"])
   >>> params = pd.Series(data=0.4, index=index)
   >>> optim_paras = {"delta": 0.95}
   >>> _parse_present_bias_parameter(optim_paras, params)
   {'delta': 0.95, 'beta': 0.4, 'beta_delta': 0.38}

   And one where the present-bias parameter is not specified:

   >>> params = pd.Series(dtype="float64")
   >>> optim_paras = {"delta": 0.95}
   >>> _parse_present_bias_parameter(optim_paras, params)
   {'delta': 0.95, 'beta': 1, 'beta_delta': 0.95}


.. py:function:: _parse_exogenous_processes(optim_paras, params)

   Parse exogenous processes.


.. py:function:: _parse_observables(optim_paras, params)

   Parse observed variables and their levels.


.. py:function:: _parse_choices(optim_paras, params, options)

   Define a unique order of choices.

   This function defines a unique order of choices. Choices can be separated into
   choices with experience and wage, choices with experience but without wage, and
   choices without experience and wage. This distinction is used to create a unique
   ordering of choices. Within each group, we order alphabetically.


.. py:function:: _parse_choice_parameters(optim_paras, params)

   Parse utility parameters for choices.


.. py:function:: _parse_initial_and_max_experience(optim_paras, params, options)

   Process initial experience distributions and maximum experience.


.. py:function:: _parse_shocks(optim_paras, params)

   Parse the shock parameters and create the Cholesky factor.
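   As a small illustration of the relationship (not taken from the respy source): for
   a covariance matrix of the utility shocks, the lower-triangular Cholesky factor
   ``L`` satisfies ``cov = L @ L.T`` and turns uncorrelated standard normal draws into
   correlated ones.

   .. code-block:: python

      import numpy as np

      # Hypothetical shock covariance matrix for a model with three choices.
      cov = np.array([[0.2, 0.0, 0.05], [0.0, 0.3, 0.0], [0.05, 0.0, 0.1]])

      shocks_cholesky = np.linalg.cholesky(cov)
      np.allclose(shocks_cholesky @ shocks_cholesky.T, cov)  # True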

.. py:function:: _parse_measurement_errors(optim_paras, params)

   Parse the standard deviations of measurement errors.

   Measurement errors can be provided for all or none of the choices with wages.
   Measurement errors for non-wage choices are neglected.

   `optim_paras["has_meas_error"]` is only False if there are no standard deviations
   of measurement errors in `params`, not if they are all zero. Otherwise, we would
   introduce a kink into the likelihood function.


.. py:function:: _parse_types(optim_paras, params)

   Parse type shifts and type parameters.

   It is not explicitly enforced that all types have the same covariates, but it is
   implicitly enforced that the parameters form a valid matrix.


.. py:function:: _infer_number_of_types(params)

   Infer the number of types from parameters which is one by default.

   .. rubric:: Examples

   An example without types:

   >>> tuples = [("wage_a", "constant"), ("nonpec_edu", "exp_edu")]
   >>> index = pd.MultiIndex.from_tuples(tuples, names=["category", "name"])
   >>> s = pd.Series(index=index, dtype="object")
   >>> _infer_number_of_types(s)
   1

   And one with types:

   >>> tuples = [("wage_a", "type_3"), ("nonpec_edu", "type_2")]
   >>> index = pd.MultiIndex.from_tuples(tuples, names=["category", "name"])
   >>> s = pd.Series(index=index, dtype="object")
   >>> _infer_number_of_types(s)
   4


.. py:function:: _infer_choices_with_experience(params, options)

   Infer choices with experiences.

   .. rubric:: Examples

   >>> options = {"covariates": {"a": "exp_white_collar + exp_a", "b": "exp_b >= 2"}}
   >>> index = pd.MultiIndex.from_product([["category"], ["a", "b"]])
   >>> params = pd.Series(index=index, dtype="object")
   >>> _infer_choices_with_experience(params, options)
   ['a', 'b', 'white_collar']


.. py:function:: _infer_choices_with_prefix(params, prefix)

   Infer choices with prefix.

   .. rubric:: Examples

   >>> params = pd.Series(
   ...     index=["wage_b", "wage_white_collar", "wage_a", "nonpec_c"], dtype="object"
   ... )
   >>> _infer_choices_with_prefix(params, "wage")
   ['a', 'b', 'white_collar']


.. py:function:: _parse_lagged_choices(optim_paras, options, params)

   Parse lagged choices from covariates and params.

   Lagged choices can only influence the behavior of individuals through covariates
   of the utility function. Thus, check the covariates for any patterns like
   `"lagged_choice_[0-9]+"`.

   Then, compare the number of lags required by covariates with the information on
   lagged choices in the parameter specification.

   For the estimation, there does not have to be any information on lagged choices.
   For the simulation, we need parameters to define the probability of a choice being
   the lagged choice.

   .. warning::

      UserWarning
          If not enough lagged choices are specified in params and the model can only
          be used for estimation.

      UserWarning
          If the model contains superfluous definitions of lagged choices.


.. py:function:: _parse_probabilities_or_logit_coefficients(params, regex_for_levels)

   Parse probabilities or logit coefficients of parameter groups.

   Some parameters form a group to specify a distribution. On the one hand, the
   parameters can be probabilities from a probability mass function. For example, see
   the specification of initial years of schooling in the extended model of Keane and
   Wolpin (1997). On the other hand, parameters and their corresponding covariates can
   form the inputs of a :func:`scipy.special.softmax` which generates the probability
   mass function. This distribution can be more complex.

   Internally, probabilities are also converted to logit coefficients to align the
   interfaces. To convert probabilities to the appropriate multinomial logit (softmax)
   coefficients, use a constant for covariates and note that the sum in the
   denominator is equal for all probabilities and, thus, can be treated as a constant.
   The following formula shows that the multinomial coefficients which produce the
   same probability mass function are equal to the logs of the probabilities.

   .. math::

      p_i &= \frac{e^{x_i \beta_i}}{\sum_j e^{x_j \beta_j}} \\
          &= \frac{e^{\beta_i}}{\sum_j e^{\beta_j}} \\
      \log(p_i) &= \beta_i - \log(\sum_j e^{\beta_j}) \\
                &= \beta_i - C

   :Raises:

       :obj:`ValueError`
           If probabilities and multinomial logit coefficients are mixed.

   .. warning::

      The user is warned if the discrete probabilities of a probability mass function
      do not sum to one.
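   The equivalence can be checked numerically. This is a small illustration, not part
   of the respy code:

   .. code-block:: python

      import numpy as np
      from scipy.special import softmax

      # Probabilities of a probability mass function, e.g. over initial schooling.
      probabilities = np.array([0.5, 0.3, 0.2])

      # With a constant as the only covariate, coefficients equal to the log
      # probabilities reproduce the same distribution; the constant C cancels
      # inside the softmax.
      coefficients = np.log(probabilities)
      np.allclose(softmax(coefficients), probabilities)  # True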

.. py:function:: _parse_observable_or_exog_process_names(params, keyword)

   Parse the names of observables or exogenous processes.

   The function accepts `params` and a `keyword` like `observable` and separates the
   name of each variable from its possible realizations.

   :Parameters:

       **params** : :obj:`pandas.Series`
           Contains the parameters of a model.

       **keyword** : {"exogenous_process", "observable"}
           Keyword for a group of parameters.

   .. rubric:: Examples

   >>> index = pd.MultiIndex.from_tuples([
   ...     ("observable_observable_0_first", "probability"),
   ...     ("observable_observable_0_second", "probability"),
   ...     ("observable_observable_1_first", "probability"),
   ...     ("observable_observable_1_second", "probability"),
   ...     ("observable_children_two_or_less", "probability"),
   ...     ("observable_children_more_than_two", "probability"),
   ... ], names=["category", "name"])
   >>> params = pd.Series(index=index, dtype="object")
   >>> _parse_observable_or_exog_process_names(params, "observable")
   ['children', 'observable_0', 'observable_1']


.. py:function:: _sync_optim_paras_and_options(optim_paras, options)

   Sync ``optim_paras`` and ``options`` after they have been parsed separately.


.. py:function:: _add_type_covariates(options, optim_paras)

   Add type covariates.

   Since types only introduce constant shifts in the utility functions, this function
   conveniently adds covariates for each type by default.

   .. rubric:: Examples

   >>> options = {"covariates": {}}
   >>> optim_paras = {"n_types": 2}
   >>> _add_type_covariates(options, optim_paras)
   {'covariates': {'type_1': 'type == 1'}}


.. py:function:: _add_default_is_inadmissible(options, optim_paras)

   Add default negative choice set constraints.

   This function adds negative choice set conditions based on maximum experience and
   no constraints for choices without experience.


.. py:function:: _convert_labels_in_formulas_to_codes(options, optim_paras)

   Convert labels in covariates, filters, and inadmissible formulas to codes.

   Characteristics with labels are either choices or observables. Choices are ordered
   as in ``optim_paras["choices"]`` and observables alphabetically.

   Labels can be wrapped in single or double quotes, and both forms have to be
   checked.


.. py:function:: _replace_in_single_or_double_quotes(val, from_, to)

   Replace a value in a string enclosed in single or double quotes.


.. py:function:: _replace_choices_and_observables_in_formula(formula, optim_paras)

   Replace choices and observables in a formula.

   Choices and levels of an observable can have string identifiers which are replaced
   with their codes.


.. py:function:: _convert_labels_in_filters_to_codes(optim_paras, options)

   Convert labels in `"core_state_space_filters"` to codes.

   The filters are used to remove states from the state space which are inadmissible
   anyway. A filter might look like this::

       "lagged_choice_1 == '{choice_w_wage}' and exp_{choice_w_wage} == 0"

   `{choice_w_wage}` is replaced by the actual choice name whereas
   `'{choice_w_wage}'` or `"{choice_w_wage}"` is replaced with the internal choice
   code.


.. py:function:: _parse_cache_directory(options)

   Parse the location of the cache.
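The substitution described for `_replace_in_single_or_double_quotes` and
`_convert_labels_in_filters_to_codes` can be made concrete with a small, hypothetical
re-implementation of the idea (this is not the respy code):

.. code-block:: python

   import re


   def replace_in_quotes(string, from_, to):
       """Replace ``'from_'`` or ``"from_"``, including the quotes, with ``to``."""
       return re.sub("['\"]" + re.escape(from_) + "['\"]", str(to), string)


   # The quoted label is mapped to the internal choice code while the unquoted
   # occurrence in the experience covariate keeps the choice name.
   filter_ = "lagged_choice_1 == 'edu' and exp_edu == 0"
   replace_in_quotes(filter_, "edu", 1)  # "lagged_choice_1 == 1 and exp_edu == 0"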