:py:mod:`respy.pre_processing.model_processing`
===============================================

.. py:module:: respy.pre_processing.model_processing

.. autoapi-nested-parse::

   Process model specification files or objects.


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   respy.pre_processing.model_processing.process_params_and_options
   respy.pre_processing.model_processing._read_options
   respy.pre_processing.model_processing._create_internal_seeds_from_user_seeds
   respy.pre_processing.model_processing._read_params
   respy.pre_processing.model_processing._parse_parameters
   respy.pre_processing.model_processing._parse_present_bias_parameter
   respy.pre_processing.model_processing._parse_exogenous_processes
   respy.pre_processing.model_processing._parse_observables
   respy.pre_processing.model_processing._parse_choices
   respy.pre_processing.model_processing._parse_choice_parameters
   respy.pre_processing.model_processing._parse_initial_and_max_experience
   respy.pre_processing.model_processing._parse_shocks
   respy.pre_processing.model_processing._parse_measurement_errors
   respy.pre_processing.model_processing._parse_types
   respy.pre_processing.model_processing._infer_number_of_types
   respy.pre_processing.model_processing._infer_choices_with_experience
   respy.pre_processing.model_processing._infer_choices_with_prefix
   respy.pre_processing.model_processing._parse_lagged_choices
   respy.pre_processing.model_processing._parse_probabilities_or_logit_coefficients
   respy.pre_processing.model_processing._parse_observable_or_exog_process_names
   respy.pre_processing.model_processing._sync_optim_paras_and_options
   respy.pre_processing.model_processing._add_type_covariates
   respy.pre_processing.model_processing._add_default_is_inadmissible
   respy.pre_processing.model_processing._convert_labels_in_formulas_to_codes
   respy.pre_processing.model_processing._replace_in_single_or_double_quotes
   respy.pre_processing.model_processing._replace_choices_and_observables_in_formula
   respy.pre_processing.model_processing._convert_labels_in_filters_to_codes
   respy.pre_processing.model_processing._parse_cache_directory


.. py:function:: process_params_and_options(params, options)

   Process `params` and `options`.

   This function is the interface for parsing the model specification given by the
   user.
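   A minimal usage sketch (the ``kw_94_one`` example model and the
   ``get_example_model`` helper are assumptions used only for illustration):

   .. code-block:: python

      import respy as rp
      from respy.pre_processing.model_processing import process_params_and_options

      # Load an example parameter vector and options dictionary.
      params, options = rp.get_example_model("kw_94_one", with_data=False)

      # The function returns the internal model representation: a dictionary of
      # parsed parameters (optim_paras) and the post-processed options.
      optim_paras, options = process_params_and_options(params, options)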

.. py:function:: _read_options(dict_or_path)

   Read the options which can either be a dictionary or a path.


.. py:function:: _create_internal_seeds_from_user_seeds(options)

   Create internal seeds from user input.

   Instead of reusing the same seed, we use sequences of seeds incrementing by one.
   This ensures that we do not accidentally draw the same randomness twice.

   As naive sequences started at the seeds given by the user might overlap, the user
   seeds are used to generate seeds within certain ranges. The seed for the

   - solution is between 1,000,000 and 2,000,000.
   - simulation is between 4,000,000 and 5,000,000.
   - likelihood estimation is between 7,000,000 and 8,000,000.

   Furthermore, we need two sequences of seeds. The first sequence is for building
   :func:`~respy.simulate.simulate` or :func:`~respy.likelihood.log_like` where
   `"startup"` seeds are used to generate the draws. The second sequence starts at
   ``seed_start + SEED_STARTUP_ITERATION_GAP`` and has to be reset to the initial
   value at the beginning of every iteration.

   See :ref:`randomness-and-reproducibility` for more information.

   .. rubric:: Examples

   >>> options = {"solution_seed": 1, "simulation_seed": 2, "estimation_seed": 3}
   >>> options = _create_internal_seeds_from_user_seeds(options)
   >>> options["solution_seed_startup"], options["solution_seed_iteration"]
   (count(1128037), count(2128037))
   >>> options["simulation_seed_startup"], options["simulation_seed_iteration"]
   (count(4875688), count(5875688))
   >>> options["estimation_seed_startup"], options["estimation_seed_iteration"]
   (count(7071530), count(8071530))


.. py:function:: _read_params(df_or_series)

   Read the parameters which can either be a path, a Series, or a DataFrame.


.. py:function:: _parse_parameters(params, options)

   Parse the parameter vector into a dictionary of model quantities.


.. py:function:: _parse_present_bias_parameter(optim_paras, params)

   Parse the present-bias parameter which is 1 by default.

   .. rubric:: Examples

   An example where the present-bias parameter is specified:

   >>> tuples = [("beta", "beta")]
   >>> index = pd.MultiIndex.from_tuples(tuples, names=["category", "name"])
   >>> params = pd.Series(data=0.4, index=index)
   >>> optim_paras = {"delta": 0.95}
   >>> _parse_present_bias_parameter(optim_paras, params)
   {'delta': 0.95, 'beta': 0.4, 'beta_delta': 0.38}

   And one where the present-bias parameter is not specified:

   >>> params = pd.Series(dtype="float64")
   >>> optim_paras = {"delta": 0.95}
   >>> _parse_present_bias_parameter(optim_paras, params)
   {'delta': 0.95, 'beta': 1, 'beta_delta': 0.95}


.. py:function:: _parse_exogenous_processes(optim_paras, params)

   Parse exogenous processes.


.. py:function:: _parse_observables(optim_paras, params)

   Parse observed variables and their levels.


.. py:function:: _parse_choices(optim_paras, params, options)

   Define a unique order of choices.

   This function defines a unique order of choices. Choices can be separated into
   choices with experience and wage, choices with experience but without wage, and
   choices without experience and wage. This distinction is used to create a unique
   ordering of choices. Within each group, we order alphabetically.


.. py:function:: _parse_choice_parameters(optim_paras, params)

   Parse utility parameters for choices.


.. py:function:: _parse_initial_and_max_experience(optim_paras, params, options)

   Process initial experience distributions and maximum experience.


.. py:function:: _parse_shocks(optim_paras, params)

   Parse the shock parameters and create the Cholesky factor.
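   As a small illustration of the relationship (not taken from the respy source): for
   a covariance matrix of the utility shocks, the lower-triangular Cholesky factor
   ``L`` satisfies ``cov = L @ L.T`` and turns uncorrelated standard normal draws into
   correlated ones.

   .. code-block:: python

      import numpy as np

      # Hypothetical shock covariance matrix for a model with three choices.
      cov = np.array([[0.2, 0.0, 0.05], [0.0, 0.3, 0.0], [0.05, 0.0, 0.1]])

      shocks_cholesky = np.linalg.cholesky(cov)
      np.allclose(shocks_cholesky @ shocks_cholesky.T, cov)  # True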

.. py:function:: _parse_measurement_errors(optim_paras, params)

   Parse the standard deviations of measurement errors.

   Measurement errors can be provided for all or none of the choices with wages.
   Measurement errors for non-wage choices are neglected.

   `optim_paras["has_meas_error"]` is only False if there are no standard deviations
   of measurement errors in `params`, not if they are all zero. Otherwise, we would
   introduce a kink into the likelihood function.


.. py:function:: _parse_types(optim_paras, params)

   Parse type shifts and type parameters.

   It is not explicitly enforced that all types have the same covariates, but it is
   implicitly enforced that the parameters form a valid matrix.


.. py:function:: _infer_number_of_types(params)

   Infer the number of types from parameters which is one by default.

   .. rubric:: Examples

   An example without types:

   >>> tuples = [("wage_a", "constant"), ("nonpec_edu", "exp_edu")]
   >>> index = pd.MultiIndex.from_tuples(tuples, names=["category", "name"])
   >>> s = pd.Series(index=index, dtype="object")
   >>> _infer_number_of_types(s)
   1

   And one with types:

   >>> tuples = [("wage_a", "type_3"), ("nonpec_edu", "type_2")]
   >>> index = pd.MultiIndex.from_tuples(tuples, names=["category", "name"])
   >>> s = pd.Series(index=index, dtype="object")
   >>> _infer_number_of_types(s)
   4


.. py:function:: _infer_choices_with_experience(params, options)

   Infer choices with experiences.

   .. rubric:: Examples

   >>> options = {"covariates": {"a": "exp_white_collar + exp_a", "b": "exp_b >= 2"}}
   >>> index = pd.MultiIndex.from_product([["category"], ["a", "b"]])
   >>> params = pd.Series(index=index, dtype="object")
   >>> _infer_choices_with_experience(params, options)
   ['a', 'b', 'white_collar']


.. py:function:: _infer_choices_with_prefix(params, prefix)

   Infer choices with prefix.

   .. rubric:: Examples

   >>> params = pd.Series(
   ...     index=["wage_b", "wage_white_collar", "wage_a", "nonpec_c"], dtype="object"
   ... )
   >>> _infer_choices_with_prefix(params, "wage")
   ['a', 'b', 'white_collar']


.. py:function:: _parse_lagged_choices(optim_paras, options, params)

   Parse lagged choices from covariates and params.

   Lagged choices can only influence the behavior of individuals through covariates
   of the utility function. Thus, check the covariates for any patterns like
   `"lagged_choice_[0-9]+"`.

   Then, compare the number of lags required by covariates with the information on
   lagged choices in the parameter specification.

   For the estimation, there does not have to be any information on lagged choices.
   For the simulation, we need parameters to define the probability of a choice being
   the lagged choice.

   .. warning::

      UserWarning
          If not enough lagged choices are specified in params and the model can only
          be used for estimation.

      UserWarning
          If the model contains superfluous definitions of lagged choices.


.. py:function:: _parse_probabilities_or_logit_coefficients(params, regex_for_levels)

   Parse probabilities or logit coefficients of parameter groups.

   Some parameters form a group to specify a distribution. On the one hand, the
   parameters can be probabilities from a probability mass function. For example, see
   the specification of initial years of schooling in the extended model of Keane and
   Wolpin (1997). On the other hand, parameters and their corresponding covariates can
   form the inputs of a :func:`scipy.special.softmax` which generates the probability
   mass function. This distribution can be more complex.

   Internally, probabilities are also converted to logit coefficients to align the
   interfaces. To convert probabilities to the appropriate multinomial logit (softmax)
   coefficients, use a constant for covariates and note that the sum in the
   denominator is equal for all probabilities and, thus, can be treated as a constant.
   The following formula shows that the multinomial coefficients which produce the
   same probability mass function are equal to the logs of the probabilities.

   .. math::

      p_i &= \frac{e^{x_i \beta_i}}{\sum_j e^{x_j \beta_j}} \\
          &= \frac{e^{\beta_i}}{\sum_j e^{\beta_j}} \\
      \log(p_i) &= \beta_i - \log(\sum_j e^{\beta_j}) \\
                &= \beta_i - C

   :Raises:

       :obj:`ValueError`
           If probabilities and multinomial logit coefficients are mixed.

   .. warning::

      The user is warned if the discrete probabilities of a probability mass function
      do not sum to one.
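   The equivalence can be checked numerically. This is a small illustration, not part
   of the respy code:

   .. code-block:: python

      import numpy as np
      from scipy.special import softmax

      # Probabilities of a probability mass function, e.g. over initial schooling.
      probabilities = np.array([0.5, 0.3, 0.2])

      # With a constant as the only covariate, coefficients equal to the log
      # probabilities reproduce the same distribution; the constant C cancels
      # inside the softmax.
      coefficients = np.log(probabilities)
      np.allclose(softmax(coefficients), probabilities)  # True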

.. py:function:: _parse_observable_or_exog_process_names(params, keyword)

   Parse the names of observables or exogenous processes.

   The function accepts `params` and a `keyword` like `observable` and separates the
   name of each variable from its possible realizations.

   :Parameters:

       **params** : :obj:`pandas.Series`
           Contains the parameters of a model.

       **keyword** : {"exogenous_process", "observable"}
           Keyword for a group of parameters.

   .. rubric:: Examples

   >>> index = pd.MultiIndex.from_tuples([
   ...     ("observable_observable_0_first", "probability"),
   ...     ("observable_observable_0_second", "probability"),
   ...     ("observable_observable_1_first", "probability"),
   ...     ("observable_observable_1_second", "probability"),
   ...     ("observable_children_two_or_less", "probability"),
   ...     ("observable_children_more_than_two", "probability"),
   ... ], names=["category", "name"])
   >>> params = pd.Series(index=index, dtype="object")
   >>> _parse_observable_or_exog_process_names(params, "observable")
   ['children', 'observable_0', 'observable_1']


.. py:function:: _sync_optim_paras_and_options(optim_paras, options)

   Sync ``optim_paras`` and ``options`` after they have been parsed separately.


.. py:function:: _add_type_covariates(options, optim_paras)

   Add type covariates.

   Since types only introduce constant shifts in the utility functions, this function
   conveniently adds covariates for each type by default.

   .. rubric:: Examples

   >>> options = {"covariates": {}}
   >>> optim_paras = {"n_types": 2}
   >>> _add_type_covariates(options, optim_paras)
   {'covariates': {'type_1': 'type == 1'}}


.. py:function:: _add_default_is_inadmissible(options, optim_paras)

   Add default negative choice set constraints.

   This function adds negative choice set conditions based on maximum experience and
   no constraints for choices without experience.


.. py:function:: _convert_labels_in_formulas_to_codes(options, optim_paras)

   Convert labels in covariates, filters, and inadmissible formulas to codes.

   Characteristics with labels are either choices or observables. Choices are ordered
   as in ``optim_paras["choices"]`` and observables alphabetically.

   Labels can be wrapped in single or double quotes, and both forms have to be
   checked.


.. py:function:: _replace_in_single_or_double_quotes(val, from_, to)

   Replace a value in a string enclosed in single or double quotes.


.. py:function:: _replace_choices_and_observables_in_formula(formula, optim_paras)

   Replace choices and observables in a formula.

   Choices and levels of an observable can have string identifiers which are replaced
   with their codes.


.. py:function:: _convert_labels_in_filters_to_codes(optim_paras, options)

   Convert labels in `"core_state_space_filters"` to codes.

   The filters are used to remove states from the state space which are inadmissible
   anyway. A filter might look like this::

       "lagged_choice_1 == '{choice_w_wage}' and exp_{choice_w_wage} == 0"

   `{choice_w_wage}` is replaced by the actual choice name whereas
   `'{choice_w_wage}'` or `"{choice_w_wage}"` is replaced with the internal choice
   code.


.. py:function:: _parse_cache_directory(options)

   Parse the location of the cache.
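The substitution described for `_replace_in_single_or_double_quotes` and
`_convert_labels_in_filters_to_codes` can be made concrete with a small, hypothetical
re-implementation of the idea (this is not the respy code):

.. code-block:: python

   import re


   def replace_in_quotes(string, from_, to):
       """Replace ``'from_'`` or ``"from_"``, including the quotes, with ``to``."""
       return re.sub("['\"]" + re.escape(from_) + "['\"]", str(to), string)


   # The quoted label is mapped to the internal choice code while the unquoted
   # occurrence in the experience covariate keeps the choice name.
   filter_ = "lagged_choice_1 == 'edu' and exp_edu == 0"
   replace_in_quotes(filter_, "edu", 1)  # "lagged_choice_1 == 1 and exp_edu == 0"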