:py:mod:`respy.shared`
======================

.. py:module:: respy.shared

.. autoapi-nested-parse::

   Contains functions which are shared across other modules.

   This module should only import from other packages or from modules of respy which
   themselves do not import from respy. This prevents circular imports.

   .. !! processed by numpydoc !!


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   respy.shared.aggregate_keane_wolpin_utility
   respy.shared.create_base_draws
   respy.shared.transform_base_draws_with_cholesky_factor
   respy.shared.generate_column_dtype_dict_for_estimation
   respy.shared.downcast_to_smallest_dtype
   respy.shared.compute_covariates
   respy.shared.convert_labeled_variables_to_codes
   respy.shared.rename_labels_to_internal
   respy.shared.rename_labels_from_internal
   respy.shared.normalize_probabilities
   respy.shared.calculate_value_functions_and_flow_utilities
   respy.shared.create_core_state_space_columns
   respy.shared.create_dense_state_space_columns
   respy.shared.create_dense_choice_state_space_columns
   respy.shared.create_state_space_columns
   respy.shared.calculate_expected_value_functions
   respy.shared.convert_dictionary_keys_to_dense_indices
   respy.shared.subset_cholesky_factor_to_choice_set
   respy.shared.return_core_dense_key
   respy.shared.pandas_dot
   respy.shared.map_observations_to_states
   respy.shared.map_states_to_core_key_and_core_index
   respy.shared._map_observations_to_dense_index
   respy.shared.dump_objects
   respy.shared.load_objects
   respy.shared._create_file_name_from_complex_index
   respy.shared.prepare_cache_directory
   respy.shared.select_valid_choices
   respy.shared.apply_law_of_motion_for_core
   respy.shared.get_choice_set_from_complex
   respy.shared.get_exogenous_from_dense_covariates


.. py:function:: aggregate_keane_wolpin_utility(wage, nonpec, continuation_value, draw, delta)

   Calculate the utility of Keane and Wolpin models.

   Note that the function works for working and non-working alternatives because wages
   are set to one for non-working alternatives, so that the draws enter the utility
   function additively.

   :Parameters:

       **wage** : :class:`python:float`
           Value of the wage component. Note that for non-working alternatives this
           value is actually zero, but to simplify computations it is set to one.

       **nonpec** : :class:`python:float`
           Value of the non-pecuniary component.

       **continuation_value** : :class:`python:float`
           Value of the continuation value, i.e., the expected present value of the
           following state.

       **draw** : :class:`python:float`
           The shock which enters the reward of working alternatives multiplicatively
           and of non-working alternatives additively.

       **delta** : :class:`python:float`
           The discount factor used to calculate the present value of continuation
           values.

   :Returns:

       **alternative_specific_value_function** : :class:`python:float`
           The expected present value of an alternative.

       **flow_utility** : :class:`python:float`
           The immediate reward of an alternative.

   .. !! processed by numpydoc !!
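The aggregation described above boils down to one formula: the flow utility is the wage
times the draw plus the non-pecuniary reward, and the value function adds the discounted
continuation value (see also the equations under :func:`calculate_expected_value_functions`
below). The following is a minimal, illustrative sketch of that formula in plain Python;
the function name is hypothetical and this is not respy's own implementation.

.. code-block:: python

   def aggregate_utility_sketch(wage, nonpec, continuation_value, draw, delta):
       # For non-working alternatives the wage is set to one, so the draw enters
       # additively; for working alternatives it scales the wage.
       flow_utility = wage * draw + nonpec
       value_function = flow_utility + delta * continuation_value
       return value_function, flow_utility


   # A working alternative with a positive shock and a discount factor of 0.95.
   vf, fu = aggregate_utility_sketch(
       wage=2.0, nonpec=-0.5, continuation_value=10.0, draw=1.1, delta=0.95
   )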
.. py:function:: create_base_draws(shape, seed, monte_carlo_sequence)

   Create a set of draws from the standard normal distribution.

   The draws are either drawn randomly or from quasi-random low-discrepancy sequences,
   i.e., Sobol or Halton.

   `"random"` is used to draw random standard normal shocks for the Monte Carlo
   integrations and for the random shocks individuals face in the simulation.

   `"halton"` or `"sobol"` can be used to change the sequence for the two Monte Carlo
   integrations: the calculation of the expected value function (EMAX) in the solution
   and of the choice probabilities in the maximum likelihood estimation.

   For the solution and estimation it is necessary to have the same randomness in every
   iteration. Otherwise, there is chatter in the simulation, i.e., a difference in
   simulated values that is due not only to different parameters but also to different
   draws (see 10.5 in [R458a0fae971e-1]_). At the same time, the variance-covariance
   matrix of the shocks is estimated along with all other parameters and changes in
   every iteration. Thus, instead of sampling draws from a varying multivariate normal
   distribution, standard normal draws are sampled here and transformed to the
   distribution specified by the parameters in
   :func:`transform_base_draws_with_cholesky_factor`.

   :Parameters:

       **shape** : :class:`python:tuple`\(:class:`python:int`)
           Tuple representing the shape of the resulting array.

       **seed** : :class:`python:int`
           Seed to control randomness.

       **monte_carlo_sequence** : {"random", "halton", "sobol"}
           Name of the sequence.

   :Returns:

       **draws** : :obj:`numpy.ndarray`
           Array with shape (n_choices, n_draws, n_choices).

   .. seealso::

      :obj:`transform_base_draws_with_cholesky_factor`

   .. rubric:: References

   .. [R458a0fae971e-1] Train, K. (2009). *Discrete Choice Methods with Simulation*.
      Cambridge: Cambridge University Press.

   .. [R458a0fae971e-2] Lemieux, C. (2009). *Monte Carlo and Quasi-Monte Carlo
      Sampling*. New York: Springer Verlag New York.

   .. only:: latex

      [R458a0fae971e-1]_, [R458a0fae971e-2]_

   .. !! processed by numpydoc !!

.. py:function:: transform_base_draws_with_cholesky_factor(draws, choice_set, shocks_cholesky, optim_paras)

   Transform standard normal draws with the Cholesky factor.

   The standard normal draws are transformed to normal draws with variance-covariance
   matrix :math:`\Sigma` by multiplication with the Cholesky factor :math:`L` where
   :math:`L^TL = \Sigma`. See chapter 7.4 in [R77891ce50a9f-1]_ for more information.

   This function relates to :func:`create_base_draws` in the sense that it transforms
   the unchanging standard normal draws to the distribution with the variance-covariance
   matrix specified by the parameters.

   .. seealso::

      :obj:`create_base_draws`

   .. rubric:: References

   .. [R77891ce50a9f-1] Gentle, J. E. (2009). *Computational Statistics* (Vol. 308).
      New York: Springer.

   .. only:: latex

      [R77891ce50a9f-1]_

   .. !! processed by numpydoc !!
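The relationship between :func:`create_base_draws` and
:func:`transform_base_draws_with_cholesky_factor` can be illustrated with plain NumPy.
The covariance matrix, seed, and shapes below are made up for illustration; respy's
function additionally takes the admissible choice set and the model parameters into
account. Note that :func:`numpy.linalg.cholesky` returns the lower-triangular factor,
so here ``shocks_cholesky @ shocks_cholesky.T`` equals the covariance matrix.

.. code-block:: python

   import numpy as np

   # A hypothetical variance-covariance matrix of the shocks and its Cholesky factor.
   shocks_cov = np.array([[1.0, 0.5], [0.5, 2.0]])
   shocks_cholesky = np.linalg.cholesky(shocks_cov)  # lower triangular

   # Unchanging standard normal base draws, as produced by create_base_draws.
   base_draws = np.random.default_rng(seed=0).standard_normal((10_000, 2))

   # Multiplying by the transposed Cholesky factor yields draws whose covariance
   # is (approximately) shocks_cov.
   transformed_draws = base_draws @ shocks_cholesky.T

   np.allclose(np.cov(transformed_draws, rowvar=False), shocks_cov, atol=0.1)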
.. py:function:: generate_column_dtype_dict_for_estimation(optim_paras)

   Generate column labels for data necessary for the estimation.

   .. !! processed by numpydoc !!

.. py:function:: downcast_to_smallest_dtype(series, downcast_options=None)

   Downcast the dtype of a :class:`pandas.Series` to the smallest possible dtype.

   By default, variables are converted to signed or unsigned integers. Use ``"float"``
   to cast variables from ``float64`` to ``float32``.

   Be aware that NumPy integers silently overflow, which is why conversion to low
   dtypes should be done after calculations. For example, squaring the elements of a
   :class:`numpy.uint8` array silently overflows whenever a result exceeds 255.

   For more information on the dtype boundaries see the NumPy documentation under
   https://docs.scipy.org/doc/numpy-1.17.0/user/basics.types.html.

   .. !! processed by numpydoc !!

.. py:function:: compute_covariates(df, definitions, check_nans=False, raise_errors=True)

   Compute covariates.

   The function iterates over the definitions of covariates and tries to compute them.
   It keeps track of how many covariates still need to be computed and stops if the
   number does not change anymore, which might be due to missing information.

   :Parameters:

       **df** : :obj:`pandas.DataFrame`
           DataFrame with some, but not necessarily all, state space dimensions like
           period and experiences.

       **definitions** : :class:`python:dict`
           Keys represent covariates and values are strings passed to ``df.eval``.

       **check_nans** : :class:`python:bool`, default :data:`python:False`
           If true, check that the variables used to compute the selected covariate do
           not contain any ``np.nan``. This is necessary in
           :func:`respy.simulate._sample_characteristic` where some characteristics may
           contain missing values.

       **raise_errors** : :class:`python:bool`, default :data:`python:True`
           Whether to raise errors if variables cannot be computed. This option is
           necessary for, e.g., :func:`~respy.simulate._sample_characteristic` where
           not all necessary variables exist and it is not easy to exclude covariates
           which depend on them.

   :Returns:

       **covariates** : :obj:`pandas.DataFrame`
           DataFrame with shape (n_states, n_covariates).

   :Raises:

       :obj:`Exception`
           If variables cannot be computed and ``raise_errors`` is true.

   .. !! processed by numpydoc !!
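The retry logic of :func:`compute_covariates` can be pictured with a small,
self-contained sketch. The covariate definitions and column names below are made up for
illustration, and the sketch omits the ``check_nans`` and ``raise_errors`` behavior
described above; it is not respy's implementation.

.. code-block:: python

   import pandas as pd

   df = pd.DataFrame({"period": [3, 7], "exp_edu": [10, 12]})

   # Keys are covariate names, values are expressions passed to ``df.eval``.
   definitions = {
       "graduate": "exp_edu_square >= 144",  # depends on a covariate defined below
       "exp_edu_square": "exp_edu ** 2",
   }

   to_compute = dict(definitions)
   while to_compute:
       n_before = len(to_compute)
       for name, formula in list(to_compute.items()):
           try:
               df[name] = df.eval(formula)  # make it available to later definitions
               del to_compute[name]
           except Exception:
               pass  # a required variable is still missing; retry in the next pass
       if len(to_compute) == n_before:
           break  # no progress anymore, e.g., because of missing information

   covariates = df[list(definitions)]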
.. py:function:: convert_labeled_variables_to_codes(df, optim_paras)

   Convert labeled variables to codes.

   We need to check choice variables and observables for potential labels. The mapping
   from labels to codes can be inferred from the order in ``optim_paras``.

   .. !! processed by numpydoc !!

.. py:function:: rename_labels_to_internal(x)

   Shorten labels and convert them to lower-case.

   .. !! processed by numpydoc !!

.. py:function:: rename_labels_from_internal(x)

   Convert shortened, lower-case internal labels back to their external representation.

   .. !! processed by numpydoc !!

.. py:function:: normalize_probabilities(probabilities)

   Normalize probabilities such that their sum equals one.

   .. rubric:: Examples

   Due to floating-point error, the following `probs` do not sum exactly to one after
   simply dividing by their sum, which is why the explicit normalization is needed.

   >>> probs = np.array([0.3775843411510946, 0.5384246942799851, 0.6522988820635421])
   >>> normalize_probabilities(probs)
   array([0.24075906, 0.34331568, 0.41592526])

   .. !! processed by numpydoc !!

.. py:function:: calculate_value_functions_and_flow_utilities(wage, nonpec, continuation_value, draw, delta, value_function, flow_utility)

   Calculate the choice-specific value functions and flow utilities.

   To apply :func:`aggregate_keane_wolpin_utility` to arrays with arbitrary dimensions,
   this function uses :func:`numba.guvectorize`. One cannot use :func:`numba.vectorize`
   because it does not support multiple return values.

   .. seealso::

      :obj:`aggregate_keane_wolpin_utility`

   .. !! processed by numpydoc !!

.. py:function:: create_core_state_space_columns(optim_paras)

   Create internal column names for the core state space.

   .. !! processed by numpydoc !!

.. py:function:: create_dense_state_space_columns(optim_paras)

   Create internal column names for the dense state space.

   .. !! processed by numpydoc !!

.. py:function:: create_dense_choice_state_space_columns(optim_paras)

   Create internal column names for the dense choice state space.

   .. !! processed by numpydoc !!

.. py:function:: create_state_space_columns(optim_paras)

   Create names of state space dimensions excluding the period and identifier.

   .. !! processed by numpydoc !!

.. py:function:: calculate_expected_value_functions(wages, nonpecs, continuation_values, draws, delta, expected_value_functions)

   Calculate the expected maximum of value functions for a set of unobservables.

   For one state, the function calculates the utility of each choice, the ex-post
   reward, for multiple draws from the distribution of unobservables and adds the
   discounted expected maximum utility of the subsequent period resulting from each
   choice. Averaging the maxima over all draws yields the expected maximum utility of
   this state.

   The underlying process in this function is called Monte Carlo integration. The goal
   is to approximate an integral by evaluating the integrand at randomly chosen points.
   In this setting, one wants to approximate the expected maximum utility of the
   current state.

   Note that ``wages`` has the same length as ``nonpecs`` although wages are only
   available for some choices. Entries for choices without a wage are filled with ones.
   For a choice with a wage and for a choice without a wage, the flow utilities are,
   respectively,

   .. math::

      \text{Flow Utility} = \text{Wage} \cdot \epsilon + \text{Non-pecuniary}

      \text{Flow Utility} = 1 \cdot \epsilon + \text{Non-pecuniary}

   :Parameters:

       **wages** : :obj:`numpy.ndarray`
           Array with shape (n_choices,) containing wages.

       **nonpecs** : :obj:`numpy.ndarray`
           Array with shape (n_choices,) containing non-pecuniary rewards.

       **continuation_values** : :obj:`numpy.ndarray`
           Array with shape (n_choices,) containing the expected maximum utility for
           each choice in the subsequent period.

       **draws** : :obj:`numpy.ndarray`
           Array with shape (n_draws, n_choices).

       **delta** : :class:`python:float`
           The discount factor.

   :Returns:

       **expected_value_functions** : :class:`python:float`
           Expected maximum utility of an agent.

   .. !! processed by numpydoc !!
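The Monte Carlo integration described above amounts to simulating the flow utilities for
every draw, adding the discounted continuation values, taking the maximum over choices,
and averaging over draws. The numbers below are made up for illustration; this is a
sketch, not respy's implementation.

.. code-block:: python

   import numpy as np

   wages = np.array([2.0, 1.0])  # the second choice has no wage and is filled with one
   nonpecs = np.array([-0.5, 0.3])
   continuation_values = np.array([10.0, 9.0])
   delta = 0.95
   draws = np.random.default_rng(seed=0).standard_normal((1_000, 2))

   # Flow utilities per draw and choice: wage * draw + non-pecuniary reward.
   flow_utilities = wages * draws + nonpecs
   value_functions = flow_utilities + delta * continuation_values

   # Averaging the maximum over choices across all draws approximates the expected
   # maximum of the value functions (EMAX).
   expected_value_function = value_functions.max(axis=1).mean()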
.. py:function:: convert_dictionary_keys_to_dense_indices(dictionary)

   Convert the keys to tuples containing integers.

   .. rubric:: Examples

   >>> dictionary = {(0.0, 1): 0, 2: 1}
   >>> convert_dictionary_keys_to_dense_indices(dictionary)
   {(0, 1): 0, (2,): 1}

   .. !! processed by numpydoc !!

.. py:function:: subset_cholesky_factor_to_choice_set(cholesky_factor, choice_set)

   Subset the Cholesky factor to the dimensions required by the admissible choice set.

   .. rubric:: Examples

   >>> m = np.arange(9).reshape(3, 3)
   >>> subset_cholesky_factor_to_choice_set(m, (False, True, False))
   array([[4]])

   .. !! processed by numpydoc !!

.. py:function:: return_core_dense_key(core_idx, dense=False)

   Return core dense keys in the right format.

   .. !! processed by numpydoc !!

.. py:function:: pandas_dot(x, beta, out=None)

   Compute the dot product for a DataFrame and a Series.

   The function computes each product in the dot product separately to limit the impact
   of converting a Series to an array. To access the underlying NumPy array, `.values`
   is used instead of `.to_numpy()` because it is faster; `.to_numpy()` only avoids
   problems with extension arrays, which are not used here.

   :Parameters:

       **x** : :obj:`pandas.DataFrame`
           A DataFrame containing the covariates of the dot product.

       **beta** : :obj:`pandas.Series`
           A Series containing the parameters or coefficients of the dot product.

       **out** : :obj:`numpy.ndarray`, optional
           An output array which is filled instead of allocating a new array.

   :Returns:

       **out** : :obj:`numpy.ndarray`
           Array of length `len(x)` which contains the result of the dot product.

   .. rubric:: Examples

   >>> x = pd.DataFrame(np.arange(10).reshape(5, 2), columns=list("ab"))
   >>> beta = pd.Series([1, 2], index=list("ab"))
   >>> x.dot(beta).to_numpy()
   array([ 2,  8, 14, 20, 26]...
   >>> pandas_dot(x, beta)
   array([ 2.,  8., 14., 20., 26.])

   .. !! processed by numpydoc !!

.. py:function:: map_observations_to_states(states, state_space, optim_paras)

   Map observations in the data to states.

   .. !! processed by numpydoc !!

.. py:function:: map_states_to_core_key_and_core_index(states, indexer)

   Map states to the core key and core index.

   :Parameters:

       **states** : :obj:`numpy.ndarray`
           Multidimensional array containing only the core dimensions of states.

       **indexer** : :obj:`numba.typed.Dict`
           A dictionary with core states as keys and the core key and core index as
           values.

   :Returns:

       **core_key** : :obj:`numpy.ndarray`
           An array containing the core key. See :ref:`core_key`.

       **core_index** : :obj:`numpy.ndarray`
           An array containing the core index. See :ref:`core_indices`.

   .. !! processed by numpydoc !!

.. py:function:: _map_observations_to_dense_index(dense, core_index, dense_covariates_to_dense_index, core_key_and_dense_index_to_dense_key)

.. py:function:: dump_objects(objects, topic, complex_, options)

   Dump states.

   .. !! processed by numpydoc !!

.. py:function:: load_objects(topic, complex_, options)

   Load states.

   .. !! processed by numpydoc !!

.. py:function:: _create_file_name_from_complex_index(topic, complex_)

   Create a file name from a complex index.

   .. !! processed by numpydoc !!

.. py:function:: prepare_cache_directory(options)

   Prepare the cache directory.

   The directory contains the parts of the state space.

   .. !! processed by numpydoc !!

.. py:function:: select_valid_choices(choices, choice_set)

   Select valid choices.

   .. rubric:: Examples

   >>> select_valid_choices(list("abcde"), (1, 0, 1, 0, 1))
   ['a', 'c', 'e']
   >>> select_valid_choices(list("abc"), (0, 1, 0, 1, 0))
   ['b']

   .. !! processed by numpydoc !!

.. py:function:: apply_law_of_motion_for_core(df, optim_paras)

   Apply the law of motion for the core dimensions.

   This function only applies the law of motion for the core dimensions which are the
   period, experiences, and previous choices. Depending on the integer-encoded choice
   in ``df["choice"]``, the new state is computed. (The update is sketched in the
   example at the end of this page.)

   :Parameters:

       **df** : :obj:`pandas.DataFrame`
           The DataFrame contains states with information on the period, experiences,
           and previous choices. The current choice is encoded as an integer in a
           column named ``"choice"``.

       **optim_paras** : :class:`python:dict`
           Contains model parameters.

   :Returns:

       **df** : :obj:`pandas.DataFrame`
           The DataFrame contains the states in the next period.

   .. !! processed by numpydoc !!

.. py:function:: get_choice_set_from_complex(complex_tuple)

   Select the choice set from a complex tuple.

   :Parameters:

       **complex_tuple** : :class:`python:tuple`
           The complex tuple.

   :Returns:

       :class:`python:tuple`
           The choice set as a tuple.

   .. !! processed by numpydoc !!

.. py:function:: get_exogenous_from_dense_covariates(dense_covariates, optim_paras)

   Select exogenous grid points from dense grid points.

   :Parameters:

       **dense_covariates** : :class:`python:tuple`
           Dense covariates grid point.

       **optim_paras** : :class:`python:dict`
           Contains model parameters.

   :Returns:

       :class:`python:tuple`
           The exogenous grid point as a tuple.

   .. !! processed by numpydoc !!
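To make the law of motion described in :func:`apply_law_of_motion_for_core` concrete,
here is a minimal pandas sketch. The column names (``period``, ``exp_a``,
``lagged_choice_1``) and the integer encoding of the choices are assumptions for
illustration only; the actual internal columns are produced by
:func:`create_core_state_space_columns`, and this is not respy's implementation.

.. code-block:: python

   import pandas as pd

   # Two states; choice 0 accumulates experience in "a", choice 1 does not.
   df = pd.DataFrame(
       {"period": [0, 0], "exp_a": [1, 2], "lagged_choice_1": [1, 0], "choice": [0, 1]}
   )

   def law_of_motion_sketch(df):
       df = df.copy()
       # Advance the period.
       df["period"] += 1
       # Individuals who chose alternative 0 gain one period of experience.
       df.loc[df["choice"] == 0, "exp_a"] += 1
       # The current choice becomes the previous choice of the next period.
       df["lagged_choice_1"] = df["choice"]
       return df.drop(columns="choice")

   next_states = law_of_motion_sketch(df)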