:py:mod:`respy.simulate`
========================

.. py:module:: respy.simulate

.. autoapi-nested-parse::

   Everything related to the simulation of data with structural models.

   ..
       !! processed by numpydoc !!


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   respy.simulate.get_simulate_func
   respy.simulate.simulate
   respy.simulate.apply_law_of_motion_for_dense
   respy.simulate.update_dense_state_variables
   respy.simulate._extend_data_with_sampled_characteristics
   respy.simulate._simulate_single_period
   respy.simulate.draw_dense_key_next_period
   respy.simulate._sample_characteristic
   respy.simulate._convert_codes_to_original_labels
   respy.simulate._process_simulation_output
   respy.simulate._random_choice
   respy.simulate._harmonize_simulation_arguments
   respy.simulate._process_input_df_for_simulation


.. py:function:: get_simulate_func(params, options, method='n_step_ahead_with_sampling', df=None, n_simulation_periods=None)

   
   Get the simulation function.

   Return :func:`simulate` where all arguments except the parameter vector are fixed
   with :func:`functools.partial`. Thus, the function can be directly passed into an
   optimizer for estimation with simulated method of moments or other techniques.

   :Parameters:

       **params** : :obj:`pandas.DataFrame`
           DataFrame containing model parameters.

       **options** : :class:`python:dict`
           Dictionary containing model options.

       **method** : {"n_step_ahead_with_sampling", "n_step_ahead_with_data", "one_step_ahead"}
           The simulation method. The three available methods are explained in more
           detail in :func:`simulate`.

       **df** : :obj:`pandas.DataFrame` or :data:`python:None`, default :data:`python:None`
           DataFrame containing one or multiple observations per individual.

       **n_simulation_periods** : :class:`python:int` or :data:`python:None`, default :data:`python:None`
           Simulate data for a number of periods. This option does not affect
           ``options["n_periods"]`` which controls the number of periods for which decision
           rules are computed.

   :Returns:

       **simulate_function** : :func:`simulate`
           Simulation function where all arguments except the parameter vector are set.


   .. rubric:: Examples

   >>> import respy as rp
   >>> params, options = rp.get_example_model("robinson_crusoe_basic", with_data=False)
   >>> simulate = rp.get_simulate_func(params, options)
   >>> data = simulate(params)


   ..
       !! processed by numpydoc !!
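
The way ``get_simulate_func`` fixes all arguments except the parameter vector can be sketched with :func:`functools.partial`. The function ``toy_simulate`` and its arguments are hypothetical stand-ins, not respy's implementation; the sketch only shows that the resulting callable needs nothing but the parameters.

```python
from functools import partial


def toy_simulate(params, options, method):
    """Hypothetical stand-in for a simulation routine (not respy code)."""
    n_periods = options["n_periods"]
    # Pretend each period's outcome is just a scaled parameter value.
    return [params["delta"] * period for period in range(n_periods)]


# Fix every argument except ``params``.
simulate_func = partial(
    toy_simulate,
    options={"n_periods": 3},
    method="n_step_ahead_with_sampling",
)

# An optimizer could now call the function with candidate parameters only.
result = simulate_func({"delta": 0.5})
```

Because only ``params`` remains free, the same callable can be evaluated repeatedly inside an estimation loop without passing the options again.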

.. py:function:: simulate(params, base_draws_sim, base_draws_wage, df, method, n_simulation_periods, solve, options)

   
   Perform a simulation.

   This function performs one of three possible simulation exercises. The type of the
   simulation is controlled by ``method`` in :func:`get_simulate_func`. Ordered from no
   data to panel data on individuals, there are:

   1. *n-step-ahead simulation with sampling*: The first observation of an individual
      is sampled from the initial conditions, i.e., the distribution of observed
      variables or initial experiences, etc. in the first period. Then, the individuals
      are guided for ``n`` periods by the decision rules from the solution of the
      model.

   2. *n-step-ahead simulation with data*: Instead of sampling individuals from the
      initial conditions, take the first observation of each individual in the data.
      Then, proceed as in 1.

   3. *one-step-ahead simulation*: Take the complete data and find for each observation
      the corresponding outcomes, e.g., choices and wages, using the decision rules from
      the model solution.

   :Parameters:

       **params** : :obj:`pandas.DataFrame` or :obj:`pandas.Series`
           Contains parameters.

       **base_draws_sim** : :obj:`numpy.ndarray`
           Array with shape (n_periods, n_individuals, n_choices) to provide a unique set
           of shocks for each individual in each period.

       **base_draws_wage** : :obj:`numpy.ndarray`
           Array with shape (n_periods, n_individuals, n_choices) to provide a unique set
           of wage measurement errors for each individual in each period.

       **df** : :obj:`pandas.DataFrame` or :data:`python:None`
           Can be one of three objects:

           - :data:`None` if no data is provided. This triggers sampling from initial
             conditions and an n-step-ahead simulation.
           - :class:`pandas.DataFrame` containing panel data on individuals which triggers
             a one-step-ahead simulation.
           - :class:`pandas.DataFrame` containing only first observations which triggers an
             n-step-ahead simulation taking the data as initial conditions.

       **method** : :class:`python:str`
           The simulation method.

       **n_simulation_periods** : :class:`python:int`
           Number of periods to simulate.

       **solve** : :func:`~respy.solve.solve`
           Function which creates the solution of the model with new parameters.

       **options** : :class:`python:dict`
           Contains model options.

   :Returns:

       **simulated_data** : :obj:`pandas.DataFrame`
           DataFrame of simulated individuals.


   ..
       !! processed by numpydoc !!
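
To make the difference between the simulation types of ``simulate`` concrete, the following toy sketch contrasts an n-step-ahead simulation with a one-step-ahead simulation. The decision rule, the single state variable, and all names are hypothetical and far simpler than a solved respy model.

```python
def decision_rule(experience):
    """Hypothetical rule: work (1) below three units of experience, else rest (0)."""
    return 1 if experience < 3 else 0


def n_step_ahead(initial_experience, n_periods):
    """Start from one initial state and let the rule guide the individual forward."""
    experience, path = initial_experience, []
    for period in range(n_periods):
        choice = decision_rule(experience)
        path.append({"period": period, "experience": experience, "choice": choice})
        # Law of motion: working adds one unit of experience.
        experience += choice
    return path


def one_step_ahead(observed_states):
    """Take every observed state as given and only predict its outcome."""
    return [
        {"period": period, "experience": exp, "choice": decision_rule(exp)}
        for period, exp in enumerate(observed_states)
    ]


simulated = n_step_ahead(initial_experience=1, n_periods=4)
predicted = one_step_ahead(observed_states=[1, 1, 2, 5])
```

In the n-step-ahead case the states evolve from the simulated choices, whereas in the one-step-ahead case the states come from the data and only the outcomes are simulated.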

.. py:function:: apply_law_of_motion_for_dense(df, state_space, optim_paras)

   
   Update dense variables if they follow an exogenous process.


   :Parameters:

       **df** : :obj:`pandas.DataFrame`
           A pandas DataFrame containing the updated state variables, as well as the
           draw of the next period's dense key.

       **state_space**
           ..

       **optim_paras**
           ..

   :Returns:

       **df** : :obj:`pandas.DataFrame`
           A pandas DataFrame containing the updated state variables and the updated
           exogenous process.


   ..
       !! processed by numpydoc !!

.. py:function:: update_dense_state_variables(df, dense_key_to_dense_covariates, optim_paras)

   
   Update the value of the exogenous processes.


   :Parameters:

       **df** : :obj:`pandas.DataFrame`
           A pandas DataFrame containing the updated state variables, as well as the
           draw of the next period's dense key.

       **dense_key_to_dense_covariates** : :class:`python:dict`
           Dictionary with dense keys as keys and dense grid points as values.

       **optim_paras** : :class:`python:dict`
           ..

   :Returns:

       **df** : :obj:`pandas.DataFrame`
           A pandas DataFrame containing the updated state variables and the updated
           exogenous process.


   ..
       !! processed by numpydoc !!

.. py:function:: _extend_data_with_sampled_characteristics(df, optim_paras, options)

   
   Sample initial observations from initial conditions.

   The function iterates over all state space dimensions and replaces NaNs with values
   sampled from initial conditions. In the case of an n-step-ahead simulation with
   sampling all state space dimensions are sampled. For the other two simulation
   methods, potential NaNs in the data are replaced with sampled characteristics.

   Characteristics are sampled regardless of the simulation type, which keeps the
   randomness constant across the types.

   :Parameters:

       **df** : :obj:`pandas.DataFrame`
           A pandas DataFrame which contains only an index for n-step-ahead simulation with
           sampling. For the other simulation methods, it contains data on individuals,
           which may have missing values in the first period.

       **optim_paras** : :class:`python:dict`
           ..

       **options** : :class:`python:dict`
           ..

   :Returns:

       **df** : :obj:`pandas.DataFrame`
           A pandas DataFrame with no missing values.


   ..
       !! processed by numpydoc !!
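
The idea behind ``_extend_data_with_sampled_characteristics`` of always sampling and then filling only the missing entries can be sketched in plain Python. The helper ``fill_missing``, the use of ``None`` for missing values, and the uniform sampling are assumptions made for illustration; respy works on DataFrames and samples from the model's initial conditions.

```python
import random


def fill_missing(values, levels, seed):
    """Replace ``None`` entries with sampled levels (hypothetical helper)."""
    rng = random.Random(seed)
    # Draw for every individual, even if the value is observed, so that the
    # random number stream is consumed identically across simulation types.
    draws = [rng.choice(levels) for _ in values]
    return [draw if value is None else value for value, draw in zip(values, draws)]


levels = [0, 1, 2]
# n-step-ahead with sampling: nothing is observed, everything is sampled.
all_sampled = fill_missing([None, None, None, None], levels, seed=0)
# Simulation with data: only the gaps are filled.
partly_observed = fill_missing([2, None, 0, None], levels, seed=0)
```

With the same seed, the individuals with missing values receive identical draws in both calls, which is what keeps the randomness comparable across simulation types.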

.. py:function:: _simulate_single_period(df, complex_tuple, wages, nonpecs, continuation_values, optim_paras, options)

   
   Simulate individuals in a single period.

   The function performs the following steps:

   - Map individuals in one period to the states in the model.
   - Simulate choices and wages for those individuals.
   - Store additional information in a :class:`pandas.DataFrame` and return it.

   So far, this function assumes that there are no mixed constraints.
   See the documentation for more information.


   ..
       !! processed by numpydoc !!

.. py:function:: draw_dense_key_next_period(complex_tuple, core_index, options)

   
   For exogenous processes, draw the dense key for the next period.


   :Parameters:

       **complex_tuple**
           ..

       **core_index**
           ..

       **options**
           ..

   :Returns:

       **dense_key_next_period** : :obj:`pandas.Series`
           A pandas Series containing the dense keys in the next period for all keys.


   ..
       !! processed by numpydoc !!

.. py:function:: _sample_characteristic(states_df, options, level_dict, use_keys)

   
   Sample characteristic of individuals.

   The function is used to sample the values of one state space characteristic, say
   experience. The keys of ``level_dict`` are the possible starting values of
   experience. The values of the dictionary are :class:`pandas.Series` whose index
   contains covariate names and whose values are the parameters.

   ``states_df`` is used to generate all possible covariates with the existing
   information.

   For each level, the dot product of parameters and covariates determines the value
   ``z``. The softmax function converts the level-specific ``z``-values to
   probabilities. The probabilities are used to sample the characteristic.

   :Parameters:

       **states_df** : :obj:`pandas.DataFrame`
           Contains the state of each individual.

       **options** : :class:`python:dict`
           Options of the model.

       **level_dict** : :class:`python:dict`
           A dictionary where the keys are the values distributed according to the
           probability mass function. Each value is a :class:`pandas.Series` with
           covariate names as the index and parameters as values.

       **use_keys** : :ref:`bool <python:bltin-boolean-values>`
           Whether the keys of ``level_dict`` are used as the variable's values or
           replaced with numeric codes. For example, choices are assigned numeric codes.

   :Returns:

       **characteristic** : :obj:`numpy.ndarray`
           Array with shape (n_individuals,) containing sampled values.


   ..
       !! processed by numpydoc !!
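
The sampling scheme described for ``_sample_characteristic`` (dot product per level, softmax, then a draw) can be sketched as follows. The covariate names, parameter values, and the use of plain dictionaries instead of :class:`pandas.Series` are assumptions for illustration only.

```python
import math
import random


def softmax(z_values):
    """Convert arbitrary real numbers into probabilities that sum to one."""
    highest = max(z_values)
    # Subtracting the maximum avoids overflow and leaves the result unchanged.
    exps = [math.exp(z - highest) for z in z_values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical inputs: keys are possible starting values of experience,
# values map covariate names to parameters.
level_dict = {
    0: {"constant": 0.0, "age": 0.0},
    1: {"constant": 0.5, "age": 0.1},
    2: {"constant": -1.0, "age": 0.2},
}
covariates = {"constant": 1.0, "age": 5.0}

# One ``z`` per level: dot product of parameters and covariates.
z = [
    sum(coefs[name] * covariates[name] for name in coefs)
    for coefs in level_dict.values()
]
probabilities = softmax(z)

# Sample one level for this individual according to the probabilities.
rng = random.Random(42)
sampled_level = rng.choices(list(level_dict), weights=probabilities, k=1)[0]
```

A level whose parameters are all zero acts as the reference category: its ``z`` is zero, and the other levels' probabilities are determined relative to it.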

.. py:function:: _convert_codes_to_original_labels(df, optim_paras)

   
   Convert codes in choice-related and observed variables to labels.


   ..
       !! processed by numpydoc !!

.. py:function:: _process_simulation_output(data, optim_paras)

   
   Create simulated data.

   This function takes the simulated outcomes and additional information for
   each period and stacks them into one DataFrame.

   :Parameters:

       **data** : :class:`python:list`
           List of DataFrames for each simulated period with internal codes and labels.

       **optim_paras** : :class:`python:dict`
           ..

   :Returns:

       **df** : :obj:`pandas.DataFrame`
           DataFrame with simulated data.


   ..
       !! processed by numpydoc !!

.. py:function:: _random_choice(choices, probabilities=None, decimals=5)

   
   Return elements of choices for a two-dimensional array of probabilities.

   It is assumed that ``probabilities`` has shape (n_samples, n_choices).

   The function is taken from this `StackOverflow post
   <https://stackoverflow.com/questions/40474436>`_ as a workaround for
   :func:`numpy.random.choice`, which can only handle one-dimensional probabilities.


   .. rubric:: Examples

   Here is an example with non-zero probabilities.

   >>> import numpy as np
   >>> from respy.simulate import _random_choice
   >>> n_samples = 100_000
   >>> n_choices = 3
   >>> p = np.array([0.15, 0.35, 0.5])
   >>> ps = np.tile(p, (n_samples, 1))
   >>> choices = _random_choice(n_choices, ps)
   >>> np.round(np.bincount(choices), decimals=-3) / n_samples
   array([0.15, 0.35, 0.5 ])

   Here is an example where one choice has probability zero.

   >>> choices = np.arange(3)
   >>> p = np.array([0.4, 0, 0.6])
   >>> ps = np.tile(p, (n_samples, 1))
   >>> choices = _random_choice(3, ps)
   >>> np.round(np.bincount(choices), decimals=-3) / n_samples
   array([0.4, 0. , 0.6])


   ..
       !! processed by numpydoc !!
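
Drawing one choice per row of a probability matrix, as ``_random_choice`` does, can be illustrated with the inverse-CDF method. The following is a pure-Python sketch with a hypothetical helper name; it is not the vectorized NumPy code used in respy and does not reproduce its ``decimals`` argument.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def random_choice_rows(probabilities, seed=0):
    """Draw one choice index per row of ``probabilities`` (hypothetical helper)."""
    rng = random.Random(seed)
    choices = []
    for row in probabilities:
        # Cumulative probabilities partition [0, 1) into one interval per choice.
        cumulative = list(accumulate(row))
        # A uniform draw lands in exactly one interval; zero-probability choices
        # have empty intervals and are never selected.
        u = rng.random() * cumulative[-1]
        choices.append(bisect_right(cumulative, u))
    return choices


rows = [[0.4, 0.0, 0.6]] * 10_000
draws = random_choice_rows(rows)
shares = [draws.count(choice) / len(draws) for choice in range(3)]
```

Each row gets its own uniform draw, so rows may carry different probabilities; the repeated row above is only used to check the empirical shares.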

.. py:function:: _harmonize_simulation_arguments(method, df, n_simulation_periods, options)

   
   Harmonize the arguments of the simulation.

   This function handles the interaction of the four inputs and aligns the number of
   simulated individuals and the number of simulated periods.


   ..
       !! processed by numpydoc !!

.. py:function:: _process_input_df_for_simulation(df, method, options, optim_paras)

   
   Process a :class:`pandas.DataFrame` provided by the user for the simulation.


   ..
       !! processed by numpydoc !!