respy.pre_processing.process_covariates

This module comprises all functions which process the definition of covariates.

Module Contents

Functions

remove_irrelevant_covariates(options, params)

Identify the relevant covariates.

separate_covariates_into_core_dense_mixed(options, ...)

Separate covariates into distinct groups.

identify_necessary_covariates(dependents, definitions)

Identify covariates necessary to compute dependents.

respy.pre_processing.process_covariates.remove_irrelevant_covariates(options, params)

Identify the relevant covariates.

We try to make every model as sparse as possible, which means discarding irrelevant covariates. The immediate benefit is reduced memory consumption and lower start-up costs.

An advantage further downstream is that the number of lagged choices is inferred from covariates. Eliminating irrelevant covariates might reduce the number of implemented lags.

The function catches all relevant “high-level” covariates by looking at the “name” index in params. “Low-level” covariates, which are relevant but not included in the index, are found recursively by checking whether they are used in the formulas of relevant covariates.
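The recursive lookup described above can be illustrated with a minimal sketch. This is not respy's implementation: the helper `find_relevant_covariates`, the example formulas, and the regex-based matching of names inside formulas are all assumptions made for illustration.

```python
import re


def find_relevant_covariates(names, formulas):
    """Return the covariates in `formulas` needed to compute `names`.

    Hypothetical helper: `names` stands in for the "name" index of params,
    `formulas` maps covariate names to formula strings.
    """
    relevant = set()
    # Start with the "high-level" covariates that appear directly in `names`.
    stack = [name for name in names if name in formulas]
    while stack:
        current = stack.pop()
        if current in relevant:
            continue
        relevant.add(current)
        # Assumption: identifiers in a formula can be found with a simple regex.
        tokens = set(re.findall(r"[A-Za-z_]\w*", formulas[current]))
        # "Low-level" covariates are those referenced by a relevant formula.
        stack.extend(t for t in tokens if t in formulas and t not in relevant)
    # Keep the original ordering of the definitions.
    return {name: formulas[name] for name in formulas if name in relevant}


# Illustrative data, not taken from a real model specification.
formulas = {
    "hs_graduate": "exp_edu >= 12",
    "hs_bonus": "hs_graduate * exp_a",
    "is_young": "period < 5",
}
param_names = ["constant", "exp_a", "hs_bonus"]

relevant = find_relevant_covariates(param_names, formulas)
print(sorted(relevant))
```

In this example `hs_bonus` is relevant because it appears among the parameter names, `hs_graduate` is pulled in because `hs_bonus` depends on it, and `is_young` is discarded.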

respy.pre_processing.process_covariates.separate_covariates_into_core_dense_mixed(options, optim_paras)

Separate covariates into distinct groups.

Covariates are separated into three groups:

  1. Covariates which use only information from the core state space.

  2. Covariates which use only information from the dense state space.

  3. Covariates which use information from the core and the dense state space.

Parameters:
options : dict

Contains, among other information, covariates and their formulas.

optim_paras : dict

Contains information to separate the core and dense state space.

Returns:
options : dict

Contains three new covariate categories.
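As a rough illustration of the three groups, the sketch below classifies covariates by the state variables their formulas reference. It is not the library's implementation: the helper `split_covariates`, the explicit sets `core_vars` and `dense_vars`, and the example formulas are hypothetical, and only direct references in a formula are considered.

```python
import re


def split_covariates(formulas, core_vars, dense_vars):
    """Group covariates by the kind of state variables their formulas use.

    Hypothetical helper: `core_vars` and `dense_vars` are sets of variable
    names assumed to belong to the core and dense state space.
    """
    groups = {"core": {}, "dense": {}, "mixed": {}}
    for name, formula in formulas.items():
        # Assumption: identifiers in a formula can be found with a simple regex.
        tokens = set(re.findall(r"[A-Za-z_]\w*", formula))
        uses_core = bool(tokens & core_vars)
        uses_dense = bool(tokens & dense_vars)
        if uses_core and uses_dense:
            group = "mixed"
        elif uses_dense:
            group = "dense"
        else:
            # Assumption of this sketch: formulas without dense information
            # are treated as core covariates.
            group = "core"
        groups[group][name] = formula
    return groups


# Illustrative data, not taken from a real model specification.
formulas = {
    "exp_a_square": "exp_a ** 2",
    "is_type_1": "type == 1",
    "young_type_1": "(period < 5) & (type == 1)",
}
groups = split_covariates(
    formulas, core_vars={"period", "exp_a"}, dense_vars={"type"}
)
print({group: list(members) for group, members in groups.items()})
```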

respy.pre_processing.process_covariates.identify_necessary_covariates(dependents, definitions)

Identify covariates necessary to compute dependents.

This function can be used if only a specific subset of covariates is necessary rather than all of them.