Inference

Inference in chi relies heavily on the inference package pints.

The OptimisationController and SamplingController allow you to easily explore different optimisation or sampling settings, e.g. using different methods, fixing some parameters, or applying different transformations to the search space.

Classes

Detailed API

class chi.OptimisationController(log_posterior, seed=None)

Sets up an optimisation routine that attempts to find the parameter values that maximise a pints.LogPosterior. If multiple log-posteriors are provided, the posteriors are assumed to be structurally identical and only differ due to different data sources.

By default, the optimisation is run 5 times from different starting points, which are randomly sampled from the specified pints.LogPrior. The runs are executed in parallel by default using pints.ParallelEvaluator.
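The multi-start idea described above can be illustrated with a minimal, self-contained sketch. It does not use chi or pints: the toy log-posterior, the uniform prior on [-10, 10], the plain gradient ascent and the function name multi_start_map are all assumptions made for illustration, not the optimiser that chi actually runs.

```python
import random


def log_posterior(x):
    """Toy log-posterior: a Gaussian centred at 2 (up to a constant)."""
    return -0.5 * (x - 2.0) ** 2


def gradient(x):
    """Derivative of the toy log-posterior with respect to x."""
    return -(x - 2.0)


def multi_start_map(n_runs=5, n_max_iterations=200, step=0.1, seed=1):
    """Run a simple gradient ascent from several prior-sampled starts."""
    rng = random.Random(seed)
    results = []
    for run in range(1, n_runs + 1):
        # Starting point drawn from a hypothetical uniform prior on [-10, 10]
        x = rng.uniform(-10.0, 10.0)
        for _ in range(n_max_iterations):
            x += step * gradient(x)
        results.append({"Run": run, "Estimate": x, "Score": log_posterior(x)})
    return results


results = multi_start_map()
for row in results:
    print(row)
```

If all runs end at nearly the same estimate, as they do for this unimodal toy problem, that is some evidence that the optimum found is global; widely differing estimates across runs would suggest local optima or a poorly identified parameter.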

Extends InferenceController.

Parameters:
run(n_max_iterations=10000, show_run_progress_bar=False, log_to_screen=False)

Runs the optimisation and returns the maximum a posteriori probability parameter estimates in the form of a pandas.DataFrame with the columns ‘ID’, ‘Parameter’, ‘Estimate’, ‘Score’ and ‘Run’.

The maximal number of iterations of the optimisation routine can be limited by setting n_max_iterations to a finite, non-negative integer.

Parameters:
  • n_max_iterations – The maximal number of optimisation iterations to find the MAP estimates for each log-posterior. By default the maximal number of iterations is set to 10000.

  • show_run_progress_bar – A boolean flag which indicates whether a progress bar for looping through the optimisation runs is displayed.

  • log_to_screen – A boolean flag which indicates whether the optimiser logging output is displayed.
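To show how a result in the documented long format might be post-processed, the sketch below mimics the five columns with plain dictionaries instead of a pandas.DataFrame. The parameter names k and sigma, the ID and all numbers are made up for illustration, and the sketch assumes that a higher Score is better (i.e. that the score is the log-posterior value).

```python
# Made-up rows in the documented long format (values are illustrative only)
rows = [
    {"ID": "1", "Parameter": "k", "Estimate": 0.48, "Score": -12.3, "Run": 1},
    {"ID": "1", "Parameter": "sigma", "Estimate": 0.11, "Score": -12.3, "Run": 1},
    {"ID": "1", "Parameter": "k", "Estimate": 0.51, "Score": -11.7, "Run": 2},
    {"ID": "1", "Parameter": "sigma", "Estimate": 0.10, "Score": -11.7, "Run": 2},
]

# Identify the run with the highest score (assumed to be the best run)
best_run = max(rows, key=lambda row: row["Score"])["Run"]

# Collect the parameter estimates of that run
best_estimates = {
    row["Parameter"]: row["Estimate"] for row in rows if row["Run"] == best_run
}

print(best_run, best_estimates)
```

With an actual pandas.DataFrame the same selection would typically be expressed as a filter on the ‘Run’ column.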

set_n_runs(n_runs)

Sets the number of times the inference routine is run.

Each run starts from a random sample of the log-prior.

set_optimiser(optimiser)

Sets the method that is used to find the maximum a posteriori probability estimates.

set_parallel_evaluation(run_in_parallel)

Enables or disables parallel evaluation using either a pints.ParallelEvaluator or a pints.SequentialEvaluator.

If run_in_parallel=True, the method will run using a number of worker processes equal to the detected CPU core count. The number of workers can be set explicitly by setting run_in_parallel to an integer greater than 0. Parallelisation can be disabled by setting run_in_parallel to 0 or False.
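The rules for run_in_parallel can be summarised in a small sketch. The helper resolve_n_workers is hypothetical and is not part of chi or pints; it only mirrors the behaviour described above, returning 0 to indicate sequential evaluation.

```python
import os


def resolve_n_workers(run_in_parallel):
    """Return the number of worker processes; 0 means sequential evaluation."""
    if run_in_parallel is True:
        # Use all detected cores (fall back to 1 if the count is unknown)
        return os.cpu_count() or 1
    if run_in_parallel is False or run_in_parallel == 0:
        # Parallelisation disabled
        return 0
    if isinstance(run_in_parallel, int) and run_in_parallel > 0:
        # Explicit number of workers
        return run_in_parallel
    raise ValueError("run_in_parallel must be a bool or a non-negative integer.")


for setting in (True, False, 0, 4):
    print(setting, "->", resolve_n_workers(setting))
```

Note that True is checked by identity before the integer branch, because bool is a subclass of int in Python and True == 1 would otherwise be read as a request for a single worker.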

set_transform(transform)

Sets the transformation that transforms the parameter space into the search space.

Transformations of the search space can significantly improve the performance of the inference routine.

transform has to be an instance of pints.Transformation and must have the same dimension as the parameter space.
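As an illustration of what such a transformation does, the sketch below implements an element-wise log transform in plain Python. The class LogTransform and its method names are hypothetical stand-ins, not the pints.Transformation interface; pints ships its own transformation classes (for example pints.LogTransformation), which are what set_transform expects.

```python
import math


class LogTransform:
    """Hypothetical element-wise log transform for positive parameters."""

    def __init__(self, n_parameters):
        self.n_parameters = n_parameters

    def _check(self, values):
        # The transform must match the dimension of the parameter space
        if len(values) != self.n_parameters:
            raise ValueError("Dimension mismatch with the parameter space.")

    def to_search(self, parameters):
        """Map positive model parameters to the unconstrained search space."""
        self._check(parameters)
        return [math.log(p) for p in parameters]

    def to_model(self, search_values):
        """Map search-space values back to model parameters."""
        self._check(search_values)
        return [math.exp(q) for q in search_values]


transform = LogTransform(n_parameters=3)
parameters = [0.001, 10.0, 500.0]
search_values = transform.to_search(parameters)
recovered = transform.to_model(search_values)
print(search_values)
print(recovered)
```

Parameters that span several orders of magnitude end up on a comparable scale in the search space, which is one reason a transformation can help an optimiser or sampler.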

class chi.SamplingController(log_posterior, seed=None)

Sets up a sampling routine that attempts to find the posterior distribution of parameters defined by a pints.LogPosterior. If multiple log-posteriors are provided, the posteriors are assumed to be structurally identical and only differ due to different data sources.

By default, the sampling is run 5 times from different starting points, which are randomly sampled from the specified pints.LogPrior. The runs are executed in parallel by default using pints.ParallelEvaluator.

Extends InferenceController.

Parameters:
run(n_iterations=10000, hyperparameters=None, log_to_screen=False)

Runs the sampling routine and returns the sampled parameter values in the form of an xarray.Dataset with an xarray.DataArray instance for each parameter.

If multiple posteriors are inferred, a list of xarray.Dataset instances is returned.

The number of iterations of the sampling routine is set with n_iterations, which must be a finite, non-negative integer. By default, the routine runs for 10000 iterations.

Parameters:
  • n_iterations (int, optional) – A non-negative integer number which sets the number of iterations of the MCMC runs.

  • hyperparameters (list[float], optional) – A list of hyperparameters for the sampling method. If None the default hyperparameters are set.

  • log_to_screen (bool, optional) – A boolean flag which can be used to print the progress of the runs to the screen. The progress is printed every 500 iterations.
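The sketch below illustrates the general idea of running several MCMC chains from prior-sampled starting points and collecting the draws per parameter. It is a plain random-walk Metropolis sampler on a toy problem, not the sampler chi uses; the uniform prior, the proposal width and the nested-list layout indexed as [chain][draw] (standing in for an xarray.DataArray) are assumptions made for illustration.

```python
import math
import random


def log_posterior(x):
    """Toy log-posterior: Gaussian with mean 2 and unit variance."""
    return -0.5 * (x - 2.0) ** 2


def run_chains(n_runs=3, n_iterations=2000, proposal_sd=1.0, seed=7):
    """Run independent random-walk Metropolis chains."""
    rng = random.Random(seed)
    chains = []
    for _ in range(n_runs):
        # Starting point drawn from a hypothetical uniform prior on [-5, 5]
        x = rng.uniform(-5.0, 5.0)
        draws = []
        for _ in range(n_iterations):
            proposal = x + rng.gauss(0.0, proposal_sd)
            log_ratio = log_posterior(proposal) - log_posterior(x)
            # Accept with probability min(1, posterior ratio)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x = proposal
            draws.append(x)
        chains.append(draws)
    # One entry per parameter, indexed as [chain][draw]
    return {"x": chains}


samples = run_chains()

# Discard the first half of each chain as warm-up before summarising
kept = [draw for chain in samples["x"] for draw in chain[1000:]]
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)
```

Running more than one chain makes it possible to compare chains against each other, which is the usual basis for convergence diagnostics.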

set_n_runs(n_runs)

Sets the number of times the inference routine is run.

Each run starts from a random sample of the log-prior.

set_parallel_evaluation(run_in_parallel)

Enables or disables parallel evaluation using either a pints.ParallelEvaluator or a pints.SequentialEvaluator.

If run_in_parallel=True, the method will run using a number of worker processes equal to the detected CPU core count. The number of workers can be set explicitly by setting run_in_parallel to an integer greater than 0. Parallelisation can be disabled by setting run_in_parallel to 0 or False.

set_sampler(sampler)

Sets the method that is used to sample from the log-posterior.

set_transform(transform)

Sets the transformation that transforms the parameter space into the search space.

Transformations of the search space can significantly improve the performance of the inference routine.

transform has to be an instance of pints.Transformation and must have the same dimension as the parameter space.

Utility functions

Detailed API

chi.compute_pointwise_loglikelihood(log_likelihood, posterior_samples, individual=None, param_map=None, per_individual=True, return_inference_data=False, show_chain_progress_bar=False)

Computes the pointwise log-likelihood for each observation and each parameter sample from the posterior distribution.

For a HierarchicalLogLikelihood pointwise log-likelihoods are by default computed and aggregated per individual. If the pointwise log-likelihoods are supposed to be computed per observation, per_individual can be set to False. For more info see HierarchicalLogLikelihood.compute_pointwise_ll().

Parameters:
  • log_likelihood (LogLikelihood, HierarchicalLogLikelihood) – The log-likelihood of the model parameters.

  • posterior_samples (xarray.Dataset) – Samples from the posterior distribution of the model parameters.

  • individual (str, optional) – The individual for which the log-likelihoods are evaluated. If None the first individual is chosen.

  • param_map (dict, optional) – A dictionary which can be used to map log-likelihood parameter names to the parameter names in the xarray.Dataset. If None, it is assumed that the names are identical. For hierarchical models, top-level and bottom-level parameter names can be mapped; IDs are excluded.

  • per_individual (bool, optional) – A boolean flag that determines whether the scores are computed per individual or per observation.

  • return_inference_data (bool, optional) – A boolean flag which determines whether the log-likelihoods and the posterior are returned as arviz.InferenceData.

  • show_chain_progress_bar (bool, optional) – A boolean flag which determines whether the progress for each chain is visualised as a progress bar.
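What "pointwise" means here can be sketched with a toy Gaussian model: every posterior sample is scored against every observation, which gives one log-likelihood value per (sample, observation) pair. The function names, the data and the (mean, sd) parameterisation below are assumptions for illustration and do not reproduce chi's implementation or its return type.

```python
import math


def gaussian_log_pdf(y, mean, sd):
    """Log-density of a normal distribution evaluated at y."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - 0.5 * ((y - mean) / sd) ** 2


def pointwise_loglikelihood(observations, posterior_samples):
    """Return a nested list indexed as [sample][observation]."""
    return [
        [gaussian_log_pdf(y, mean, sd) for y in observations]
        for mean, sd in posterior_samples
    ]


# Made-up observations of one individual and two made-up posterior samples
observations = [1.8, 2.1, 2.4]
posterior_samples = [(2.0, 0.5), (2.2, 0.4)]

pointwise = pointwise_loglikelihood(observations, posterior_samples)

# Aggregating per individual: sum over that individual's observations
per_individual = [sum(row) for row in pointwise]
print(pointwise)
print(per_individual)
```

The per-observation values correspond to per_individual=False in spirit, while the summed values correspond to the per-individual aggregation described for hierarchical models.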

Base classes

Detailed API

class chi.InferenceController(log_posterior, seed=None)

A base class for inference controllers.

Parameters:
set_n_runs(n_runs)

Sets the number of times the inference routine is run.

Each run starts from a random sample of the log-prior.

set_parallel_evaluation(run_in_parallel)

Enables or disables parallel evaluation using either a pints.ParallelEvaluator or a pints.SequentialEvaluator.

If run_in_parallel=True, the method will run using a number of worker processes equal to the detected CPU core count. The number of workers can be set explicitly by setting run_in_parallel to an integer greater than 0. Parallelisation can be disabled by setting run_in_parallel to 0 or False.

set_transform(transform)

Sets the transformation that transforms the parameter space into the search space.

Transformations of the search space can significantly improve the performance of the inference routine.

transform has to be an instance of pints.Transformation and must have the same dimension as the parameter space.