Log-PDFs

Log-PDFs in chi extend the log-pdfs provided in pints for the purpose of PKPD modelling.

Classes

HierarchicalLogLikelihood
HierarchicalLogPosterior
LogLikelihood
LogPosterior
PopulationFilterLogPosterior

Detailed API

class chi.HierarchicalLogLikelihood(log_likelihoods, population_model, covariates=None)[source]

A hierarchical log-likelihood consists of structurally identical log-likelihoods whose parameters are governed by a population model.

A hierarchical log-likelihood is defined by a list of LogLikelihood instances and a PopulationModel:

\[\log p(\mathcal{D}, \Psi | \theta ) = \sum _{i} \log p(\mathcal{D}_i | \psi _{i}) + \sum _{i} \log p(\psi _{i}| \theta),\]

where the first term is the sum over the individual log-likelihoods and the second term is the log-likelihood of the population model parameters. \(\mathcal{D}=\{ \mathcal{D}_i\}\) is the data, where \(\mathcal{D}_i = \{(y_{ij}, t_{ij})\}\) denotes the measurements from log-likelihood \(i\). \(\Psi = \{ \psi_i\}\) denotes the parameters across the individual log-likelihoods.

Parameters:

log_likelihoods (list[LogLikelihood] of length n_ids) – A list of log-likelihoods which are defined on the same parameter space with dimension n_parameters.
population_model (PopulationModel) – A population model of dimension n_parameters.
covariates (np.ndarray of shape (n_ids, n_cov), optional) – A 2-dimensional array with the individuals’ covariates.
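
The two sums in the definition above can be made concrete with a small toy calculation. The following is a minimal sketch and not chi's implementation: it assumes a single parameter per individual, Gaussian measurement noise and a Gaussian population model, and the function names gaussian_log_pdf and hierarchical_log_likelihood are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log-density of a Gaussian distribution evaluated at x."""
    return (
        -0.5 * math.log(2 * math.pi * std ** 2)
        - (x - mean) ** 2 / (2 * std ** 2))


def hierarchical_log_likelihood(data, psis, theta, noise_std=1.0):
    """
    Toy hierarchical log-likelihood (assumed Gaussian everywhere).

    data: list of per-individual measurement lists, D_i
    psis: list of bottom-level parameters, psi_i (one per individual)
    theta: (population mean, population std), the top-level parameters
    """
    pop_mean, pop_std = theta
    # First term: sum_i log p(D_i | psi_i)
    individual_term = sum(
        gaussian_log_pdf(y, psi, noise_std)
        for measurements, psi in zip(data, psis)
        for y in measurements)
    # Second term: sum_i log p(psi_i | theta)
    population_term = sum(
        gaussian_log_pdf(psi, pop_mean, pop_std) for psi in psis)
    return individual_term + population_term


data = [[1.1, 0.9], [2.2, 1.8, 2.0]]
psis = [1.0, 2.0]
theta = (1.5, 0.5)
score = hierarchical_log_likelihood(data, psis, theta)
```

In this toy setting the score is a function of both the bottom-level parameters psis and the top-level parameters theta, which mirrors why n_parameters can count both levels.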

compute_pointwise_ll(parameters, per_individual=True)[source]

Returns the pointwise log-likelihood scores of the parameters for each observation.

Parameters:

parameters (list, numpy.ndarray) – A list of parameter values.
per_individual (bool, optional) – A boolean flag that determines whether the scores are computed per individual or per observation.

evaluateS1(parameters)[source]

Computes the log-likelihood of the parameters and its sensitivities.

Parameters: parameters (list, numpy.ndarray) – A list of parameter values.

get_id(unique=False)[source]

Returns the IDs (prefixes) of the model parameters.

By default the IDs of all parameters (bottom and top level) are returned in the order of the parameter names. If unique is set to True, each ID is returned only once.

get_parameter_names(exclude_bottom_level=False, include_ids=False)[source]

Returns the names of the model parameters.

Parameters:

exclude_bottom_level (bool, optional) – A boolean flag which determines whether the bottom-level parameter names are returned in addition to the top-level parameters.
include_ids (bool, optional) – A boolean flag which determines whether the IDs (prefixes) of the model parameters are included.

get_population_model()[source]: Returns the population model.

n_log_likelihoods()[source]: Returns the number of individual likelihoods.

n_observations()[source]: Returns the number of observed data points per individual.

n_parameters(exclude_bottom_level=False)[source]

Returns the number of parameters.

Parameters: exclude_bottom_level (bool, optional) – A boolean flag which determines whether the bottom-level parameters are counted in addition to the top-level parameters.

class chi.HierarchicalLogPosterior(log_likelihood, log_prior)[source]

A hierarchical log-posterior is defined by a hierarchical log-likelihood and a log-prior for the population (or top-level) parameters.

The log-posterior takes an instance of a HierarchicalLogLikelihood and an instance of a pints.LogPrior of the same dimensionality as the population (or top-level) parameters of the log-likelihood.

Formally the log-posterior is defined as

\[\log p(\Psi , \theta | \mathcal{D}) = \log p(\mathcal{D}, \Psi | \theta) + \log p(\theta ) + \text{constant},\]

where \(\Psi\) are the bottom-level parameters, \(\theta\) are the top-level parameters and \(\mathcal{D}\) is the data, see HierarchicalLogLikelihood.

Extends pints.LogPDF.

Parameters:

log_likelihood (HierarchicalLogLikelihood) – A log-likelihood for the individual and population parameters.
log_prior (pints.LogPrior) – A log-prior for the population (or top-level) parameters.

evaluateS1(parameters)[source]

Returns the log-posterior score and its sensitivities to the model parameters.

Parameters: parameters (List[float], numpy.ndarray) – An array-like object with parameter values.

get_id(unique=False)[source]

Returns the IDs of the log-posterior’s parameters. If the ID is None, the corresponding parameter is defined on the population level.

Parameters: unique (bool, optional) – A boolean flag which indicates whether each ID is only returned once, or whether the IDs of all parameters are returned.

get_log_likelihood()[source]: Returns the log-likelihood.

get_log_prior()[source]: Returns the log-prior.

get_parameter_names(exclude_bottom_level=False, include_ids=False)[source]

Returns the names of the parameters.

Parameters:

exclude_bottom_level (bool, optional) – A boolean flag which determines whether the bottom-level parameter names are returned in addition to the top-level parameters.
include_ids (bool, optional) – A boolean flag which determines whether the IDs (prefixes) of the model parameters are included.

get_population_model()[source]: Returns the population model.

n_ids()[source]: Returns the number of modelled individuals.

n_parameters(exclude_bottom_level=False)[source]

Returns the number of parameters.

Parameters: exclude_bottom_level (bool, optional) – A boolean flag which determines whether the bottom-level parameters are counted in addition to the top-level parameters.

sample_initial_parameters(n_samples=1, seed=None)[source]

Samples top-level parameters from the log-prior and bottom-level parameters from the population model using the top-level samples.

These parameter samples may be used to initialise inference algorithms.

Parameters:

n_samples (int, optional) – Number of samples.
seed (int, optional) – Seed for random number generator.

Return type:

np.ndarray of shape (n_samples, n_parameters)
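
The two-stage sampling described above (top-level parameters first, then bottom-level parameters conditional on them) can be sketched with the standard library alone. This is an illustration under stated assumptions and not chi's code: one parameter per individual, a Gaussian population model, hypothetical priors, and an assumed ordering with bottom-level values before top-level values.

```python
import random


def sample_initial_parameters(n_ids, n_samples=1, seed=None):
    """
    Toy version of hierarchical initial-parameter sampling.

    Assumes a Gaussian population model with top-level parameters
    (mean, std) and hypothetical priors for both.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Top-level parameters are drawn from the (assumed) prior
        pop_mean = rng.gauss(0.0, 1.0)
        pop_std = abs(rng.gauss(0.0, 1.0)) + 1e-6
        # Bottom-level parameters are drawn from the population model,
        # conditional on the sampled top-level parameters
        bottom = [rng.gauss(pop_mean, pop_std) for _ in range(n_ids)]
        # Assumed ordering: bottom-level first, then top-level
        samples.append(bottom + [pop_mean, pop_std])
    return samples


initial = sample_initial_parameters(n_ids=3, n_samples=4, seed=1)
```

Each row of initial has n_ids + 2 entries here, which corresponds to the (n_samples, n_parameters) shape stated above; fixing the seed makes the draws reproducible.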

class chi.LogLikelihood(mechanistic_model, error_model, observations, times, outputs=None)[source]

A log-likelihood quantifies how well a set of parameter values captures the measurements under the model's approximation of the data-generating process.

A log-likelihood is defined by an instance of a MechanisticModel, one ErrorModel for each mechanistic model output and measurements defined by observations and times:

\[\log p(\mathcal{D} | \psi) = \sum _{j} \log p(y_j | \psi, t_j),\]

where \(\mathcal{D} = \{(y_j , t_j)\}\) denotes the measurements.

Extends pints.LogPDF.

Parameters:

mechanistic_model (MechanisticModel) – A mechanistic model that models the simplified behaviour of the biomarkers.
error_model (ErrorModel, list[ErrorModel]) – One error model for each output of the mechanistic model. For multiple outputs the error models are expected to be ordered according to the outputs.
observations (list[float], list[list[float]]) – A list of one-dimensional array-like objects with measured values of the biomarkers. The list is expected to be ordered in the same way as the mechanistic model outputs.
times (list[float], list[list[float]]) – A list of one-dimensional array-like objects with the measurement times associated with the observations.
outputs (list[str], optional) – A list of output names, which sets the mechanistic model outputs. If None the currently set outputs of the mechanistic model are assumed.

Example

import chi

# Define mechanistic and error model
sbml_file = chi.ModelLibrary().tumour_growth_inhibition_model_koch()
mechanistic_model = chi.PharmacodynamicModel(sbml_file)
error_model = chi.ConstantAndMultiplicativeGaussianErrorModel()

# Define observations
observations = [1, 2, 3, 4]
times = [0, 0.5, 1, 2]

# Create log-likelihood
log_likelihood = chi.LogLikelihood(
    mechanistic_model,
    error_model,
    observations,
    times)

# Compute log-likelihood score
parameters = [1, 1, 1, 1, 1, 1, 1]
score = log_likelihood(parameters)  # -5.4395320556329265

compute_pointwise_ll(parameters)[source]

Returns the pointwise log-likelihood scores of the parameters for each observation.

Parameters: parameters (list, numpy.ndarray) – A list of parameter values.
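
To illustrate what pointwise scores are, the sketch below evaluates \(\log p(y_j | \psi, t_j)\) separately for each observation and then sums them to recover the total log-likelihood. It is a toy stand-in that does not call chi: the exponential growth model, the constant Gaussian error and the function names are assumptions made for the example.

```python
import math


def model(psi, t):
    """Toy mechanistic model: exponential growth y0 * exp(rate * t)."""
    y0, rate = psi
    return y0 * math.exp(rate * t)


def pointwise_log_likelihood(psi, sigma, observations, times):
    """Returns log p(y_j | psi, t_j) for each observation (Gaussian error)."""
    return [
        -0.5 * math.log(2 * math.pi * sigma ** 2)
        - (y - model(psi, t)) ** 2 / (2 * sigma ** 2)
        for y, t in zip(observations, times)]


observations = [1, 2, 3, 4]
times = [0, 0.5, 1, 2]
pointwise = pointwise_log_likelihood((1.0, 0.7), 0.5, observations, times)

# The total log-likelihood is the sum of the pointwise contributions
total = sum(pointwise)
```

Pointwise scores of this kind are typically what model comparison criteria based on individual observations consume, while the summed score is what an optimiser or sampler sees.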

evaluateS1(parameters)[source]

Computes the log-likelihood of the parameters and its sensitivities.

Parameters: parameters (list, numpy.ndarray) – A list of parameter values.
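
Sensitivities here are the partial derivatives of the score with respect to each parameter. The sketch below shows the idea for a toy Gaussian log-likelihood with a constant mean, returning the score together with its derivatives and checking them against central finite differences. The function evaluate_s1 is a hypothetical stand-in and not chi's implementation.

```python
import math


def evaluate_s1(parameters, observations):
    """
    Returns the Gaussian log-likelihood and its partial derivatives with
    respect to (mean, sigma) for a constant-mean toy model.
    """
    mean, sigma = parameters
    n = len(observations)
    residuals = [y - mean for y in observations]
    sum_sq = sum(r ** 2 for r in residuals)
    score = (
        -0.5 * n * math.log(2 * math.pi * sigma ** 2)
        - sum_sq / (2 * sigma ** 2))
    d_mean = sum(residuals) / sigma ** 2
    d_sigma = -n / sigma + sum_sq / sigma ** 3
    return score, [d_mean, d_sigma]


observations = [1, 2, 3, 4]
score, sensitivities = evaluate_s1([2.0, 1.5], observations)

# Central finite differences as an independent check of the sensitivities
eps = 1e-6
fd_mean = (evaluate_s1([2.0 + eps, 1.5], observations)[0]
           - evaluate_s1([2.0 - eps, 1.5], observations)[0]) / (2 * eps)
fd_sigma = (evaluate_s1([2.0, 1.5 + eps], observations)[0]
            - evaluate_s1([2.0, 1.5 - eps], observations)[0]) / (2 * eps)
```

Gradient-based optimisers and samplers rely on exactly this pair of outputs, so a finite-difference comparison like the one above is a cheap sanity check when a model is changed.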

fix_parameters(name_value_dict)[source]

Fixes the value of model parameters, and effectively removes them as a parameter from the model. Fixing the value of a parameter at None sets the parameter free again.

Parameters: name_value_dict (dict[str, float]) – A dictionary with model parameter names as keys, and parameter values as values.
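
The effect of fixing and releasing parameters can be illustrated with a small wrapper around an ordinary function. This is a conceptual sketch only: the class FixableFunction and its internals are hypothetical, and how chi handles unknown names or stores fixed values is not shown here.

```python
class FixableFunction:
    """Toy wrapper that fixes named parameters of a function."""

    def __init__(self, function, names):
        self._function = function
        self._names = list(names)
        self._fixed = {}

    def fix_parameters(self, name_value_dict):
        for name, value in name_value_dict.items():
            if value is None:
                # None releases a previously fixed parameter
                self._fixed.pop(name, None)
            else:
                self._fixed[name] = float(value)

    def get_parameter_names(self):
        # Fixed parameters no longer appear as free parameters
        return [n for n in self._names if n not in self._fixed]

    def n_parameters(self):
        return len(self.get_parameter_names())

    def __call__(self, parameters):
        # Merge the free values with the fixed ones in the original order
        free = iter(parameters)
        full = [
            self._fixed[n] if n in self._fixed else next(free)
            for n in self._names]
        return self._function(full)


f = FixableFunction(lambda p: p[0] + 10 * p[1] + 100 * p[2], ['a', 'b', 'c'])
f.fix_parameters({'b': 2})
n_free_after_fixing = f.n_parameters()
value_fixed = f([1, 3])  # 'b' is held at 2, so only 'a' and 'c' are passed
f.fix_parameters({'b': None})
value_free = f([1, 2, 3])
```

The point of the sketch is that fixing a parameter shortens the argument vector and the list of parameter names, and that passing None restores the original dimensionality.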

get_id(*args, **kwargs)[source]

Returns the ID of the log-likelihood. If not set, None is returned.

The ID is used as meta data to identify the origin of the data.

get_parameter_names()[source]: Returns the parameter names of the predictive model.

get_submodels()[source]: Returns the submodels of the log-likelihood in form of a dictionary.

Warning

The returned submodels are only references to the models used by the log-likelihood. Changing e.g. the dosing regimen of the MechanisticModel will therefore also change the dosing regimen of the log-likelihood!

n_observations()[source]: Returns the number of observed data points for each output.

n_parameters()[source]: Returns the number of parameters.

set_id(label)[source]

Sets the ID of the log-likelihood.

The ID is used as meta data to identify the origin of the data.

Parameters: label (str) – Label which is used as ID for the log-likelihood.

class chi.LogPosterior(log_likelihood, log_prior)[source]

A log-posterior constructed from a log-likelihood and a log-prior.

The log-posterior takes an instance of a LogLikelihood and an instance of a pints.LogPrior of the same dimensionality as the parameters of the log-likelihood.

Formally the log-posterior is given by the sum of the log-likelihood, \(\log p(\mathcal{D} | \psi)\), and the log-prior, \(\log p(\psi )\), up to an additive constant

\[\log p(\psi | \mathcal{D}) = \log p(\mathcal{D} | \psi) + \log p(\psi ) + \mathrm{constant},\]

where \(\psi\) are the parameters of the log-likelihood and \(\mathcal{D}\) are the observed data. The additive constant is the normalisation of the log-posterior and is in general unknown.

Extends pints.LogPDF.

Parameters:

log_likelihood (LogLikelihood) – A log-likelihood for the model parameters.
log_prior (pints.LogPrior) – A log-prior for the model parameters. The log-prior has to have the same dimensionality as the log-likelihood.

Example

import chi
import pints

# Define mechanistic and error model
sbml_file = chi.ModelLibrary().tumour_growth_inhibition_model_koch()
mechanistic_model = chi.PharmacodynamicModel(sbml_file)
error_model = chi.ConstantAndMultiplicativeGaussianErrorModel()

# Define observations
observations = [1, 2, 3, 4]
times = [0, 0.5, 1, 2]

# Create log-likelihood
log_likelihood = chi.LogLikelihood(
    mechanistic_model,
    error_model,
    observations,
    times)

# Define log-prior
log_prior = pints.ComposedLogPrior(
    pints.LogNormalLogPrior(1, 1),
    pints.LogNormalLogPrior(1, 1),
    pints.LogNormalLogPrior(1, 1),
    pints.LogNormalLogPrior(1, 1),
    pints.LogNormalLogPrior(1, 1),
    pints.HalfCauchyLogPrior(0, 1),
    pints.HalfCauchyLogPrior(0, 1))

# Create log-posterior
log_posterior = chi.LogPosterior(log_likelihood, log_prior)

# Compute log-posterior score
parameters = [1, 1, 1, 1, 1, 1, 1]
score = log_posterior(parameters)  # -14.823684493355092

evaluateS1(parameters)[source]

Returns the log-posterior score and its sensitivities to the model parameters.

Parameters: parameters (List[float], numpy.ndarray) – An array-like object with parameter values.

get_id(*args, **kwargs)[source]: Returns the id of the log-posterior. If no id is set, None is returned.

get_log_likelihood()[source]: Returns the log-likelihood.

get_log_prior()[source]: Returns the log-prior.

get_parameter_names(*args, **kwargs)[source]: Returns the names of the model parameters. By default the parameters are enumerated and assigned with the names ‘Param #’.

n_parameters(*args, **kwargs)[source]: Returns the number of parameters of the posterior.

sample_initial_parameters(n_samples=1, seed=None)[source]

Samples parameters from the log-prior which may be used to initialise inference algorithms.

Parameters:

n_samples (int, optional) – Number of samples.
seed (int, optional) – Seed for random number generator.

Return type:

np.ndarray of shape (n_samples, n_parameters)

class chi.PopulationFilterLogPosterior(population_filter, times, mechanistic_model, population_model, log_prior, sigma=None, error_on_log_scale=False, n_samples=100, covariates=None)[source]

A population filter log-posterior approximates a hierarchical log-posterior.

Population filter log-posteriors can be used to approximate hierarchical log-posteriors when exact hierarchical inference becomes numerically intractable. The canonical application of population filter inference is the inference from time series snapshot data.

The population filter log-posterior is defined by a population filter, a mechanistic model, an error model, a population model and the data:

\[\begin{split}\log p(\theta , \tilde{Y}, \Psi| \mathcal{D}) =& \sum _{ij} \log p (y_{ij} | \tilde{Y}_j) + \sum _{sj} \log p (\tilde{y}_{sj} | \psi_s, t_j) + \sum _{s} \log p (\psi_{s} | \theta)& \\ &+ \log p (\theta) + \mathrm{constant},&\end{split}\]

where the data \(\mathcal{D} = \{ (Y_j , t_j)\}\) are measurements over time with \(Y_j = \{ y_{ij} \}\) denoting the measurements at time point \(t_j\) across individuals. Here, we use \(i\) to index individuals from the dataset. The first term of the log-posterior is the population filter contribution which estimates the log-likelihood that the virtual measurements, \(\tilde{Y}_j = \{ \tilde{y}_{sj}\}\), come from the same distribution as the measurements, \(Y_j\). The quality of the log-likelihood estimate is subject to the appropriateness of the population filter [ref]. We use \(s\) to index virtual individuals. The second term of the log-posterior is the log-likelihood of the simulated parameters \(\Psi = \{ \psi _s\}\) with respect to the virtual measurements. Each simulated parameter corresponds to a virtual individual. The log-likelihood of a set of simulated parameters is defined by the mechanistic model and the error model, as well as the simulated measurements for that individual. The third term is the log-likelihood that the population parameters \(\theta = \{ \theta _k \}\) govern the distribution of the individual parameters. The final contribution is from the log-prior of the population parameters.

Note that the choice of population filter makes assumptions about the distributional shape of the measurements which can influence the inference results.

Parameters:

population_filter (chi.PopulationFilter) – The population filter which connects the observations to the simulated measurements.
times (np.ndarray of shape (n_times,)) – Measurement time points of the data.
mechanistic_model (chi.MechanisticModel) – A mechanistic model for the dynamics. The outputs of the mechanistic model are expected to be in the same order as the observables in observations.
population_model (chi.PopulationModel) – A population model with the same dimensionality as the number of mechanistic model parameters. The dimensions are expected to be in the same order as the model parameters.
log_prior (pints.LogPrior) – Log-prior for the population level parameters. The prior dimensions are expected to be in the order of the population models.
sigma (List[float] of length n_observables, optional) – Standard deviation of the Gaussian error model. If None the parameter is inferred from the data.
error_on_log_scale (bool, optional) – A boolean flag indicating whether the error model models the residuals of the mechanistic model directly or on a log scale.
n_samples (int, optional) – Number of simulated individuals per evaluation.
covariates (np.ndarray of shape (n_cov,) or (n_samples, n_cov), optional) – Covariates of the simulated individuals.
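
The first term of the log-posterior, the population filter contribution, can be illustrated with a Gaussian filter: at each time point the virtual measurements are summarised by their mean and standard deviation, and the real measurements are scored under the resulting Gaussian. The sketch below is a toy example under that assumed filter choice; the function name and the use of the sample standard deviation are assumptions and do not describe chi's internals.

```python
import math
import statistics


def gaussian_filter_log_likelihood(observations, simulated):
    """
    Toy Gaussian population filter.

    observations: list over time points of lists y_ij across individuals i
    simulated: list over time points of lists of virtual measurements
    """
    score = 0.0
    for y_j, y_tilde_j in zip(observations, simulated):
        # Summarise the virtual measurements at time t_j by mean and std
        mu = statistics.mean(y_tilde_j)
        sigma = statistics.stdev(y_tilde_j)
        # Score the real measurements under the estimated Gaussian
        for y in y_j:
            score += (
                -0.5 * math.log(2 * math.pi * sigma ** 2)
                - (y - mu) ** 2 / (2 * sigma ** 2))
    return score


observations = [[1.0, 1.2, 0.9], [2.1, 1.9, 2.3]]
good_fit = [[0.8, 1.0, 1.2, 1.1], [1.8, 2.0, 2.2, 2.4]]
poor_fit = [[4.8, 5.0, 5.2, 5.1], [6.8, 7.0, 7.2, 7.4]]
score_good = gaussian_filter_log_likelihood(observations, good_fit)
score_poor = gaussian_filter_log_likelihood(observations, poor_fit)
```

Virtual measurements that overlap with the data receive a higher score than those that do not, which is what drives the inference. A Gaussian summary discards any skew or multimodality in the measurements, which is one way the choice of filter can influence the results.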

evaluateS1(parameters)[source]

Returns the log-posterior score and its sensitivities to the model parameters.

Parameters: parameters (List[float], numpy.ndarray of length n_parameters) – An array-like object with parameter values.

get_id(unique=False)[source]

Returns the IDs of the log-posterior’s parameters. If the ID is None, the corresponding parameter is defined on the population level.

Parameters: unique (bool, optional) – A boolean flag which indicates whether each ID is only returned once, or whether the IDs of all parameters are returned.

get_log_likelihood()[source]

Returns the log-likelihood.

For the population filter log-posterior the population filter is returned.

get_log_prior()[source]: Returns the log-prior.

get_parameter_names(exclude_bottom_level=False, include_ids=False)[source]

Returns the names of the parameters.

Parameters:

exclude_bottom_level (bool, optional) – A boolean flag which determines whether the bottom-level parameter names are returned in addition to the top-level parameters.
include_ids (bool, optional) – A boolean flag which determines whether the IDs (prefixes) of the model parameters are included.

get_population_model()[source]: Returns the population model.

n_ids(): Returns the number of modelled individuals.

n_parameters(exclude_bottom_level=False)[source]

Returns the number of parameters.

Parameters: exclude_bottom_level (bool, optional) – A boolean flag which determines whether the bottom-level parameters are counted in addition to the top-level parameters.

n_samples()[source]: Returns the number of simulated individuals per posterior evaluation.

sample_initial_parameters(n_samples=1, seed=None)[source]

Samples top-level parameters from the log-prior and bottom-level parameters from the population model using the top-level samples. The noise realisations are sampled from a standard Gaussian distribution.

These parameter samples may be used to initialise inference algorithms.

Parameters:

n_samples (int, optional) – Number of samples.
seed (int, optional) – Seed for random number generator.

Return type:

np.ndarray of shape (n_samples, n_parameters)