1. Quick overview

The quick overview showcases some of Chi’s main features to help you decide whether Chi is the right software for you. The running example for this overview will be a 1-compartment pharmacokinetic model (any other deterministic time series model would have been an equally good choice, but the 1-compartment pharmacokinetic model happens to be predefined in Chi’s model library, chi.library.ModelLibrary). The model is defined by an ordinary differential equation for the drug amount and an algebraic equation for the drug concentration

\[\frac{\mathrm{d}a}{\mathrm{d}t} = -k_e a, \quad c = \frac{a}{v}, \quad a(t=0) = a_0,\]

where \(a\) is the drug amount in the compartment, \(t\) is the time, \(k_e\) is the elimination rate of the drug, \(c\) is the drug concentration, \(v\) is the volume of the compartment and \(a_0\) is the initial drug amount (note that during this quick overview we will omit units for simplicity).
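For reference, this simple model can be solved analytically: in the absence of dosing, the drug concentration decays exponentially,

\[c(t) = \frac{a_0}{v}\, \mathrm{e}^{-k_e t},\]

which provides a convenient sanity check for the simulation results below.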

1.1. Simulating a mechanistic model

With the 1-compartment pharmacokinetic model from Chi’s model library, simulating the model simply involves specifying the model parameters \(\psi = (a_0, v, k_e)\) and the time points of interest.

import chi.library

# Define 1-compartment pharmacokinetic model
mechanistic_model = chi.library.ModelLibrary().one_compartment_pk_model()

# Define parameters
parameters = [
    10,  # Initial drug amount
    2,   # Volume of the compartment
    1,   # Elimination rate
]

# Define evaluation times
times = [0, 0.2, 0.5, 0.6, 1]

# Simulate the model
result = mechanistic_model.simulate(parameters, times)

The simulation returns a numpy.ndarray with the simulated drug concentrations at the specified times \([c(t=0), c(t=0.2), c(t=0.5), c(t=0.6), c(t=1)]\).

>>> result
array([[5.        , 4.09359579, 3.03157658, 2.74324885, 1.83903834]])

The simulation results are of shape (n_outputs, n_times), which in this case is (1, 5) because we simulated one output, the drug concentration, at 5 different time points. For details on how to implement your own chi.MechanisticModel and many other details concerning mechanistic models in Chi, we refer to section The mechanistic model.
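The shape can be checked directly, and the first entry of the output agrees with the analytic solution above, \(c(0) = a_0 / v = 5\):

>>> result.shape
(1, 5)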

1.2. Visualisation of the simulation

The simulation results can be visualised with any of the standard Python visualisation libraries (e.g. matplotlib or seaborn). We will use plotly in this overview. Below is an example of how to visualise the simulation results, where we also make the time course of the drug concentration slightly more interesting by administering bolus doses to the compartment at intervals of 0.5.

import numpy as np
import plotly.graph_objects as go

# Set administration and dosing regimen
mechanistic_model.set_administration(
    compartment='central', amount_var='drug_amount')
mechanistic_model.set_dosing_regimen(dose=1, period=0.5)

# Simulate the model
times = np.linspace(start=0, stop=10, num=1000)
result = mechanistic_model.simulate(parameters, times)

# Visualise results
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=times,
    y=result[0],
    mode='lines'
))
fig.update_layout(
    xaxis_title='Time',
    yaxis_title='Drug concentration',
    template='plotly_white'
)
fig.show()

1.3. Simulation of measurements

In Chi, measurements are modelled using a chi.ErrorModel on top of the chi.MechanisticModel. Formally, this takes the form

\[y(\psi, t) = \bar{y}(\psi, t) + \epsilon (\bar{y}, \sigma, t),\]

where \(y\) models the measurements, \(\bar{y}\) denotes the mechanistic model output, \(\epsilon\) models the residual error of the mechanistic model output with respect to the measurements, and \(\sigma\) denotes the error model parameters.

The simplest error model is a Gaussian error model, \(\epsilon \sim \mathcal{N}(\cdot | 0, \sigma ^2)\). So let us use a Gaussian error model to simulate measurements of the drug concentration using the simulation results from the previous section (we will downsample the measurement times relative to the evaluation times of the mechanistic model for aesthetic reasons).

import chi

# Define error model and error model parameters
error_model = chi.GaussianErrorModel()
parameters = [0.2]  # Sigma

# Down sample times and mechanistic model evaluations
measurement_times = times[::25]
corresponding_evaluations = result[0, ::25]

# Simulate measurements
measurements = error_model.sample(
    parameters, model_output=corresponding_evaluations, seed=1)[:, 0]

# Visualise results
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=times,
    y=result[0],
    mode='lines',
    line_color='black',
    name='Mechanistic model'
))
fig.add_trace(go.Scatter(
    x=measurement_times,
    y=measurements,
    mode='markers',
    name='Measurements'
))
fig.update_layout(
    xaxis_title='Time',
    yaxis_title='Drug concentration',
    template='plotly_white'
)
fig.show()

For details on how to implement a chi.ErrorModel and many other details concerning error models in Chi, we refer to the API reference Error Models.

1.4. Inference of model parameters

While the simulation of mechanistic model outputs and measurements is an interesting feature of Chi in its own right, the inference of model parameters from real-world measurements is arguably even more interesting. In this overview, we will use the simulated measurements from above to infer model parameters, but in practice the simulated measurements may be straightforwardly replaced by real-world measurements.

Inference in Chi leverages the fact that the mechanistic model and the error model define a distribution for the modelled measurements \(y\). In the case of a Gaussian error model, as in the previous example, the distribution of the measurements is (yet again) a Gaussian distribution whose mean is equal to the mechanistic model output

\[p(y | \psi , \sigma , t) = \mathcal{N}\left( y \, |\, \bar{y}(\psi, t), \sigma ^2 \right) .\]

This distribution can be used to define the log-likelihood of different parameter values for the measurements \(\mathcal{D}=\{ (y_1, t_1), (y_2, t_2), \ldots , (y_n, t_n)\}\)

\[\log p(\mathcal{D} | \psi , \sigma) = \sum _{j=1}^n \log p(y_j | \psi , \sigma , t_j).\]
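To make this sum concrete, it can be computed by hand for the Gaussian error model of the previous section. The sketch below is purely illustrative and not part of Chi’s API; it assumes scipy is installed and reuses the numpy import as well as the measurements and corresponding_evaluations arrays from above.

from scipy.stats import norm

# Hand-computed Gaussian log-likelihood (illustration only):
# sum over measurement times of log N(y_j | model output at t_j, sigma^2)
sigma = 0.2  # error model parameter used to simulate the measurements
log_likelihood_by_hand = np.sum(norm.logpdf(
    measurements, loc=corresponding_evaluations, scale=sigma))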

In Chi the log-likelihood of model parameters can be defined using chi.LogLikelihood. A chi.LogLikelihood defines the distribution of the measurements using a chi.MechanisticModel and a chi.ErrorModel, and couples the distribution to the measurements as defined above. The log-likelihood of different parameter values can now be evaluated using the chi.LogLikelihood.__call__() method

# Define log-likelihood
log_likelihood = chi.LogLikelihood(
    mechanistic_model,
    error_model,
    observations=measurements,
    times=measurement_times
)

# Evaluate log-likelihood for made-up parameters
madeup_parameters = [2, 7, 1.4, 3]
score_1 = log_likelihood(madeup_parameters)

# Evaluate log-likelihood for data-generating parameters
true_parameters = [10, 2, 1, 0.2]
score_2 = log_likelihood(true_parameters)
>>> score_1
-86.14214936785024
>>> score_2
10.324673731287309

We can see that the data-generating parameter values have a larger log-likelihood than the made-up parameter values (which should intuitively make sense). For details on how to define a chi.LogLikelihood and many other details concerning log-likelihoods in Chi, we refer to the API reference Log-PDFs.

1.4.1. Maximum likelihood estimation

A popular approach to infer model parameters that best describe the measurements is to find the parameter values that maximise the log-likelihood, a.k.a. maximum likelihood estimation

\[\hat{\psi}, \hat{\sigma} = \mathop{\text{arg}\,\text{max}}_{\psi, \sigma} \log p(\mathcal{D} | \psi , \sigma).\]

In Chi the maximum likelihood estimates of the model parameters can be found using any of the standard Python optimisation libraries such as scipy.optimize (a scipy-based sketch is shown at the end of this subsection). We will use Pints and its implementation of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) optimisation algorithm, pints.CMAES, to optimise the log-likelihood

import pints

# Run optimisation
initial_parameters = [9, 3, 5, 1]
parameters_mle, score = pints.optimise(
    log_likelihood, initial_parameters, method=pints.CMAES)
>>> parameters_mle
array([10.26564936, 2.01524534, 1.00148417, 0.18456719])
>>> score
10.832126448890321

The optimisation algorithm finds a set of parameter values that is different from, but close to, the data-generating parameters. Note, however, that the inferred parameters have a larger log-likelihood than the data-generating parameters, which may seem surprising at first: how can a different set of parameters be more likely to describe the measurements than the set of parameters that generated the measurements? A thorough discussion of the shortcomings of maximum likelihood estimation is beyond the scope of this overview, but let us plot the mechanistic model output for the inferred parameters to confirm that the inferred model is indeed marginally closer to the measurements.

mle_output = mechanistic_model.simulate(parameters_mle, times)
fig.add_trace(go.Scatter(
    x=times,
    y=mle_output[0],
    mode='lines',
    line_dash='dash',
    name='Inferred model'
))
fig.show()
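As mentioned at the start of this subsection, the optimisation is not tied to Pints. Purely as an illustration (and assuming scipy is installed), the same maximum likelihood estimation could be sketched with scipy.optimize by minimising the negative log-likelihood; the resulting estimates may differ slightly from the CMA-ES results above.

from scipy.optimize import minimize

# Minimise the negative log-likelihood (scipy minimises, so we flip the sign).
# Bounds (supported by recent scipy versions) keep the parameters positive
# during the search.
optimisation_result = minimize(
    lambda parameters: -log_likelihood(parameters),
    x0=[9, 3, 5, 1],
    method='Nelder-Mead',
    bounds=[(1e-3, 20)] * 4
)
parameters_mle_scipy = optimisation_result.x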

1.4.2. Bayesian inference

The reason why the maximum likelihood estimates can differ from the data-generating parameters is that a finite number of measurements does not uniquely define the data-generating parameters of a probabilistic model. The maximum likelihood estimates can therefore be far away from the data-generating parameters just because the realisation of the measurement noise happens to make other parameter values seem more likely than the data-generating values (in this example the maximum likelihood estimates are clearly quite good). In other words, a finite number of measurements leaves uncertainty about the model parameters, a.k.a. parametric uncertainty. While for real-world measurements the notion of data-generating parameters may seem alien, since models only approximate the real data-generating processes, we can generalise the notion of data-generating parameters to the effective set of parameter values that captures the most about the data-generating process within the limitations of the model approximation. Here, the maximum likelihood estimates can analogously differ substantially from the sought-after data-generating parameters.

In Chi the uncertainty of parameter estimates can be quantified using Bayesian inference. In Bayesian inference, Bayes’ rule is used to define a distribution of likely parameter values conditional on the measurements, a.k.a. the posterior parameter distribution

\[\log p(\psi, \sigma | \mathcal{D}) = \log p(\mathcal{D} | \psi , \sigma) + \log p(\psi , \sigma ) + \text{constant}.\]

Here, \(\log p(\psi, \sigma | \mathcal{D})\) is the log-posterior, \(\log p(\mathcal{D} | \psi , \sigma)\) is the log-likelihood as defined above and \(\log p(\psi , \sigma )\) is the log-prior distribution of the model parameters. The prior distribution of the model parameters is a modelling choice and captures prior knowledge about the model parameters.

In Chi the log-posterior can be defined using chi.LogPosterior, which is instantiated with a chi.LogLikelihood and a pints.LogPrior. For the sake of this overview, we will use uniform priors that constrain the parameters to values between 0 and 20. In practice, informative priors are likely better choices. The resulting log-posterior can now be evaluated similarly to the chi.LogLikelihood using chi.LogPosterior.__call__()

# Define log-posterior
log_prior = pints.ComposedLogPrior(
    pints.UniformLogPrior(0, 20),  # Initial drug amount
    pints.UniformLogPrior(0, 20),  # Compartment volume
    pints.UniformLogPrior(0, 20),  # Elimination rate
    pints.UniformLogPrior(0, 20)   # Sigma
)
log_posterior = chi.LogPosterior(log_likelihood, log_prior)

# Evaluate log-posterior
score_1 = log_posterior(madeup_parameters)
score_2 = log_posterior(true_parameters)
>>> score_1
-104.56283011180261
>>> score_2
-8.096007012665059

For details on how to define a chi.LogPosterior and many other details concerning log-posteriors in Chi, we refer to the API reference Log-PDFs.

While the chi.LogPosterior allows us to evaluate the log-posterior up to the constant term for different parameter values, it does not yet tell us what the posterior distribution of likely parameter values looks like. This distribution can be inferred from a chi.LogPosterior using e.g. Markov Chain Monte Carlo (MCMC) sampling algorithms. Below we will use Pints’ implementation of Haario and Bardenet’s Adaptive Covariance Matrix MCMC algorithm, pints.HaarioBardenetACMC, to infer the posterior distribution.

# Run inference
controller = chi.SamplingController(log_posterior)
controller.set_sampler(pints.HaarioBardenetACMC)
n_iterations = 5000
posterior_samples = controller.run(n_iterations)

The inferred posterior distributions can now be compared to the data-generating parameters.

from plotly.subplots import make_subplots

# Discard warmup iterations
warmup = 3000  # This number is not arbitrary but was carefully chosen
posterior_samples = posterior_samples.sel(draw=slice(warmup, n_iterations))

# Visualise posteriors
fig = make_subplots(rows=2, cols=2, shared_yaxes=True)
fig.add_trace(
    go.Histogram(
        name='Posterior samples',
        x=posterior_samples['central.drug_amount'].values.flatten(),
        histnorm='probability',
        showlegend=False
    ),
    row=1,
    col=1
)
fig.add_trace(
    go.Scatter(
        name='Data-generating parameters',
        x=[true_parameters[0]]*2,
        y=[0, 0.04],
        mode='lines',
        line_color='black',
        showlegend=False
    ),
    row=1,
    col=1
)

fig.add_trace(
    go.Histogram(
        name='Posterior samples',
        x=posterior_samples['central.size'].values.flatten(),
        histnorm='probability',
        showlegend=False
    ),
    row=1,
    col=2
)
fig.add_trace(
    go.Scatter(
        name='Data-generating parameters',
        x=[true_parameters[1]]*2,
        y=[0, 0.04],
        mode='lines',
        line_color='black',
        showlegend=False
    ),
    row=1,
    col=2
)

fig.add_trace(
    go.Histogram(
        name='Posterior samples',
        x=posterior_samples['global.elimination_rate'].values.flatten(),
        histnorm='probability',
        showlegend=False
    ),
    row=2,
    col=1
)
fig.add_trace(
    go.Scatter(
        name='Data-generating parameters',
        x=[true_parameters[2]]*2,
        y=[0, 0.04],
        mode='lines',
        line_color='black',
        showlegend=False
    ),
    row=2,
    col=1
)

fig.add_trace(
    go.Histogram(
        name='Posterior samples',
        x=posterior_samples['Sigma'].values.flatten(),
        histnorm='probability',
        showlegend=False
    ),
    row=2,
    col=2
)
fig.add_trace(
    go.Scatter(
        name='Data-generating parameters',
        x=[true_parameters[3]]*2,
        y=[0, 0.04],
        mode='lines',
        line_color='black',
        showlegend=False
    ),
    row=2,
    col=2
)

fig.update_layout(
    xaxis_title='Initial drug amount',
    xaxis2_title='Compartment volume',
    xaxis3_title='Elimination rate',
    xaxis4_title='Sigma',
    yaxis_title='Probability',
    yaxis3_title='Probability',
    template='plotly_white'
)
fig.show()

For details on how to infer posterior distributions in Chi and many other details on MCMC sampling, we refer to section Fitting mechanistic models to data.

1.5. Simulating a population model

Above we have seen how Chi may be used to infer model parameters from measurements. To this end, we assumed that measurements originate from a data-generating process that is approximated by a mechanistic model, an associated error model and a set of data-generating parameters. For measurements of many biological systems this is, however, too naïve an assumption, as measurements are often taken from different, non-identical entities, ranging from measurements across cells to measurements across patients. In these cases, it is well established that differences between individual entities may lead to additional variability in the measurements that cannot be attributed to measurement noise. To separate this inter-individual variability from measurement noise, we need to extend our modelling framework. A natural extension is to assume that differences across individuals can be modelled by different data-generating parameter values for each individual. Below we illustrate how inter-individual variability may manifest in measurements from 3 patients with different data-generating parameters.

import plotly.colors

# Set different dosing regimen (for illustration)
mechanistic_model.set_dosing_regimen(dose=5, start=3, period=1, duration=0.3)

# Define patient parameters
parameters_patient_1 = [
    10,  # Initial drug amount
    2,   # Volume of the compartment
    1,   # Elimination rate
]
parameters_patient_2 = [
    10,  # Initial drug amount
    2,   # Volume of the compartment
    0.5, # Elimination rate
]
parameters_patient_3 = [
    10,  # Initial drug amount
    2,   # Volume of the compartment
    0.8, # Elimination rate
]
error_model_params = [0.4]

# Measure drug concentrations of patients
n_times = 1000
times = np.linspace(start=0, stop=10, num=n_times)
measurement_times = times[::50]
result_1 = mechanistic_model.simulate(parameters_patient_1, times)[0]
measurements_1 = error_model.sample(
    error_model_params, result_1[::50], seed=41)[:, 0]
result_2 = mechanistic_model.simulate(parameters_patient_2, times)[0]
measurements_2 = error_model.sample(
    error_model_params, result_2[::50], seed=42)[:, 0]
result_3 = mechanistic_model.simulate(parameters_patient_3, times)[0]
measurements_3 = error_model.sample(
    error_model_params, result_3[::50], seed=43)[:, 0]

# Visualise results
fig = go.Figure()
colors = plotly.colors.qualitative.Plotly
fig.add_trace(go.Scatter(
    x=times,
    y=result_1,
    mode='lines',
    line_color=colors[0],
    name='Model patient 1'
))
fig.add_trace(go.Scatter(
    x=times,
    y=result_2,
    mode='lines',
    line_color=colors[1],
    name='Model patient 2'
))
fig.add_trace(go.Scatter(
    x=times,
    y=result_3,
    mode='lines',
    line_color=colors[2],
    name='Model patient 3'
))
fig.add_trace(go.Scatter(
    x=measurement_times,
    y=measurements_1,
    mode='markers',
    marker_color=colors[0],
    name='Meas. patient 1'
))
fig.add_trace(go.Scatter(
    x=measurement_times,
    y=measurements_2,
    mode='markers',
    marker_color=colors[1],
    name='Meas. patient 2'
))
fig.add_trace(go.Scatter(
    x=measurement_times,
    y=measurements_3,
    mode='markers',
    marker_color=colors[2],
    name='Meas. patient 3'
))
fig.update_layout(
    xaxis_title='Time',
    yaxis_title='Drug concentration',
    template='plotly_white'
)
fig.show()

In this case, we assume for simplicity that the inter-individual variability in drug concentration measurements originates from different drug elimination rates, \(k_e\) (a more realistic model would also include inter-individual variations in the compartment volume \(v\)).

1.5.1. Defining a population model

When sufficiently informative measurements are available for each individual, we can use the inference approaches introduced above to learn the data-generating parameters for each individual separately. However, when measurements are of limited availability, it is beneficial to infer the model parameters from all available data simultaneously. This requires us to explicitly model the inter-individual variability. A popular approach to model inter-individual variability is non-linear mixed-effects modelling, where the parameters of individuals are modelled as a composite of fixed effects \(\mu\) and random effects \(\eta\)

\[\psi = \mu _{\psi} + \eta _{\psi} \quad \text{and}\quad \sigma = \mu _{\sigma} + \eta _{\sigma}.\]

Here, \(\mu _{\psi}\) and \(\mu _{\sigma}\) are constants that model the typical parameter values across individuals, and \(\eta _{\psi}\) and \(\eta _{\sigma}\) are random variables that model the inter-individual differences. Note that this assumes that the data-generating parameters of individuals follow a probability distribution, and that any given individual is just a random sample from this distribution. While the mixed-effects picture can be useful to develop some intuition for population models, an alternative picture is to focus on the distribution of parameters that is defined by the mixed-effects model

\[p(\psi, \sigma | \theta),\]

where \(\theta\) are the population model parameters. The population distribution picture has the advantage that we can define the population distribution, \(p(\psi, \sigma | \theta)\), directly, while it is not always trivial to work out the distribution of \(\psi\) from the distribution of \(\eta\). In Chi we will therefore adopt the population distribution picture (note, however, that this is just a different representation of non-linear mixed-effects modelling and the approaches are equivalent). In the remainder, we will for notational ease include \(\sigma\) in the definition of \(\psi\), i.e. \((\psi , \sigma) \rightarrow \psi\). We will also make the hierarchy between the model parameters explicit by referring to \(\psi\) as bottom-level or individual parameters and to \(\theta\) as top-level or population parameters.
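To build some intuition for the equivalence of the two pictures, consider a parameter whose logarithm is normally distributed across individuals. The small numpy sketch below is purely illustrative (the numbers are made up and none of this is Chi’s API): sampling the random effect \(\eta\) and transforming it yields the same population distribution as sampling \(\psi\) directly from the implied lognormal distribution.

import numpy as np

# Mixed-effects picture: log psi = mu + eta, with eta ~ N(0, sigma^2)
rng = np.random.default_rng(1)
mu_psi, sigma_psi = 0.5, 0.25  # illustrative typical value and variability
eta = rng.normal(loc=0, scale=sigma_psi, size=100000)
psi_mixed_effects = np.exp(mu_psi + eta)

# Population-distribution picture: sample psi directly from p(psi | theta)
psi_population = rng.lognormal(mean=mu_psi, sigma=sigma_psi, size=100000)

# Both sets of samples follow the same lognormal distribution
# (up to Monte Carlo error)
print(psi_mixed_effects.mean(), psi_population.mean())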

With the notion of a population model, we can now construct a data-generating model that explicitly models inter-individual variability in measurements

\[p(y, \psi | \theta , t) = p(y | \psi , t)\, p(\psi | \theta),\]

where \(p(y | \psi , t)\) models the measurements of an individual with data-generating parameters \(\psi\), and \(p(\psi | \theta)\) models the distribution over data-generating parameters in the population.

To make the concept of a population model less abstract, let us revisit the 1-compartment pharmacokinetic model from above, where the elimination rate varies across patients, and the initial drug amount \(a_0\), the compartment volume \(v\) and the measurement noise \(\sigma\) are the same for all patients. Since all parameters but the elimination rate are the same across patients, the population model takes the form

\[p(\psi | \theta) = \delta (a_0 - \theta _{a_0})\, \delta (v - \theta _{v})\, p(k_e | \theta _{k_e})\, \delta (\sigma - \theta _{\sigma}),\]

where \(\delta (x)\) is Dirac’s delta distribution, which has non-zero probability only for \(x=0\). In this example, this ensures that all patients have the same initial drug amount \(a_0 = \theta _{a_0}\), the same compartment volume \(v = \theta _{v}\) and the same measurement noise \(\sigma = \theta _{\sigma}\). In Chi delta distributions can be implemented with a chi.PooledModel. The distribution of the elimination rate in the population, \(p(k_e | \theta _{k_e})\), is a modelling choice. The simplest sensible choice for \(p(k_e | \theta _{k_e})\) is a lognormal distribution, chi.LogNormalModel, which assumes that the logarithm of \(k_e\) is normally distributed (in the mixed-effects picture: \(\log k_e = \mu _{k_e} + \eta _{k_e}\) with \(\mathcal{N}(\eta _{k_e} | 0, \sigma _{k_e}^2)\)). This ensures that \(k_e\) can only assume positive values (choosing a Gaussian distribution for \(k_e\) makes no sense in this case, because negative elimination rates would increase the drug amount in the compartment over time)

\[p(\psi | \theta) = \delta (a_0 - \theta _{a_0})\, \delta (v - \theta _{v})\, \mathrm{LN}(k_e | \mu _{k_e}, \sigma _{k_e}) \, \delta (\sigma - \theta _{\sigma}).\]

The resulting model has 5 population parameters \(\theta = (\theta _{a_0}, \theta _{v}, \mu _{k_e}, \sigma _{k_e}, \theta _{\sigma})\).
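For reference, a standard property of the lognormal distribution is useful when interpreting the population parameters below: the implied median and mean elimination rate in the population are

\[\mathrm{median}(k_e) = \mathrm{e}^{\mu _{k_e}}, \qquad \mathbb{E}[k_e] = \mathrm{e}^{\mu _{k_e} + \sigma _{k_e}^2 / 2}.\]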

In Chi we can define this population model from instances of chi.PooledModel and chi.LogNormalModel using chi.ComposedPopulationModel. From the population model we can then simulate the distribution of the individual parameters in the population via sampling. Below we have chosen the values of the population parameters \(\theta\) such that the 3 patients from above could have feasibly been sampled from the population distribution.

# Define population model
population_model = chi.ComposedPopulationModel([
    chi.PooledModel(dim_names=['Initial drug amount']),
    chi.PooledModel(dim_names=['Compartment volume']),
    chi.LogNormalModel(dim_names=['Elimination rate']),
    chi.PooledModel(dim_names=['Sigma'])
])

# Define population parameters
population_parameters = [
    10,     # Pooled initial drug amount
    2,      # Pooled compartment volume
    -0.55,  # Log mean elimination rate (mu k_e)
    0.3,    # Log std. elimination rate (sigma k_e)
    0.4     # Pooled measurement noise
]

# Sample individuals from population model
n_ids = 1000
individual_parameters = population_model.sample(
    parameters=population_parameters,
    n_samples=n_ids,
    seed=1
)

# Visualise population model (parameter space)
fig = make_subplots(rows=2, cols=2)
fig.add_trace(
    go.Histogram(
        name='Pop. model samples',
        x=individual_parameters[:, 0],
        histnorm='probability',
        showlegend=False,
        xbins_size=0.01,
        marker_color='lightgrey'
    ),
    row=1,
    col=1
)
fig.add_trace(
    go.Histogram(
        name='Pop. model samples',
        x=individual_parameters[:, 1],
        histnorm='probability',
        showlegend=False,
        xbins_size=0.01,
        marker_color='lightgrey'
    ),
    row=1,
    col=2
)
fig.add_trace(
    go.Histogram(
        name='Pop. model samples',
        x=individual_parameters[:, 2],
        histnorm='probability',
        showlegend=False,
        marker_color='lightgrey'
    ),
    row=2,
    col=1
)
fig.add_trace(
    go.Histogram(
        name='Pop. model samples',
        x=individual_parameters[:, 3],
        histnorm='probability',
        showlegend=False,
        xbins_size=0.01,
        marker_color='lightgrey'
    ),
    row=2,
    col=2
)
fig.add_trace(
    go.Scatter(
        name='Patient 1',
        x=[parameters_patient_1[2], parameters_patient_1[2]],
        y=[0, 0.12],
        mode='lines',
        line=dict(color=colors[0], dash='dash'),
        showlegend=False
    ),
    row=2,
    col=1
)
fig.add_trace(
    go.Scatter(
        name='Patient 2',
        x=[parameters_patient_2[2], parameters_patient_2[2]],
        y=[0, 0.12],
        mode='lines',
        line=dict(color=colors[1], dash='dash'),
        showlegend=False
    ),
    row=2,
    col=1
)
fig.add_trace(
    go.Scatter(
        name='Patient 3',
        x=[parameters_patient_3[2], parameters_patient_3[2]],
        y=[0, 0.12],
        mode='lines',
        line=dict(color=colors[2], dash='dash'),
        showlegend=False
    ),
    row=2,
    col=1
)
fig.update_layout(
    xaxis_title='Initial drug amount',
    xaxis2_title='Compartment volume',
    xaxis3_title='Elimination rate',
    xaxis4_title='Sigma',
    yaxis_title='Probability',
    yaxis3_title='Probability',
    xaxis_range=[9.8, 10.2],
    xaxis2_range=[1.8, 2.2],
    xaxis4_range=[0.2, 0.6],
    template='plotly_white'
)
fig.show()

The dashed lines illustrate the elimination rates of the 3 simulated patients from above and the grey histograms illustrate the distribution of patient parameters in the population. We omit plotting the amount, volume and noise parameters of the 3 patients as they are all the same and uniquely defined by the population distribution.

The inter-individual variability of drug concentration measurements in the population can be simulated by measuring the drug concentration for each sampled set of individual parameters, i.e. by simulating measurements for the simulated patients. Below we will illustrate the 5th to 95th percentile range of the population distribution together with the measurements of the 3 patients from above.

# Simulate population distribution of measurements
pop_measurements = np.empty(shape=(n_ids, n_times))
for idd, patient_parameters in enumerate(individual_parameters):
    result = mechanistic_model.simulate(patient_parameters[:-1], times)[0]
    pop_measurements[idd] = error_model.sample(
        patient_parameters[-1:], result)[:, 0]

# Visualise population model (measurement space)
fig = go.Figure()

# Plot 5th to 95th percentile of population distribution
fifth = np.percentile(pop_measurements, q=5, axis=0)
ninety_fifth = np.percentile(pop_measurements, q=95, axis=0)
fig.add_trace(go.Scatter(
    x=np.hstack([times, times[::-1]]),
    y=np.hstack([fifth, ninety_fifth[::-1]]),
    line=dict(width=1, color='lightgrey'),
    fill='toself',
    name='Population model',
    text=r"90% bulk probability",
    hoverinfo='text',
    showlegend=True
))

# Plot patient measurements
fig.add_trace(go.Scatter(
    x=measurement_times,
    y=measurements_1,
    mode='markers',
    marker_color=colors[0],
    name='Meas. patient 1'
))
fig.add_trace(go.Scatter(
    x=measurement_times,
    y=measurements_2,
    mode='markers',
    marker_color=colors[1],
    name='Meas. patient 2'
))
fig.add_trace(go.Scatter(
    x=measurement_times,
    y=measurements_3,
    mode='markers',
    marker_color=colors[2],
    name='Meas. patient 3'
))
fig.update_layout(
    xaxis_title='Time',
    yaxis_title='Drug concentration',
    template='plotly_white'
)
fig.show()

For details on how to implement a chi.PopulationModel and many other details concerning population models in Chi, we refer to the API reference Population Models.

1.6. Hierarchical inference

The simulation of population models is interesting in its own right and may, for example, be used to understand the inter-individual variation of the modelled dynamics once estimates for the population parameters exist. However, as claimed earlier, the population model also allows us to infer parameters from measurements that originate from different, non-identical entities. In particular, the population model allows us to define a hierarchical log-posterior from which we can estimate the individual parameters and the population parameters simultaneously. To this end, let us first define the hierarchical log-likelihood, which is a straightforward extension of the log-likelihood introduced above

\[\log p(\mathcal{D}, \Psi | \theta) = \sum _i \log p(\mathcal{D} _i | \psi _i) + \sum _i \log p(\psi _i | \theta).\]

Here, \(\mathcal{D}=\{\mathcal{D}_i\}\) are the data and \(\mathcal{D}_i = \{(y_{ij}, t_{ij})\}\) are the measurements of individual \(i\). Just like in section 1.4, we use \(j\) to index measurement events. \(\Psi = \{ \psi _i \}\) denotes the bottom-level parameters across all individuals with \(\psi _i\) being the parameters of individual \(i\). Thus, the hierarchical log-likelihood has two contributions: 1. a contribution from the cumulative log-likelihood of the bottom-level parameters to describe the observations; and 2. a contribution from the log-likelihood of the population parameters to describe the distribution of the bottom-level parameters. In Chi a hierarchical log-likelihood may be defined using chi.HierarchicalLogLikelihood with a list of bottom-level chi.LogLikelihood instances and a chi.PopulationModel.

# Define individual log-likelihoods for patients
log_likelihood_1 = chi.LogLikelihood(
    mechanistic_model, error_model, measurements_1, measurement_times)
log_likelihood_2 = chi.LogLikelihood(
    mechanistic_model, error_model, measurements_2, measurement_times)
log_likelihood_3 = chi.LogLikelihood(
    mechanistic_model, error_model, measurements_3, measurement_times)

# Define hierarchical log-likelihood
log_likelihood = chi.HierarchicalLogLikelihood(
    log_likelihoods=[log_likelihood_1, log_likelihood_2, log_likelihood_3],
    population_model=population_model)

The hierarchical log-likelihood is a function of both \(\Psi\) and \(\theta\), so in principle we could find maximum likelihood estimates by maximising \(\log p(\mathcal{D}, \Psi | \theta)\), just like in section 1.4. However, maximising \(\log p(\mathcal{D}, \Psi | \theta)\) will in general not find estimates of the data-generating \(\Psi\) and \(\theta\), for reasons that deserve a more thorough discussion than is possible in this overview. Observe, however, that the second term of the hierarchical log-likelihood receives larger contributions for narrow population distributions. This introduces a bias towards underestimating the population variance, which affects the estimates of the population parameters. In addition, the individual parameters will be biased away from the data-generating parameters towards the modes of the population distribution (a.k.a. shrinkage). In some cases these biases are not a concern, especially when the measurements leave very little uncertainty about the bottom-level parameters. In general, however, they will influence the inference results.

A partial remedy for these shortcomings is Bayesian inference, where, analogously to above, we can use Bayes’ rule to define a hierarchical log-posterior

\[\log p(\Psi , \theta | \mathcal{D}) = \log p(\mathcal{D}, \Psi | \theta) + \log p(\theta ) + \text{constant},\]

where \(\log p(\theta )\) is the log-prior of the population parameters. The posterior distribution \(p(\Psi , \theta | \mathcal{D})\) is able to mitigate the biases of the hierarchical log-likelihood by inferring a distribution of likely parameters. This will not remove the biases entirely, but it will make sure that the data-generating parameters are captured by the posterior distribution (as long as the prior choice is appropriate).

In Chi we can define a hierarchical log-posterior using chi.HierarchicalLogPosterior which takes a chi.HierarchicalLogLikelihood and a chi.LogPrior as inputs. The posterior distribution can then be inferred using e.g. MCMC sampling, just like in section 1.4.2. Below we infer the parameters of the 3 patients from above using hierarchical inference and compare the posteriors to the data-generating parameters.

# Define hierarchical log-posterior
log_prior = pints.ComposedLogPrior(
    pints.LogNormalLogPrior(1, 1),      # Initial drug amount
    pints.LogNormalLogPrior(1, 1),      # Compartment volume
    pints.GaussianLogPrior(-1, 1),      # Log mean elimination rate
    pints.LogNormalLogPrior(-1, 0.4),   # Log std. elimination rate
    pints.LogNormalLogPrior(1, 1)       # Sigma
)
log_posterior = chi.HierarchicalLogPosterior(
    log_likelihood=log_likelihood, log_prior=log_prior)

# Infer posterior
controller = chi.SamplingController(log_posterior, seed=42)
controller.set_n_runs(1)
controller.set_parallel_evaluation(False)
controller.set_sampler(pints.NoUTurnMCMC)
n_iterations = 2000
posterior_samples = controller.run(n_iterations, log_to_screen=True)

# Discard warmup iterations
warmup = 500  # This number is not arbitrary but was carefully chosen
posterior_samples = posterior_samples.sel(draw=slice(warmup, n_iterations))

# Visualise posteriors
fig = make_subplots(rows=2, cols=2, shared_yaxes=True)
fig.add_trace(
    go.Histogram(
        name='Posterior samples',
        x=posterior_samples['Pooled Initial drug amount'].values.flatten(),
        histnorm='probability',
        marker_color='lightgrey',
        showlegend=False
    ),
    row=1,
    col=1
)
fig.add_trace(
    go.Scatter(
        name='Data-generating parameters',
        x=[population_parameters[0]]*2,
        y=[0, 0.08],
        mode='lines',
        line_color='black',
        showlegend=False
    ),
    row=1,
    col=1
)

fig.add_trace(
    go.Histogram(
        name='Posterior samples',
        x=posterior_samples['Pooled Compartment volume'].values.flatten(),
        histnorm='probability',
        marker_color='lightgrey',
        showlegend=False
    ),
    row=1,
    col=2
)
fig.add_trace(
    go.Scatter(
        name='Data-generating parameters',
        x=[population_parameters[1]]*2,
        y=[0, 0.08],
        mode='lines',
        line_color='black',
        showlegend=False
    ),
    row=1,
    col=2
)

fig.add_trace(
    go.Histogram(
name='Patient 1 post. samples',
        x=posterior_samples['global.elimination_rate'].values[
            0, :, 0].flatten(),
        histnorm='probability',
        showlegend=False,
        marker=dict(color=colors[0], opacity=0.7)
    ),
    row=2,
    col=1
)
fig.add_trace(
    go.Histogram(
name='Patient 2 post. samples',
        x=posterior_samples['global.elimination_rate'].values[
            0, :, 1].flatten(),
        histnorm='probability',
        showlegend=False,
        marker=dict(color=colors[1], opacity=0.7)
    ),
    row=2,
    col=1
)
fig.add_trace(
    go.Histogram(
name='Patient 3 post. samples',
        x=posterior_samples['global.elimination_rate'].values[
            0, :, 2].flatten(),
        histnorm='probability',
        showlegend=False,
        marker=dict(color=colors[2], opacity=0.7)
    ),
    row=2,
    col=1
)
fig.add_trace(
    go.Scatter(
name='Patient 1 data-gen. parameters',
        x=[parameters_patient_1[2]]*2,
        y=[0, 0.06],
        mode='lines',
        line_color='darkblue',
        showlegend=False
    ),
    row=2,
    col=1
)
fig.add_trace(
    go.Scatter(
name='Patient 2 data-gen. parameters',
        x=[parameters_patient_2[2]]*2,
        y=[0, 0.06],
        mode='lines',
        line_color='darkred',
        showlegend=False
    ),
    row=2,
    col=1
)
fig.add_trace(
    go.Scatter(
name='Patient 3 data-gen. parameters',
        x=[parameters_patient_3[2]]*2,
        y=[0, 0.06],
        mode='lines',
        line_color='darkgreen',
        showlegend=False
    ),
    row=2,
    col=1
)

fig.add_trace(
    go.Histogram(
        name='Posterior samples',
        x=posterior_samples['Pooled Sigma'].values.flatten(),
        histnorm='probability',
        marker_color='lightgrey',
        showlegend=False
    ),
    row=2,
    col=2
)
fig.add_trace(
    go.Scatter(
        name='Data-generating parameters',
        x=[population_parameters[4]]*2,
        y=[0, 0.06],
        mode='lines',
        line_color='black',
        showlegend=False
    ),
    row=2,
    col=2
)

fig.update_layout(
    xaxis_title='Initial drug amount',
    xaxis2_title='Compartment volume',
    xaxis3_title='Elimination rate',
    xaxis4_title='Sigma',
    yaxis_title='Probability',
    yaxis3_title='Probability',
    template='plotly_white',
    bargap=0,
    bargroupgap=0,
    barmode='overlay'
)
fig.show()

In addition to the data-generating parameters of the patients, the hierarchical inference also provides estimates for the population-level parameters. Below we only illustrate the log mean and the log standard deviation of the elimination rate in the population, because the other parameters are pooled and therefore the population parameters are equivalent to the bottom-level parameters.

fig = make_subplots(rows=1, cols=2, shared_yaxes=True)
fig.add_trace(
    go.Histogram(
        name='Posterior samples',
        x=posterior_samples['Log mean Elimination rate'].values.flatten(),
        histnorm='probability',
        marker_color='lightgrey',
        showlegend=False
    ),
    row=1,
    col=1
)
fig.add_trace(
    go.Scatter(
        name='Data-generating parameters',
        x=[population_parameters[2]]*2,
        y=[0, 0.08],
        mode='lines',
        line_color='black',
        showlegend=False
    ),
    row=1,
    col=1
)

fig.add_trace(
    go.Histogram(
        name='Posterior samples',
        x=posterior_samples['Log std. Elimination rate'].values.flatten(),
        histnorm='probability',
        marker_color='lightgrey',
        showlegend=False
    ),
    row=1,
    col=2
)
fig.add_trace(
    go.Scatter(
        name='Data-generating parameters',
        x=[population_parameters[3]]*2,
        y=[0, 0.08],
        mode='lines',
        line_color='black',
        showlegend=False
    ),
    row=1,
    col=2
)

fig.update_layout(
    xaxis_title='Log mean elimination rate',
    xaxis2_title='Log std. elimination rate',
    yaxis_title='Probability',
    template='plotly_white'
)
fig.show()

Note that in this case the posteriors of \(\mu _{k_e}\) and \(\sigma _{k_e}\) are largely dominated by the prior distribution, as measurements of only 3 patients are not informative enough to substantially influence the posterior distribution.

This concludes the quick overview of Chi. You now know how to use Chi to simulate mechanistic model outputs, measurements and population models. You also know how to use Chi to infer model parameters using maximum likelihood estimation or Bayesian inference, and you know how to use Chi to do hierarchical inference. There are a few things that the quick overview has not touched on. The two most interesting are: 1. the chi.ProblemModellingController, which is a convenience class that helps you to build your models more easily, especially when measurement times and dosing regimens vary across individuals; and 2. the CovariatePopulationModel, which allows you to define population models where some of the inter-individual variability is explained by covariates.

We hope you enjoyed this overview and have fun working with Chi!