Generation of digital patients for the simulation of tuberculosis with UISS-TB
BMC Bioinformatics volume 21, Article number: 449 (2020)
Abstract
Background
The STriTuVaD project, funded by Horizon 2020, aims to test one of the most advanced therapeutic vaccines against tuberculosis through a Phase IIb clinical trial. As part of this initiative, we have developed a strategy for generating in silico patients consistent with the characteristics of the target population, which can then be used in combination with in vivo data in an augmented clinical trial.
Results
One of the most challenging tasks in using virtual patients is developing a methodology to reproduce the biological diversity of the target population, i.e. providing an appropriate strategy for generating libraries of digital patients. This has been achieved by creating the initial immune system repertoire stochastically and by identifying a vector of features that combines biological and pathophysiological parameters, which personalise the digital patient to reproduce the physiology and pathophysiology of the subject.
Conclusions
We propose a sequential approach to sampling from the joint population distribution of the features in order to create a cohort of virtual patients with specific characteristics, resembling the recruitment process for the target clinical trial; the cohort can then be used to augment the information from the physical trial and help reduce its size and duration.
Background
It is estimated that one quarter of the world population is infected with tuberculosis (TB). Although the disease is preventable and treatable, about one and a half million people die from it annually, making TB the leading infectious cause of death. Due to person-to-person infection and treatment mismanagement, multidrug-resistant (MDR) TB continues to emerge, increasing the complexity of treatment and thus potentially worsening the transmission rate. There is a growing awareness that TB can be fought effectively only by working globally, starting from countries like India, where the infection is endemic [1].
Once a person is diagnosed with TB, one of the most critical issues is the duration of the therapy, because of the high costs involved, the increased chances of noncompliance (which increase the probability of developing an MDR strain), and the time during which the patient is still infectious to others. One exciting possibility to shorten the duration of the therapy is the use of novel host-reaction therapies (HRT) as an adjuvant to antibiotic therapy. Typical endpoints in clinical trials for HRTs are time to sputum culture conversion and incidence of recurrence. While for the former it is in some cases possible to obtain statistically powered evidence of efficacy in a phase II clinical trial, recurrence almost always requires a phase III clinical trial involving thousands of patients, at huge cost.
The in silico trials for tuberculosis vaccine development (STriTuVaD) project is an EU-funded, multidisciplinary consortium testing the RUTI vaccine in a Phase IIb clinical trial. The RUTI® antitubercular vaccine, provided by Archivel Farma S.L, is a polyantigenic liposomal vaccine containing fragments of Mycobacterium tuberculosis cells, currently being developed as a therapeutic vaccine for patients with pulmonary tuberculosis. The vaccine, shown to be one of the most advanced therapeutic vaccines against drug-sensitive TB and MDR-TB, has already been studied in healthy volunteers and for the prevention of active TB in patients with latent TB [2].
To help in this development, we extend the Universal Immune System Simulator (UISS) [3, 4] to include the relevant determinants of such a clinical trial, establish its predictive accuracy against the individual patients recruited in the trial, use it to generate digital patients, predict their response to the host-reaction therapy being tested, and combine these predictions with the observations made on physical patients using a new in silico-augmented clinical trial approach based on a Bayesian adaptive design. This approach, if found effective, could drastically reduce the cost of innovation in this critical sector of public healthcare.
To reproduce the biological diversity of the subjects to be simulated, an appropriate strategy for generating libraries of digital patients is developed by identifying a vector of features involving both biological and pathophysiological parameters, facilitating the personalisation of the digital patient.
In this paper we sketch the strategy we adopt to generate the cohort of digital patients, and show some preliminary results about the dynamics of TB on a subset of these patients. First, we briefly describe UISS and its extension to TB.
Extending UISS to track TB
We briefly describe here the UISS computational framework and its extension to model tuberculosis, UISS-TB. The interested reader can find more detail in [5].
UISS is a multi-agent framework for simulating immune system dynamics that can be extended to track specific diseases and related treatments. Unlike classical top-down approaches, where mean behaviours are modelled through systems of differential equations [6,7,8], agent-based models and multi-agent systems track individual entities. It is the interactions between these entities that can give rise to global nonlinear behaviours. UISS has been developed as a multi-scale computer simulator of the immune system, as it takes into account both cellular and molecular entities and processes.
UISS has a proven track record: it has been used to model the effects of a vaccine against the onset of mammary carcinoma [9, 10] and consequent lung metastases [11], the initial stages of atherosclerosis [12], and melanoma [3]; more recently, it has been used in the study of multiple sclerosis [4, 13] and for testing the efficacy of citrus-derived adjuvants for vaccines against influenza and human papilloma virus [14, 15]. For its use within STriTuVaD, we have extended UISS to include TB dynamics along with the artificial immunity induced by vaccination strategies, as presented in [5].
In order to depict individuals, a vector of features comprising biological and pathophysiological parameters has been identified. The list of parameters, their relative range and units are displayed in Table 1.
Methods
In order to create an in silico patient, one needs to provide a single value for each feature. These values could be taken from individual physical patients; however, if a cohort of digital patients is to be produced, one needs a mechanism for producing as many different input vectors as required that are biologically and physiologically plausible. Formally, this requires characterising the joint distribution of the inputs in the population. We have compiled typical values and standard deviations for each feature, which provides a way to generate plausible values for one component at a time. Proceeding in this way, however, would neglect the biological correlations between features and thus would not guarantee a physiologically plausible input vector. Hence, we must take these correlations into account. Given that we have 22 input variables, we must specify \(22 \times 21/2 = 231\) correlations. Using relevant literature [16, and references therein] and expert opinion, we have qualified these correlations, determining that all of them are positive except those of IL10 with the rest of the features.
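The pair count and the qualitative sign pattern described above can be checked with a short sketch. Only the label `IL10` comes from the text; the other 21 feature names are placeholders, since the actual list lives in Table 1.

```python
from itertools import combinations

# Hypothetical labels: only "IL10" is taken from the text; the remaining
# 21 names are placeholders for the features listed in Table 1.
features = ["IL10"] + [f"feature_{k}" for k in range(2, 23)]

# One correlation is needed for every unordered pair of distinct features.
pairs = list(combinations(features, 2))

# Qualitative sign pattern from the text: negative only when IL10 is involved.
signs = {pair: (-1 if "IL10" in pair else 1) for pair in pairs}

n_negative = sum(1 for s in signs.values() if s < 0)
n_positive = len(pairs) - n_negative
print(len(pairs), n_positive, n_negative)  # 231 210 21
```

Of the 231 pairs, the 21 that involve IL10 carry a negative sign and the remaining 210 a positive one.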
Formalising in silico profile generation
In theory, one could elicit the joint distribution of the features vector, i.e. describe mathematically how each feature relates to the others in a space of 22 dimensions; but this would be not only extremely difficult, but also time consuming and data demanding. Our approach is to rely on current mathematical biology consensus and use a Gaussian to represent the population distribution. The additional advantage of using this approach will be discussed in the next section.
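Formally, we say that the vector \({\varvec{f}} = \left\{ {f_{1},\dots ,f_{d}}\right\}\) follows a d-variate Gaussian distribution with joint probability density function,
\[\hbox{N}_{d} ({\varvec{f}} \mid {\boldsymbol{\mu }}, {\Sigma }) = (2\pi )^{-d/2} \left| \Sigma \right| ^{-1/2} \exp \left\{ -\frac{1}{2} ({\varvec{f}} - \boldsymbol{\mu })^{\top } \Sigma ^{-1} ({\varvec{f}} - \boldsymbol{\mu }) \right\} ,\]
with mean \(\boldsymbol{\mu } = \left\{ {\mu _{1},\dots ,\mu _{d}}\right\}\) and covariance matrix,
\[\Sigma = \begin{pmatrix} \sigma ^2_1 & \sigma _{12} & \cdots & \sigma _{1d} \\ \sigma _{21} & \sigma ^2_2 & \cdots & \sigma _{2d} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma _{d1} & \sigma _{d2} & \cdots & \sigma ^2_d \end{pmatrix},\]
where,
\[\sigma _{ij} = \rho _{ij}\, \sigma _i\, \sigma _j,\]
and \(\rho _{ij}\) denotes the correlation between \(f_i\) and \(f_j\).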
So, if we are able to elicit a measure of correlation between two inputs, we can calculate their covariance.
The elements in the diagonal, \(\sigma ^2_i\) are the marginal variances of each element, \(f_i\), and \(\mu _i\) the corresponding marginal mean. As mentioned above, we already have compiled a list with these values, so we have elicited values for \(\boldsymbol{\mu }\) and the diagonal elements of \(\Sigma\), \(\sigma ^2_i\).
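As an illustration of this step, the sketch below assembles a covariance matrix from marginal standard deviations and pairwise correlations, using the standard identity that a covariance equals the correlation times the two standard deviations. It also attempts a Cholesky factorisation, which succeeds only if the elicited matrix is positive definite and hence a valid covariance. The three-feature numbers are made up for the example and are not the elicited values; the function names are hypothetical.

```python
import math


def covariance_from_correlation(sds, corr):
    """Return the matrix with entries corr[i][j] * sds[i] * sds[j]."""
    d = len(sds)
    return [[corr[i][j] * sds[i] * sds[j] for j in range(d)] for i in range(d)]


def cholesky(matrix):
    """Lower-triangular Cholesky factor; raises ValueError if not positive definite."""
    d = len(matrix)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low


# Illustrative numbers only (three made-up features, not the paper's values).
sds = [10.0, 4.0, 2.0]
corr = [[1.0, 0.5, -0.3],
        [0.5, 1.0, -0.2],
        [-0.3, -0.2, 1.0]]

sigma = covariance_from_correlation(sds, corr)
chol = cholesky(sigma)  # succeeds, so sigma is a valid covariance matrix
```

Checking positive definiteness matters in practice because correlations elicited pair by pair are not guaranteed to form a valid joint covariance.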
Cohort generation
Once \(\boldsymbol{\mu }\) and \(\Sigma\) have been elicited, generating an in silico profile is a relatively trivial task: one must sample a point in the 22-dimensional space, consistent with \(\hbox{N}_{22} ({\varvec{f}} \mid {\boldsymbol{\mu }}, {\Sigma })\). However, we can exploit the properties of the Gaussian distribution to produce a cohort consistent with some specific characteristics. Say, for instance, that our target population has a particular range of BL; we would then like to produce digital patients consistent with that specific profile. Formally, let \(f_1\) represent BL and \({\varvec{f}} _{-1} = \left\{ {f_{2},\dots ,f_{22}}\right\}\) the rest of the features; we would like to sample from \(\hbox{N}_{21} ({\varvec{f}} _{-1} \mid {f_1, \boldsymbol{\mu }}, {\Sigma })\), i.e. the conditional distribution of the rest of the features, given that BL has a specific value. This is a standard procedure, which can be readily implemented.
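A minimal sketch of this conditioning step is given below for a toy three-feature Gaussian. It is written in plain Python rather than R and is not the authors' script. The first coordinate plays the role of \(f_1\); the means, covariances, the conditioning value 60 and all function names are assumptions made for the example.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    d = len(matrix)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low


def condition_on_first(mu, sigma, a):
    """Mean and covariance of the remaining features given that the first equals a."""
    d = len(mu)
    shift = (a - mu[0]) / sigma[0][0]
    nu = [mu[j] + sigma[j][0] * shift for j in range(1, d)]
    omega = [[sigma[j][k] - sigma[j][0] * sigma[0][k] / sigma[0][0]
              for k in range(1, d)] for j in range(1, d)]
    return nu, omega


def sample_gaussian(mean, cov, rng):
    """Draw one vector from N(mean, cov) via the Cholesky factor."""
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


# Toy values, not the elicited ones.
mu = [50.0, 25.0, 5.0]
sigma = [[100.0, 20.0, -6.0],
         [20.0, 16.0, -1.6],
         [-6.0, -1.6, 4.0]]

rng = random.Random(2020)
nu, omega = condition_on_first(mu, sigma, a=60.0)
# A cohort of 15 profiles sharing the same fixed first feature.
cohort = [[60.0] + sample_gaussian(nu, omega, rng) for _ in range(15)]
```

With these toy numbers the conditional mean of the two free features is (27.0, 4.4), i.e. fixing the first feature above its mean pulls the positively correlated feature up and the negatively correlated one down.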
We can go further and sort the list of features according to either their importance in determining the profile of a patient, or the precision of their elicited mean, variance and covariance, and then proceed to sample from the conditional distributions. In general, let \({\varvec{f}} _s \in {\mathbb{R}}^{d-q}\) denote the vector of features with prespecified values and \({\varvec{f}} _r \in {\mathbb{R}}^q\) the vector of free features, so that \({\varvec{f}} = \left\{ {{\varvec{f}} _s, {\varvec{f}} _r}\right\}\).
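The conditional distribution is \(p ({{\varvec{f}} _r} \mid {{\varvec{f}} _s = {\varvec{a}}})= \hbox{N}_{q} ({\varvec{f}} _r \mid {\boldsymbol{\nu }}, {\Omega })\), with
\[\boldsymbol{\nu } = \boldsymbol{\mu }_r + \Sigma _{rs} \Sigma _{ss}^{-1} ({\varvec{a}} - \boldsymbol{\mu }_s), \qquad \Omega = \Sigma _{rr} - \Sigma _{rs} \Sigma _{ss}^{-1} \Sigma _{sr},\]
where
\[\boldsymbol{\mu } = \left\{ {\boldsymbol{\mu }_s, \boldsymbol{\mu }_r}\right\} , \qquad \Sigma = \begin{pmatrix} \Sigma _{ss} & \Sigma _{sr} \\ \Sigma _{rs} & \Sigma _{rr} \end{pmatrix},\]
are partitioned conformably with \({\varvec{f}}\), and \(\Omega\) is the Schur complement of the block \(\Sigma _{ss}\) in \(\Sigma\). Judicious choice of \({\varvec{f}} _s\) and \({\varvec{f}} _r\) enables sampling sequentially, e.g. from least to most important feature.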
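One way to realise sequential sampling is sketched below, under the assumption that features are handled one at a time. The prespecified features are conditioned on first. Each free feature is then drawn from its univariate conditional distribution given everything already set, and the remaining mean and covariance are updated. Conditioning on one coordinate at a time is equivalent to block conditioning for a Gaussian, so no matrix inverse is needed. The function names, the `order` argument and the toy numbers are hypothetical.

```python
import math
import random


def condition_on(mean, cov, idx, value):
    """Condition a Gaussian (dicts keyed by feature index) on feature idx = value."""
    var = cov[idx][idx]
    rest = [j for j in mean if j != idx]
    shift = (value - mean[idx]) / var
    new_mean = {j: mean[j] + cov[j][idx] * shift for j in rest}
    new_cov = {j: {k: cov[j][k] - cov[j][idx] * cov[idx][k] / var for k in rest}
               for j in rest}
    return new_mean, new_cov


def sample_profile(mu, sigma, order, fixed, rng):
    """Sample one profile; `fixed` maps feature index -> prespecified value."""
    d = len(mu)
    mean = {i: mu[i] for i in range(d)}
    cov = {i: {j: sigma[i][j] for j in range(d)} for i in range(d)}
    # Prespecified features go first so that every free draw is conditional on them.
    sequence = ([i for i in order if i in fixed]
                + [i for i in order if i not in fixed])
    profile = {}
    for idx in sequence:
        if idx in fixed:
            value = fixed[idx]
        else:
            value = rng.gauss(mean[idx], math.sqrt(max(cov[idx][idx], 0.0)))
        profile[idx] = value
        mean, cov = condition_on(mean, cov, idx, value)
    return [profile[i] for i in range(d)]


# Toy values, not the elicited ones.
mu = [50.0, 25.0, 5.0]
sigma = [[100.0, 20.0, -6.0],
         [20.0, 16.0, -1.6],
         [-6.0, -1.6, 4.0]]

rng = random.Random(1)
cohort = [sample_profile(mu, sigma, order=[0, 1, 2], fixed={0: 60.0}, rng=rng)
          for _ in range(15)]
```

Passing a different `fixed` dictionary mimics different selection criteria, while `order` controls the sequence in which the free features are drawn.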
Results
We created an R script [17] for the generation of digital patients, available from the corresponding author upon request. We report results from three groups of 15 patients, each group with fixed values of (Age, BMI, MtbSputum) chosen to roughly represent different profiles in the population and different initial bacterial loads. Profile 1 has (35, 21.4, 15), Profile 2 (45, 28.2, 502) and Profile 3 (55, 31.8, 910); the full set of values can be obtained from Additional file 1. These can be used as input to the UISS-TB web interface, available from www.strituvad.eu (accessed on 28/07/20), by selecting the Tuberculosis disease model; the simulator is hence accessible to any user with a conventional computer and access to the internet.
The GUI panel displays default values and admissible ranges for the parameters in the vector of features. Once the vector of features is completed, the user can click on the Submit button and a unique simulation identification number is assigned. The user can check the simulation status by clicking on the check status button, after selecting the appropriate simulation id. When the simulation is complete, the user can visualise the resulting immune system dynamics. In our case, the progression of each patient was simulated 50 times for 1 year, with levels of the various species recorded every 600 seconds. The data from each patient requires roughly 100 MB of disk storage.
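To give a sense of the output volume implied by this set-up, the arithmetic can be spelled out as follows. The only assumption is that "1 year" means 365 days; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

days_simulated = 365   # assumption: "1 year" taken as 365 days
step_seconds = 600     # recording interval stated in the text
repetitions = 50       # runs per patient stated in the text

# Number of recording intervals in one run, and in all runs of one patient.
records_per_run = days_simulated * SECONDS_PER_DAY // step_seconds
records_per_patient = records_per_run * repetitions
print(records_per_run, records_per_patient)  # 52560 2628000
```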
We use the total antibody count (Ab) to exemplify the characterisation of the output; e.g. Fig. 1 shows the total Ab count for one simulation of each of the 15 patients in Profile 1. In order to characterise the mean behaviour, we average the 50 repetitions per patient. Figure 2 depicts the median and quartiles for a selection of patients (columns) for each profile (rows). There is clearly increased variability around the main and secondary peaks, while levels consistently fall back to nought after roughly 146 days (3500 h). The distribution of the time at peak level is illustrated in Fig. 3; the peak occurs consistently within 112–116 days for all profiles, while Profile 3 shows slightly increased variability.
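The kind of summary described here (quartile bands across repetitions and the time at which a trace peaks) can be sketched as follows. The traces are synthetic stand-ins generated inside the example, not UISS-TB output, and the function names are hypothetical.

```python
import random
import statistics


def summarise(runs):
    """Per-time-point (Q1, median, Q3) across repeated runs of equal length."""
    bands = []
    for values in zip(*runs):
        q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
        bands.append((q1, q2, q3))
    return bands


def peak_time(times, trace):
    """Time at which a single trace reaches its maximum."""
    best = max(range(len(trace)), key=trace.__getitem__)
    return times[best]


# Synthetic stand-in: a triangular pulse centred at 2700 h plus small noise.
rng = random.Random(0)
times_h = list(range(0, 4000, 100))  # hours


def toy_run():
    return [max(0.0, 100.0 - abs(t - 2700) / 8.0 + rng.uniform(-1.0, 1.0))
            for t in times_h]


runs = [toy_run() for _ in range(50)]  # 50 repetitions for one toy patient
bands = summarise(runs)
median_trace = [q2 for _, q2, _ in bands]
peak_days = peak_time(times_h, median_trace) / 24.0  # convert hours to days
```

Here the peak of the median trace falls at 2700 h, i.e. 112.5 days, by construction of the toy pulse.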
Conclusions
UISS-TB is a state-of-the-art agent-based model capable of tracking the dynamics of TB infection in humans. Individual digital patients are defined by a vector of features known to be fundamental in TB infection dynamics and normally measured clinically, hence often readily available.
Discussion
In order to produce virtual cohorts of patients, we propose a sequential approach based on a characterisation of the distribution of these features in the population of interest; the approach allows any combination of features to be fixed, which makes it possible to mimic patient selection criteria and thus yields a method for setting up augmented in silico clinical trials.
Availability of materials
The datasets generated and analysed during the current study are not publicly available due to size restrictions but are available from the corresponding author on reasonable request.
Abbreviations
Ab: Antibody count
MDR: Multidrug resistant
STriTuVaD: In silico trials for tuberculosis vaccine development
TB: Tuberculosis
UISS: Universal Immune System Simulator
References
1. WHO: Global tuberculosis report (2019).
2. Prabowo SA, Painter H, Zelmer A, Smith SG, Seifert K, Amat M, Cardona PJ, Fletcher HA. RUTI vaccination enhances inhibition of mycobacterial growth ex vivo and induces a shift of monocyte phenotype in mice. Front Immunol. 2019;10:894.
3. Pappalardo F, Forero IM, Pennisi M, Palazon A, Melero I, Motta S. SimB16: modeling induced immune system response against B16-melanoma. PLoS ONE. 2011;6(10):26523.
4. Pennisi M, Russo G, Motta S, Pappalardo F. Agent based modeling of the effects of potential treatments over the blood brain barrier in multiple sclerosis. J Immunol Methods. 2015;427:6–12.
5. Pennisi M, Russo G, Sgroi G, Bonaccorso A, Parasiliti Palumbo GA, Mitra DK, Walker KB, Cardona PJ, Amat M, Viceconti M, Pappalardo F. Predicting the artificial immunity induced by RUTI® vaccine against tuberculosis using universal immune system simulator (UISS). BMC Bioinform. 2019;20:1–10.
6. Ragusa MA, Russo G. ODEs approaches in modeling fibrosis: comment on “Towards a unified approach in the modeling of fibrosis: a review with research perspectives” by Martine Ben Amar and Carlo Bianca. Phys Life Rev. 2016;17:112–3.
7. Castiglione F, Pappalardo F, Bianca C, Russo G, Motta S. Modeling biology spanning different scales: an open challenge. BioMed Res Int. 2014;2014:1–9.
8. Pappalardo F, Pennisi M, Ricupito A, Topputo F, Bellone M. Induction of T-cell memory by a dendritic cell vaccine: a computational model. Bioinformatics. 2014;30(13):1884–91.
9. Pappalardo F, Motta S, Lollini PL, Mastriani E. Analysis of vaccine’s schedules using models. Cell Immunol. 2006;244(2):137–40.
10. Palladini A, Nicoletti G, Pappalardo F, Murgo A, Grosso V, Stivani V, Ianzano ML, Antognoli A, Croci S, Landuzzi L, De Giovanni C, Nanni P, Motta S, Lollini PL. In silico modeling and in vivo efficacy of cancer-preventive vaccinations. Cancer Res. 2010;70(20):7755–63.
11. Pennisi M, Pappalardo F, Palladini A, Nicoletti G, Nanni P, Lollini PL, Motta S. Modeling the competition between lung metastases and the immune system using agents. BMC Bioinform. 2010;11(Suppl 7):13.
12. Pappalardo F, Musumeci S, Motta S. Modeling immune system control of atherogenesis. Bioinformatics. 2008;24(15):1715–21.
13. Pappalardo F, Russo G, Maimone D, Pennisi M, Sgroi G, Alessandro G, Pappalardo F, Russo G, Pennisi M, Sgroi G, Alessandro G, Palumbo P, Motta S, Maimone D. Agent based modeling of relapsing multiple sclerosis: a possible approach to predict treatment outcome. In: IEEE international conference on bioinformatics and biomedicine (BIBM). 2018;1380–5.
14. Pappalardo F, Fichera E, Paparone N, Lombardo A, Pennisi M, Russo G, Leotta M, Pappalardo F, Pedretti A, De Fiore F, Motta S. A computational model to predict the immune system activation by citrus-derived vaccine adjuvants. Bioinformatics. 2016;32(17):2672–80.
15. Pennisi M, Russo G, Ravalli S, Pappalardo F. Combining agent based-models and virtual screening techniques to predict the best citrus-derived vaccine adjuvants against human papilloma virus. BMC Bioinform. 2017;18(S16):544.
16. Mayer-Barber KD, Andrade BB, Oland SD, Amaral EP, Barber DL, Gonzales J, Derrick SC, Shi R, Kumar NP, Wei W, Yuan X, Zhang G, Cai Y, Babu S, Catalfamo M, Salazar AM, Via LE, Barry CE III, Sher A. Host-directed therapy of tuberculosis based on interleukin-1 and type I interferon crosstalk. Nature. 2014;511(7507):99–103.
17. R Core Team: R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2020). Version 4.0.2.
18. Pennisi M, Juarez MA, Russo G, Viceconti M, Pappalardo F. Generation of digital patients for the simulation of tuberculosis with UISS-TB. In: 2019 IEEE international conference on bioinformatics and biomedicine (BIBM). 2019;2163–7.
Acknowledgements
This is an extended version of [18].
About this supplement
This article has been published as part of BMC Bioinformatics Volume 21 Supplement 17 2020: Selected papers from the 3rd International Workshop on Computational Methods for the Immune System Function (CMISF 2019). The full contents of the supplement are available at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume21supplement17.
Funding
Publication costs are funded by the European Commission under the Contract H2020SC12017 CNECT2, No. 777123. The authors acknowledge support from the STriTuVaD project, funded by the European Commission and the Indian Department of Biotechnology under the Contract H2020SC12017 CNECT2, No. 777123. The information and views set out in this article are those of the authors and do not necessarily reflect the official opinion of the European Commission. Neither the European Commission institutions and bodies nor any person acting on their behalf may be held responsible for the use which may be made of the information contained therein.
Author information
Contributions
MAJ, MP and DK prepared the manuscript. FP, MP and GR designed and developed UISS-TB. MAJ, DK, MV and CC contributed to the design of the analysis. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Additional file 1:
Profile traces.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Juárez, M.A., Pennisi, M., Russo, G. et al. Generation of digital patients for the simulation of tuberculosis with UISS-TB. BMC Bioinformatics 21, 449 (2020). https://doi.org/10.1186/s1285902003776z
DOI: https://doi.org/10.1186/s1285902003776z
Keywords
 Agent based model
 In silico patient
 Sequential sampling
 Tuberculosis