Introduction

Cancer is a leading cause of death worldwide with the lowest success rate of clinical trials among all complex diseases. In an analysis of clinical trials from 2000 to 2015, the overall probability (defined by Wong et al. 1) of a drug successfully moving from phase I to approval was merely 3.4% in oncology, but for trials that used biomarkers for patient selection, the probability of success was significantly improved1. With increasing number of newly discovered drugs and potential biomarkers to investigate, it is extremely difficult to test and compare all dose levels, therapy combinations, and predictive biomarkers for each cancer type via clinical trials, which necessitates development of computational tools to accelerate the process.

Since 1990s, mathematical models have and continue to play important roles in drug development as a cost-efficient tool to inform clinical trial design (i.e., model-informed drug development or MIDD)2. Semi-mechanistic approaches like pharmacokinetic-pharmacodynamic (PKPD) models started first to accompany regulatory submissions, and, with advancing mechanistic understanding of pathophysiology and increasing computational power, mechanistic models, such as physiologically-based pharmacokinetic (PBPK) and quantitative systems pharmacology (QSP) models, were developed2,3. From 2013 to 2020, the US Food and Drug Administration has received a rising number of new drug applications with the support of QSP models, more than one fifth of which were for oncologic diseases4. Therefore, hypothesis-driven mechanistic QSP modeling has begun to play a critical role in predicting effectiveness of newly discovered drugs and determining the optimal dosage to assist clinical trial design via clinical trial simulation (i.e., in silico/virtual clinical trial)4,5,6,7 with virtual patients.

Virtual patients are usually defined as model parameterizations that generate physiologically plausible outputs8. Parameters with physiological or biological definitions should also be confined by the experimentally and clinically observed values. By generating a virtual patient population with similar characteristics to the target patient cohort, mechanistic models can compare different therapy combinations and potential biomarkers for patient stratification9. In the past few years, virtual patients in immuno-oncology are commonly generated via random sampling from chosen distributions10,11 or by models whose variables can be relatively easy to measure in clinical settings, such as imaging-based models12. Nonetheless, the strong inter-patient, inter-tumoral, and intra-tumoral heterogeneities in cancer require large clinical datasets to determine the physiological plausibility of randomly generated virtual patients. This challenge may be resolved by emerging multi-omics data13,14, which involve a large number of molecular data that characterize the tumor microenvironment in individual patients. In parallel with the effort to generate virtual patients that resemble real patients’ characteristics, digital twins are being developed in precision oncology with a goal to monitor and optimize treatment for individual patients through personalized models15. In this perspective, we aim to share our experience in generating virtual patients based on multi-omics analyses and discuss current efforts in developing realistic virtual patients and digital twins, which share a similar definition but are usually generated for different goals in immuno-oncology.

Virtual patients for clinical trial simulation

As each virtual patient is represented by a unique set of model parameterization16, it is critical to define the physiologically plausible ranges for model parameters and variables. In immuno-oncology, this is particularly challenging because: (1) the mechanisms involved in the cancer-immunity cycle17,18, including mechanism of action (MoA) of newly discovered therapeutics, are sometimes not fully understood (thus their effects are approximated by Hill functions rather than detailed biochemical equations), (2) variables, such as immune cell densities and cytokine levels, are subject to strong inter-individual and spatio-temporal heterogeneity, and (3) most of these parameters and variables are either unavailable, or difficult to measure, allometrically scaled from animal models, or only measured at 1-3 time points throughout a clinical study. Perhaps the most studied therapeutic area in virtual patient generation is cardiovascular diseases, where measurements (e.g., heart rate, cardiac output) are relatively easy to take and mechanisms are better understood than those in cancer19. Despite the challenges in immuno-oncology, Craig and colleagues have summarized the steps to conduct virtual clinical trials11 and demonstrated the approaches to generate virtual patients using a tumor growth model10. Cheng et al. also discussed key considerations during development of QSP models and virtual patient algorithms16. In parallel, our group has also applied various virtual patient generation methods to a quantitative systems pharmacology model for immuno-oncology (QSP-IO)20, in subsequent models, using data from multiplex digital pathology and genomic analysis. In this section, we discuss the challenges met during the QSP-IO model development.

A sequence of QSP-IO models with progressively more detail of the tumor immune microenvironment (TiME), such as cell types and cytokines, have been developed by our group with a goal to predict effectiveness of immune checkpoint inhibitors in combination with other therapies in multiple cancer types20,21,22,23,24,25,26,27. The latest model version was used for a retrospective analysis of anti-PD-L1 treatment in non-small cell lung cancer28, as well as a prospective prediction for the effectiveness of a masked antibody in triple-negative breast cancer29. In another study of metastatic triple-negative breast cancer, the model was expanded to account for up to 3 metastatic lesions, which were parameterized to transcriptomic data from lung and other metastases (i.e., liver, and bone) of breast tumors30. Furthermore, to account for spatio-temporal heterogeneity, the QSP model was also integrated with an agent-based model (spQSP-IO), which was calibrated by multiplex digital pathology31,32,33 and spatial transcriptomics34.

As the first step of generating a virtual patient population, a subset of model parameters is selected that best represent the inter-individual heterogeneity and randomly generate their values via Latin Hypercube Sampling. Some studies assume uniform distribution for all parameters (i.e., no prior information) and set an upper and a lower boundary to randomly generate parameter values, relying on subsequent algorithms to filter out virtual patients whose model-predicted characteristics (e.g., blood volume, heart rate in a cardiovascular disease model; tumor size, T cell level in a tumor growth model) fall out of the physiologically plausible range35,36. In QSP-IO, however, the main outputs of interest are: (1) tumor size for response status prediction, and (2) immune profiles, such as intratumoral CD8, CD4, FoxP3 T cell density, and receptor/ligand expression level, such as PD-L1 and CTLA-4, for biomarker analysis28. These variables often have wide ranges observed in patients37,38,39. For example, in a digital pathology analysis of tumor tissue samples from 43 patients with breast cancer, the density of CD8 T cells can differ with more than 3 orders of magnitude38. Therefore, filtering out virtual patients based on their T cell levels is not an effective method especially when other clinical measurements (e.g., cytokine profile) are unavailable to further narrow down the physiologically plausible domain.

In our model and other QSP studies in immuno-oncology, parameter distributions are often estimated by published experimental or clinical data27,28,29,30,40,41. Lognormal distribution is commonly assumed for physiological/biological parameters42. Parameters that cannot be directly measured or have limited availability from the literature were calibrated by iterations of clinical trial simulation. For each iteration, at least 1000 virtual patients were randomly generated to calculate the outputs of interest. Then, parameters were adjusted by comparing medians of model outputs to clinically measured values. This is a time-consuming step but necessary due to the nonlinear nature of the model, in which case, median parameter values do not correspond to median model output values.

QSP models usually consist of hundreds of cellular/molecular species. Therefore, it is challenging to find the initial conditions for all the model variables, which correspond to the patient status at the beginning of the drug administration. To this end, with each randomly sampled parameter set, we first initialize the model with a single cancer cell, a baseline level of cytokines, naïve T cells, antigen-presenting cells, and cell surface molecules (i.e., immune checkpoints, major histocompatibility complex (MHC), co-stimulatory ligands and receptors), and set the other variables to zero. Measurements from healthy individuals can assist estimation of these baseline patient characteristics43,44. In addition, a pre-treatment tumor size is randomly assigned to each virtual patient. At the end of the simulation, the model outputs at the time point when the pre-treatment tumor size is reached are considered the patient’s pre-treatment characteristics, which set the initial conditions for the clinical trial simulation.

With the unprecedented increase in omics data on specific cancer types from collaborative studies, such as the TCGA45, AURORA46, Human Tumor Atlas Network (HTAN)47, and iAtlas48, it is possible to use the immune cell proportions derived from omics data for virtual patient generation. In our recent QSP studies28,29, we selected virtual patients whose pre-treatment characteristics statistically matched the patient data using the Probability of Inclusion by Allen et al. 35. Figure 1 A compares the distribution of three pre-treatment immune subset ratios in the plausible and virtual patients from our NSCLC model28 with that in the real patient data on lung adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) from iAtlas database. Plausible patients (i.e., virtual patient candidates) refer to those randomly generated by calibrated parameter distributions, from which a virtual cohort was selected by the Probability of Inclusion. In short, the Probability of Inclusion (\({P}_{i}\)) of a plausible patient that corresponds to a model output \(\hat{y}\) was proportional to the ratio between the multivariate probability density function of the real patient data (\({p}_{d}\)) and that of the plausible patient cohort (\({p}_{{s}}\)) at \(\hat{y}\) (i.e., \({P}_{i}{\propto p}_{d}(\hat{y})/{p}_{{s}}\left(\hat{y}\right)\))35. When compared separately, the distributions of each immune subset ratio in the virtual patient cohort are statistically the same as those in the real patient data according to Kolmogorov-Smirnov test (Fig. 1a).

Fig. 1: Comparison of immune cell subset ratio distributions and correlations between the virtual patients and real patient data.
figure 1

Immune cell subset ratios were calculated by immune cell proportions from the iAtlas database. Natural-log transformation was performed. a The 25th, 50th, and 75th percentiles are encoded by box plots with whiskers that define 1.5 times the interquartile range away from the bottom or top of the box. b Correlations between each pair of immune subset ratios in the virtual population and the real patient data.

Figure 1b shows the correlations between each pair of the three ratios (i.e., CD8/CD4, CD8/Treg, and M1/M2 tumor-associated macrophages) in the virtual patients and the iAtlas data. When compared using Fisher’s transformation and Z-test, the correlation coefficient between CD8/CD4 and CD8/Treg ratios in the virtual patients is significantly different from that in the iAtlas data, while the correlations between the other two pairs are similar. We hypothesized that this difference is due to the outliers in the plausible patient cohort, particularly those with low CD8/Treg and CD8/CD4 ratios in Fig. 1b, which can be over-selected by the Probability of Inclusion due to their low probability density in the randomly sampled plausible patient cohort (\({p}_{s}\to 0\)) yet finite \({p}_{d}\).

Omics data also allowed us to investigate the behavior of metastatic lesions. Given the limited availability of immune cell abundance data for metastatic tumors directly from the literature or existing databases, we used RNA-seq data on breast cancer metastases generated by Siegel et al. 49. in our recent study of metastatic breast cancer30. Using immune cell deconvolution methods EPIC50 and quanTIseq51, we were able to estimate the immune cell proportions and calculate the ratios of immune cell subset proportions between primary breast tumors and specific metastatic sites (e.g., lung, liver, and bone). We then used the abundance estimates of T cells, dendritic cells, and macrophages from lung, liver, and bone metastases of breast tumors to inform virtual patient generation. More specifically, distributions of virtual patient parameters, such as the recruitment rates of immune cells to metastatic tumors, were calibrated such that the ratios of immune cell subset level between the metastatic lesions and the primary tumor in the model were consistent with estimates obtained from RNA-seq data using deconvolution methods.

Interplay between virtual patients and digital twins

Another concept related to virtual patients is digital twins, whose definition may differ across research fields. In drug development, virtual patients aim to predict effectiveness of treatments on the population level, such as newly discovered drugs and novel drug combinations, and inform future clinical trial design16. On the other hand, health or medical digital twins are often developed for personalized medicine, which emphasize the correspondence with their physical counterparts in the real world52,53,54. In practice, virtual patients can be generated from data compiled from multiple patients (e.g., summary statistics), which provides a realistic representation of a complex disease and allows clinical trial simulation to make population level predictions and assist drug developers55. Digital patients/twins are often generated by measurements from individual patients, assuming periodical updates with new measurements in the future and aiming to find the best option of patient care for clinicians56. Therefore, while digital twin is an individualized patient model, virtual patient could be a more generic patient model.

Wright and Davidson57 discussed optimal properties of a model and data required to build digital twins. The model should: (1) include sufficient mechanistic details so that parameters can be periodically updated with new measurement data; (2) make accurate predictions with new parameter values; and (3) be computationally efficient to make on-time predictions for its purpose. The authors also discussed the possibility of reducing the model to describe a local part of a complete system, or replacing a subpart of the model with a surrogate model (e.g., a data-driven model) to, for example, increase its computational efficiency57. When adopting these principles to build medical digital twins, An and Cockrell58 pointed out the challenges, including the lack of mechanistic understanding in biology, difficulties to acquire quantitative data to periodically update model parameters, and measurement uncertainties commonly observed in biological experiments. Meeting these challenges, Laubenbacher et al. proposed a 4-stage method specifically for generating digital twins of the human immune system59.

In the past few years, digital twins have gained attention in immuno-oncology research in parallel with virtual patient development15,60,61,62. Given the stricter requirements for digital twins, models are usually designed to accommodate the type of available data. For example, Jarrett et al. designed a 3-D mathematical model of tumor growth and response to treatments, which was governed by a partial differential equation, allowing for model calibration by routine MRI data from individual patients63. It was applied to generate digital twins for patients with triple-negative breast cancer to predict patient-specific response to a neoadjuvant chemotherapy60. The model-predicted post-treatment tumor size change from baseline and rate of pathological complete response were compared with the ARTEMIS trial for model validation by concordance correlation coefficients and the area under the receiver operator characteristic (auROC), respectively (pre-treatment and on-treatment MRI data used for digital twin calibration). The authors also observed different uncertainty levels in model predictions, and suggested that more frequent on-treatment measurements are warranted to periodically update digital twins, which is an important digital twin technique to increase model accuracy and lower uncertainty in model predictions60. Notably, the feedback loop between each patient’s new measurements and their digital twin model is a meticulous process and requires careful selection of methods for parameter inference, uncertainty quantification, etc64.

Although recent research on digital twins and virtual patients are mostly independent of each other potentially due to different study goals, it is important to note that they share similar challenges, and therefore one approach that tackles the challenges can be applicable to both fields59. Both virtual patients and digital twins are limited by the lack of sufficient patient data. Although summary statistics can provide guidance on generating physiologically plausible virtual patients, patient-specific data retain the information on correlations between variables (as demonstrated above), thus improving the model’s predictive power28,29. When time-course data (e.g., tumor size measurements over time) become available, virtual patients generated by pre-treatment data can be validated by subsequent measurements. Time-course data would also allow digital twin models to evolve with the patients and improve mechanistic understanding when comparing model prediction with clinical observation iteratively59,65. In addition, by treating a virtual patient population as digital twin candidates, digital twins can be derived for new patients by selecting the virtual patients whose model-predicted profiles match the time-course data (i.e., match virtual patients with their physical counterparts in the real world to become digital twins). In this way, model parameters of a digital twin no longer need to be directly derived from clinical measurements, and multiple digital twin candidates can be selected from a virtual patient population to quantify uncertainties in model predictions, such as in the analysis of non-Hodgkin lymphoma by Susilo et al. 66.

When digital twin models are built on data collected from a specific patient group, they can be later repurposed to generate a virtual patient population that shares similar traits, which can be used to make population level predictions, such as the effectiveness of newly discovered drugs targeting the same patient group. Tivay et al. suggested using compressed latent parameterization to generate realistic virtual patients from parameter values fitted to real patient data (i.e., digital twins)67. After constructing a latent parameter space from virtual patients fitted to patient-specific data (i.e., digital twins), new virtual patients can be generated from the latent space. In fact, we have applied this approach to account for the inter-patient heterogeneity in pharmacokinetic (PK) parameters in our NSCLC study28, which resulted in PK profiles in the virtual patient population that were comparable to that in real patients. Given the sparse public data that are available from oncology trials, methodology research on generating new realistic virtual patients from digital twins can provide an alternative approach to challenge.

Discussion

Generating physiologically plausible virtual patients is critical for making accurate prediction of the effectiveness of novel treatments using mechanistic models. We have discussed the current challenges faced by modelers and our approaches to these challenges. We emphasize the importance of public databases like iAtlas that contain correlated variables on the patient level (e.g., data corresponding to each patient). Further, since new methods created to generate virtual patients and medical digital twins are usually demonstrated in cardiovascular diseases and fields other than immuno-oncology68, this field can benefit from future research on the applicability and robustness of methods that have been proven useful in other therapeutic areas. Moreover, we raise the possibility of generating digital twins from virtual patients and vice versa, facilitating collaboration between researchers in these two fields.