Abstract
We developed a composite symptom score (CSS) representing disease-related symptom burden over time in patients with malignant pleural mesothelioma (MPM). Longitudinal data were collected from an open-label Phase IIB study in which 239 patients completed the validated MD Anderson Symptom Inventory for MPM (MDASI-MPM). A blinded, independent review committee of external patient-reported outcomes experts advised on MDASI-MPM symptoms to include in the CSS. Through iterative analyses of potential symptom-item combinations, 5 MPM symptoms (pain, fatigue, shortness of breath, muscle weakness, coughing) were selected. The CSS correlated strongly with the full MDASI-MPM symptom set (0.92–0.94) and the Lung Cancer Symptom Scale-Mesothelioma (0.79–0.87) at each co-administration of the scales. The CSS also had good sensitivity to worsening disease and global quality-of-life ratings. The MDASI-MPM CSS can be used as an outcome in MPM clinical trials, including in responder analyses and at the individual patient level. It is brief enough to administer frequently, including electronically, to better capture symptom trajectories during and after a trial and in clinical practice. As a single score, the CSS addresses multiplicity issues that can arise when several symptoms increase due to worsening disease. Our process can be adapted to produce a CSS for other advanced-cancer trials.
Introduction
In oncology, stabilizing or reducing disease symptoms and improving functioning is inarguably beneficial for patients, especially those with late-stage disease. Agents that control disease progression could be expected to have beneficial effects on the various symptoms characteristic of that advanced disease. However, because each type of cancer could have a different set of associated symptoms, the severity of which will vary by disease stage, it is vital to distinguish disease-related symptoms from treatment-related symptomatic toxicities to the extent possible1.
Clinical trial endpoints that capture treatment benefits provide important information for multiple stakeholders—not only clinicians, patients, and caregivers, but also regulators, health technology assessment groups, and payers. If significant reduction or stabilization of disease-related symptoms, improvement of disease-related functional impairment, or decrease in treatment-induced toxicities compared with standard care or a competitor agent can be shown for a new therapy, these data could support regulatory agency approval and be possible candidates in a labeling claim or the supporting literature. Information on disease-related symptoms, symptomatic toxicities, functional impairment, and other aspects of health-related quality of life (QOL) can best be obtained from patient-reported outcomes (PROs)1,2,3,4.
It is well established that including PROs to capture the patient’s experience before, during, and after treatment is a vital component of the formal regulatory approval process for new therapies. In June 2021, the US Food and Drug Administration (FDA) issued draft guidance on the collection of core PROs in cancer clinical trials4. This draft guidance identified disease symptoms, symptomatic adverse events, and physical function as core outcomes of interest for FDA product review and presented characteristics necessary for a PRO tool to be considered “fit-for-purpose” for regulatory decision-making—meaning that the tool’s level of validation is sufficient to support its context of use5.
From a regulatory perspective, the following considerations are relevant in developing a PRO instrument: It must represent the target treatment group, must be enriched by qualitative research in the targeted population, must have acceptable psychometric properties, and must be able to capture expected clinical change; results must be easily interpretable by patients and clinicians; and the value that indicates a meaningful change in the score must have been examined6. Indeed, recent labels granted by the FDA show that PRO measures can be used to support labeling claims, so long as they are backed by evidence1,7,8.
Common symptoms experienced by patients with advanced cancer include pain, fatigue, muscle weakness, and lack of appetite2; however, at any given time, a patient might report only mild pain but distressingly high levels of other disease-related symptoms, such as shortness of breath or fatigue. It is problematic to analyze symptoms individually, because rating symptoms separately diminishes the statistical power to detect differences between groups. This contributes to sponsors’ concerns about using change in symptom burden as an endpoint in oncology clinical trials.
One solution is to develop a composite symptom score (CSS), sometimes called a total symptom score, comprising the key symptoms relevant to a given cancer and stage. These symptoms would be assessed individually in a clinical trial, but the CSS would be analyzed as a single variable that represents the construct of disease-related symptom burden. As a subset of the most salient and frequent symptoms reported by the targeted patient population, a CSS offers several advantages: (1) Inflation of alpha and loss of power caused by multiple comparisons of symptoms as individual outcomes are avoided, which might encourage the use of disease-related symptomatic impact as an outcome rather than as an exploratory endpoint9; (2) A CSS includes only a few items, significantly reducing patient assessment burden (requiring only 1–5 min to complete) and facilitating frequent electronic administration outside of the clinic, such that the trajectory of symptom status during and after completion of a clinical trial can be better captured; (3) A CSS could be used to represent disease-related symptom burden in responder analyses, which are encouraged by the FDA as a way to document change in PROs used as clinical outcome assessments3; and (4) A CSS would avoid the dilution effect that could occur if all symptoms of an instrument were included, because some disease-related symptoms will not be as prevalent or severe as others. Should between-group comparisons produce significant differences in the symptom burden represented by a CSS, additional analyses can be performed on the individual symptoms that contributed to the CSS. Composite symptom scores have been used successfully as endpoints in clinical trials for regulatory submissions10,11 and as secondary endpoints to compare treatments in other clinical studies11.
The aim of this study was to develop and evaluate a CSS in tandem with the validation of a new PRO measure, the MD Anderson Symptom Inventory for Malignant Pleural Mesothelioma (MDASI-MPM)12, for use as a potential disease-related symptom burden endpoint in pivotal trials. We describe how blinded, pooled data from a randomized Phase IIB clinical trial13 were used to develop the CSS, examine how this CSS performed longitudinally at baseline, during the trial, and at follow-up, and propose a value indicating a meaningful change in CSS score.
Methods
This project is based on a collaboration between a pharmaceutical sponsor and an academic organization. The initial agreement included the development and validation of a fit-for-purpose PRO measure, the MDASI-MPM, that aligned with the sponsor’s development of an agent designed to treat malignant pleural mesothelioma (MPM), an aggressive, highly symptomatic, and rapidly fatal cancer of the lung pleura. Patients with MPM report multiple severe symptoms that impair function and reduce treatment tolerance. After the qualitative and quantitative activities were completed for the MDASI-MPM, additional work was undertaken to develop a single CSS comprising a limited subgroup of the most relevant disease-related symptoms endorsed by patients with MPM as being easily understood and relevant to their stage of disease. As such, the CSS could be used to represent change in symptom burden for this patient population.
MDASI-MPM development
The MDASI-MPM was developed from the core MD Anderson Symptom Inventory, a PRO instrument that measures 13 common cancer-related symptoms and 6 items describing symptom interference with functioning14. All symptom and interference items are rated on a 0–10 scale (from none to worst imaginable for symptom severity, and from none to completely interferes for interference items) with a recall period of the past 24 h. The 6 interference items can be divided into 2 subscales: a mood-related subscale comprising relations with others, enjoyment of life, and mood, and an activity-related subscale comprising ability to work, engage in daily activities, and walk. These 13 core symptoms and 6 interference items were derived from input from more than 500 patients with cancer of various types14 and also were debriefed and found relevant in the MPM patient sample15.
Disease-specific and treatment-specific symptom items can be added to the core MDASI’s 19 symptom severity and interference items to form new instruments specifically tailored to the disease or treatment of interest. A similar approach has been used with instruments such as the EORTC QLQ-C30 and the FACIT System16. Additional symptom items are identified through qualitative interviews (concept elicitation/cognitive interviews) of patients with the targeted disease, and the resulting instrument is validated in alignment with FDA guidance on the development of PRO instruments for use in regulatory evaluation and for labeling claims3. With this approach, the new instruments are offered as approximating the FDA’s intentions for the use of PROs as fit-for-purpose outcome measures for clinical assessment4.
The MDASI-MPM was developed with reference to the guidance described above. During the qualitative interviews, 6 additional MPM-specific symptom items (coughing, muscle weakness, feeling of malaise, chest heaviness or tightness, eye problems, and trouble with balance or falling) were identified; complete details of the qualitative phase of MDASI-MPM development are reported elsewhere15. The MDASI-MPM’s psychometric properties were evaluated in an open-label, randomized, multicenter Phase IIB trial in patients with MPM (described below)13; the instrument’s stability, reliability, and sensitivity and responsiveness to changes in disease status were found to be excellent12.
MDASI-MPM composite symptom score (CSS) development
The final component of the MDASI-MPM’s development was to generate the MPM-specific CSS15. The process for developing the CSS included (1) formation of an independent review committee (IRC) of experts in measure development and regulatory considerations; (2) development of the statistical analysis plan for the CSS; (3) development of a plan to blind and pool data from the Phase IIB trial to shield the PRO development team from treatment-specific information; (4) longitudinal examination of the performance of the candidate CSS items based on data from the Phase IIB trial; and (5) IRC review and approval of the final CSS, together with its recommendations for a meaningful change in CSS score.
The Phase IIB trial and subsequent development of the CSS were conducted under Bayer HealthCare AG Integrated Clinical Study Protocol No. BAY 94-9343/1574317, V1.0 dated August 17, 2015 (ClinicalTrials.gov Identifier #NCT02610140). This multisite, multinational study was approved by the institutional review boards or independent ethics committees of more than 60 participating sites (see Supplementary Table S1)13. Informed consent was obtained from all participants. This research was performed in accordance with the Declaration of Helsinki and relevant guidelines and regulations.
Independent review committee
In order to provide an independent and unbiased review of symptom items to be included in the CSS, the sponsor commissioned an IRC comprising PRO clinical trial and statistical experts (AB, ACD, JAS, TS). The IRC was convened by an outside contract research organization, Clinical Outcomes Solutions (www.clinoutsolutions.com), which coordinated the IRC’s virtual meetings and summarized recommendations. The IRC was chaired by Tara Symonds, Chief Science Officer of Clinical Outcomes Solutions.
CSS statistical analysis plan
A statistical analysis plan for developing the CSS (distinct from the statistical analysis plan in the Phase IIB study protocol13) was developed and revised with guidance from the IRC at its first meeting. In accordance with this plan, longitudinal symptom data from the MDASI-MPM psychometric validation study12 were summarized for the IRC. To determine which elements best captured the prevalence and severity of MPM, the most severe symptoms from that study at baseline and during treatment were described, both by mean severity and by the percentage of patients reporting a symptom as moderate or severe (represented by a score of ≥ 5 or ≥ 7 on that item, respectively). The identified symptoms then formed a pool of candidate items to be reviewed for inclusion in the CSS.
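The prevalence summary described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's analysis code: the function name `prevalence_by_cutpoint` and the sample ratings are hypothetical, and only the cutpoints (≥ 5 for moderate or worse, ≥ 7 for severe) come from the text.

```python
def prevalence_by_cutpoint(ratings, cutpoint):
    """Return the percentage of non-missing 0-10 ratings at or above `cutpoint`."""
    observed = [r for r in ratings if r is not None]
    if not observed:
        raise ValueError("no non-missing ratings to summarize")
    return 100.0 * sum(r >= cutpoint for r in observed) / len(observed)


# Hypothetical fatigue ratings for 11 patients (None marks a missing rating).
fatigue = [0, 2, 5, 7, 8, 3, None, 6, 9, 4, 1]

moderate_or_worse = prevalence_by_cutpoint(fatigue, 5)  # share rated >= 5
severe = prevalence_by_cutpoint(fatigue, 7)             # share rated >= 7
```

Missing ratings are dropped from the denominator here; that handling is an assumption of this sketch, not a rule stated in the analysis plan.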
The ≥ 5 cutpoint for individual symptoms came from studies showing that “pain at its worst” is related to greater interference with function when rated ≥ 5 by cancer patients9,18,19. The choice of ≥ 7 to delineate a severe symptom was based on work by Serlin et al.18 showing that a cutpoint of 7 separates moderate from severe pain and Mendoza et al.20 showing that a cutpoint of 7 optimally differentiates between moderate and severe fatigue. A rating of ≥ 7 has been used in routine clinical practice to define severe pain and fatigue21 and in a large multicenter cooperative study to describe symptom prevalence22. (Other methods for determining cutpoints have been described23).
As recommended by the IRC, data for potential candidate items were regressed on the MDASI-MPM interference items to identify which symptoms predicted the instrument’s total symptom interference score—an approach we have used previously to categorize pain as mild, moderate, or severe on the 0–10 scale18,23. A backward regression and a stepwise regression were conducted on the symptom items at baseline and during treatment. Effect sizes were calculated for each item by comparing ratings at baseline and the safety follow-up, to determine how the candidate items changed over time in the entire sample. After reviewing these initial analyses, the IRC considered whether to exclude certain items from further consideration for the CSS on the basis of their low severity over time or ambiguity of meaning.
Once the candidate CSS items were determined, we used Cronbach coefficient alphas to estimate the internal consistency reliability of the CSS at baseline, on Day 1 of treatment Cycle 3, and at the safety follow-up. We examined test–retest reliability from assessments made between Cycle 2 Day 1 and Cycle 3 Day 1 (when we expected minimal changes), treating intraclass correlations of 0.70 or higher as acceptable. To assess convergent validity, we correlated the CSS with the Lung Cancer Symptom Scale–Mesothelioma (LCSS-Meso), a valid and reliable QOL measure designed for patients with non-small cell lung cancer and modified for use in patients with MPM24,25,26,27. To evaluate the relationship between the CSS and the parent scale, we correlated the CSS with the full MDASI-MPM.
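For readers unfamiliar with the internal consistency statistic named above, the standard Cronbach alpha formula, k/(k − 1) × (1 − sum of item variances / variance of the total score), can be sketched as follows. The function name `cronbach_alpha`, the complete-case input layout, and the ratings are illustrative assumptions and do not reproduce the study's software or data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete-case data.

    `rows` is a list of per-patient lists, each holding one rating per item.
    """
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least 2 patients and 2 items")
    k = len(rows[0])
    item_columns = list(zip(*rows))                      # one tuple per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings: 4 patients x 3 items on the 0-10 scale.
ratings = [
    [2, 3, 2],
    [5, 5, 6],
    [7, 8, 7],
    [4, 3, 5],
]
alpha = cronbach_alpha(ratings)
```

The sketch uses sample variances (n − 1 denominator) for both the items and the total score, and it assumes the total-score variance is nonzero.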
To evaluate known-group validity, we used independent-sample t-tests with Eastern Cooperative Oncology Group performance status (ECOG PS), a physician-rated measure of functional ability28, as the grouping variable. The ECOG PS ratings range from 0 (fully active; able to carry on all pre-disease performance without restriction) to 4 (completely disabled; cannot perform self-care; totally confined to bed or chair)28. We calculated CSS effect sizes between patients with fully active (ECOG PS = 0) versus restricted active (ECOG PS = 1) performance status. To establish sensitivity, we assessed whether the CSS could detect worsening of symptoms among patients with deteriorating performance status (as a clinical estimate of worsening disease status).
Phase IIB trial
The MDASI-MPM validation data used to derive the candidate CSS items and to evaluate the psychometric properties of various combinations of these items came from a randomized, open-label, active-controlled, 2-arm, multicenter Phase IIB trial evaluating the safety and efficacy of anetumab ravtansine (BAY 94-9343) versus vinorelbine (2:1)13. The MDASI-MPM was included in the trial in accordance with recommendations from the FDA in its PRO guidance3. Of the 248 randomized participants in that trial, 239 had nonmissing MDASI-MPM data and were included in the CSS development study. In the final study report for this trial13, no statistically significant between-group differences in MDASI-MPM outcomes were found.
Other measures used in the Phase IIB trial include the LCSS-Meso and ECOG PS. We included the LCSS-Meso as a second measure of MPM-related QOL, using ratings on its single global QOL item as an anchor in the CSS validation. This item asks patients to rate their overall QOL over the past week on a 100-mm visual analog scale. A change of no more than ± 9 points on the LCSS-Meso QOL item has been reported as identifying no to minimal change (i.e., a no-change group)29. For this trial, ECOG PS was collected at each clinic visit and was used as a measure of disease severity.
The MDASI-MPM was completed at baseline, on Days 1 and 15 of each cycle up to 3 cycles, and on Day 1 of Cycles 4, 5, and 6. The LCSS-Meso was administered at baseline and on Day 1 of up to 6 treatment cycles; ECOG PS was assessed at baseline and at Day 1 of each cycle up to 6 cycles. Patients also completed the MDASI-MPM and LCSS-Meso at a safety follow-up, when most patients had disease progression, allowing for additional sensitivity estimates.
Blinding and pooling of data
All analyses used for developing the CSS were conducted using blinded, pooled trial data. Although the Phase IIB study was open-label and the academic organization was a study site, treatment-related data remained with the sponsor’s trial team and were kept from the IRC and the academic organization’s PRO experts and statistical team. Data from the Phase IIB study were stripped of treatment-identifying information immediately after database lock for the study’s primary analysis, and only this blinded dataset was sent to the statistical team developing the CSS.
Results
First IRC meeting
Selection of the initial candidate symptom items for the CSS was based on the psychometric analysis of the full dataset in the MDASI-MPM validation study12. The goal was to identify which MDASI-MPM symptom items best represented prevalence and severity (i.e., the percentage of patients reporting a symptom as moderate or severe, represented by a score of ≥ 5 or ≥ 7 on that item, respectively) before and during the trial. See Supplementary Tables S2 and S3 for select timepoints and Supplementary Figs. S1 and S2 for all symptoms. Backward and stepwise regression methods were used to regress the candidate items on the 2 MDASI-MPM interference subscales (mood-related and activity-related) to identify which symptoms predicted the instrument’s total MDASI symptom interference score at baseline and during treatment. Moderate to large standardized regression coefficient values (range 0.16–0.47) guided the selection of candidate symptom items for the composite symptom score. Effect sizes were calculated for each item by comparing ratings at baseline and at the safety follow-up, to determine how the candidate items changed over time for the entire sample. With the exception of coughing, which remained stable, effect sizes ranged from 0.18 to 0.60.
On the basis of the initial data review and discussion, the IRC excluded several MDASI-MPM items from further consideration for the CSS, including nausea, vomiting, sadness, distress, numbness, eye problems, difficulty remembering, and lack of appetite, as their severity over time in the Phase IIB trial was low. The remaining 8 symptoms—the MDASI core items pain, fatigue, and shortness of breath and the MPM-specific items malaise, muscle weakness, coughing, trouble with balance, and chest heaviness or tightness—were carried forward by the IRC as potential CSS candidate items.
Second IRC meeting
At its second meeting, the IRC discussed various CSS solutions (different combinations of the 8 symptom items identified at the first IRC meeting) and the combination of items that it would want to see tested for the finalization meeting. In evaluating the candidate CSS items, the IRC and the academic organization’s investigators were blinded to the treatment groups.
Although there was agreement on most of the 8 items chosen from the data-driven approach, the IRC recommended that malaise be removed because it might be ambiguous for some patients to interpret. Trouble with balance and chest heaviness or tightness were removed because of low prevalence. The removal of MDASI-MPM items was therefore based on both qualitative (patient interview) and quantitative (psychometric) data and the clinical experience of IRC members.
Third IRC meeting
During the third meeting, the IRC recommended that the final CSS include pain, fatigue, shortness of breath, muscle weakness, and coughing, on the basis of the analyses comparing all candidate CSS items. The IRC also discussed what might constitute a clinically meaningful change in the CSS, as a starting point for refining evolving methods of interpreting what is meaningful to patients. Converging data from distribution-based and anchor-based analyses indicated that a 1-point change in the CSS was minimally important. The IRC final report suggested considering a more conservative 2-point change for regulatory use, and this 2-point change was used in the analysis of the Phase IIB study.
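As an illustration of how a score could be computed from the five selected items, the sketch below takes the arithmetic mean of the item ratings. Averaging is an assumption here: the text reports the CSS on the same 0–10 scale as the items, which is consistent with a mean, but it does not state the scoring rule explicitly. The dictionary keys and the function name `composite_symptom_score` are hypothetical.

```python
# Item keys are hypothetical labels for the five symptoms named in the text.
CSS_ITEMS = ("pain", "fatigue", "shortness_of_breath", "muscle_weakness", "coughing")


def composite_symptom_score(ratings):
    """Mean of the five CSS item ratings (each 0-10); assumes complete data."""
    values = []
    for item in CSS_ITEMS:
        value = ratings[item]  # raises KeyError if an item is absent
        if not 0 <= value <= 10:
            raise ValueError(f"{item} rating {value} is outside the 0-10 scale")
        values.append(value)
    return sum(values) / len(values)


# One hypothetical assessment; items outside the CSS (here, nausea) are ignored.
assessment = {
    "pain": 4,
    "fatigue": 6,
    "shortness_of_breath": 5,
    "muscle_weakness": 3,
    "coughing": 2,
    "nausea": 1,
}
score = composite_symptom_score(assessment)
```

How partially completed assessments should be scored is not addressed in the text, so the sketch simply requires all five items.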
Psychometric properties of the final CSS
Reliability
Good internal consistency was seen for the chosen CSS solution: Cronbach alphas were 0.82 (95% CI 0.78–0.86) at baseline, 0.84 (95% CI 0.79–0.87) at Cycle 3 Day 1, and 0.80 (95% CI 0.73–0.86) at the safety follow-up. Test–retest reliability was assessed on a subset of patients by selecting timepoints when patients were clinically stable (Day 1 of Cycles 2 and 3). In addition, patients were required to have had minimal change in the QOL item of the LCSS-Meso at those timepoints (i.e., no more than a ± 9-point change on the LCSS-Meso QOL item between Cycles 2 and 3). Results (n = 82) showed that the intraclass correlation (0.84, 95% CI 0.78–0.88) was in line with psychometric recommendations for the proposed CSS4. See Supplementary Table S4 for item-item correlations.
Validity
The selected CSS items correlated with the LCSS-Meso (range 0.79–0.87), suggesting good concurrent validity (Table 1). As expected, the selected CSS also correlated with the full MDASI-MPM symptom items (range 0.92–0.94). The assessment of known-group validity was based on comparisons of fully active (ECOG PS = 0) and restricted active (ECOG PS = 1) groups. Significant differences in the CSS between the 2 groups were seen up to Cycle 3 Day 1 (all P < 0.05, 3 assessment timepoints), supporting the known-group validity of the CSS (Table 2).
Exploratory factor analysis using principal axis factoring showed a consistent single-factor solution explaining most of the common variance (range 55–61%) across all trial assessment points (Supplementary Table S5). This result suggests that the CSS items measure a common underlying construct of disease-related symptom burden.
Sensitivity
Sensitivity to symptom change with disease progression was assessed by using Cohen’s d to calculate the effect size of change in mean CSS between baseline and the safety follow-up (n = 50). At baseline, the mean CSS was 3.3 ± 2.1 on the 0–10 scale; by the safety follow-up, the mean CSS had worsened to 4.5 ± 2.3, for a mean difference of −1.2 points (95% CI −1.90 to −0.54; P = 0.001) and an effect size of 0.57 (Fig. 1).
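The reported effect size can be checked from the summary statistics alone. The sketch below divides the mean change by the baseline standard deviation, which reproduces the reported value (1.2 / 2.1 ≈ 0.57). The text does not state which standardizer was used, so the choice of the baseline SD is an assumption; a pooled SD would give a slightly smaller value. The function name is hypothetical.

```python
def standardized_effect_size(mean_baseline, mean_followup, sd_reference):
    """Mean change divided by a reference standard deviation."""
    if sd_reference <= 0:
        raise ValueError("reference SD must be positive")
    return (mean_followup - mean_baseline) / sd_reference


# Summary statistics quoted in the text: baseline 3.3 (SD 2.1), follow-up 4.5.
effect_size = standardized_effect_size(3.3, 4.5, 2.1)
print(round(effect_size, 2))  # prints 0.57
```

The sign is positive here because the change is taken as follow-up minus baseline, so a positive value indicates worsening.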
Proposed meaningful change in CSS
To estimate the meaningful change in the CSS for symptom worsening, we used ECOG PS as the anchor and stratified participants into 2 groups on the basis of change in ECOG PS between baseline and the safety follow-up. Because ECOG PS improved in only 3 patients, we combined the improved group and the no-change group (n = 42), and compared this combined group with the declining ECOG PS group (n = 50). Both the anchor-based and distribution-based results (Table 3) suggested that a 1-point increase in the CSS indicates meaningful worsening of symptoms.
For estimating meaningful change in the CSS for symptom improvement, we used the LCSS-Meso QOL item as the anchor and a no more than ± 9-point change to indicate a no-change group. We stratified participants into 3 groups, this time on the basis of change in the LCSS-Meso QOL item between baseline and Day 1 of Cycle 2: improved (n = 61), no change (n = 70), and declining (n = 52). Both the anchor-based and distribution-based results (Table 3) suggested that a 1-point change in the CSS indicates meaningful improvement in symptoms.
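Table 3 is not reproduced here, and the text does not say which distribution-based methods were applied. As a hedged illustration, two commonly used heuristics (half a baseline standard deviation, and the standard error of measurement, SD × √(1 − reliability)) can be computed from figures quoted elsewhere in this article: the baseline SD of 2.1 and the test–retest intraclass correlation of 0.84. Both function names are hypothetical, and this is not presented as the authors' calculation.

```python
import math


def half_sd_threshold(sd_baseline):
    """Half of the baseline standard deviation."""
    return 0.5 * sd_baseline


def standard_error_of_measurement(sd_baseline, reliability):
    """SEM = SD * sqrt(1 - reliability), with reliability in [0, 1]."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd_baseline * math.sqrt(1.0 - reliability)


# Inputs quoted in the article: baseline SD 2.1, test-retest ICC 0.84.
half_sd = half_sd_threshold(2.1)
sem = standard_error_of_measurement(2.1, 0.84)
```

Under these assumptions the two heuristics give roughly 1.05 and 0.84 points, both in the neighborhood of the 1-point change discussed in the text.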
With a 1-point change in the CSS as indicative of meaningful change over time compared with baseline, approximately 47% of patients showed worsening symptoms as measured by the CSS, and approximately 21% showed improvement (Fig. 2).
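A patient-level classification of this kind can be sketched as follows, using the 1-point change from the text as the threshold. Treating a change of exactly 1 point as meaningful, the function name `classify_change`, and the example scores are assumptions of this sketch; higher CSS values indicate worse symptoms.

```python
from collections import Counter


def classify_change(baseline_css, followup_css, threshold=1.0):
    """Label a patient by change in CSS relative to baseline."""
    change = followup_css - baseline_css
    if change >= threshold:
        return "worsened"   # symptom burden rose by at least the threshold
    if change <= -threshold:
        return "improved"   # symptom burden fell by at least the threshold
    return "stable"


# Hypothetical (baseline, follow-up) CSS pairs for 5 patients.
pairs = [(3.0, 4.4), (5.2, 4.0), (2.0, 2.6), (4.0, 5.0), (6.0, 6.4)]
labels = [classify_change(baseline, followup) for baseline, followup in pairs]
counts = Counter(labels)
percent_worsened = 100.0 * counts["worsened"] / len(pairs)
```

Passing `threshold=2.0` would apply the more conservative 2-point change that the IRC suggested considering for regulatory use.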
Discussion
This report describes a method for developing and evaluating a composite symptom score (CSS) to represent disease-related symptom burden as a single variable in clinical trials targeting malignant pleural mesothelioma. The CSS was derived from items of the MDASI-MPM, which was developed and validated in a Phase IIB clinical trial as a component of an ongoing collaborative program12,13. The CSS is a subset of MDASI-MPM symptom items that summarizes change in disease-related symptom burden during and after a trial. Disease-related symptoms are those that are often present before treatment begins, as described for head and neck cancer30 and lung cancer31. The CSS takes 1–5 min to complete. Its limited number of symptoms and simple format make it easy to administer electronically from outside of the clinic (for example, as an ePRO on a smartphone). Because the CSS can track change at the individual patient level, it can be used in a responder analysis, providing an additional patient-level perspective on a treatment’s impact on trial participants. The substantial overlap between the CSS and the full set of symptom items in this study supports less-frequent administration of the full scale, possibly only at major milestones, such as for staging, imaging, or protocol-specified clinic visits.
The method used to develop the CSS included input from an external independent review committee with expertise in the clinical application of PROs in clinical trials and regulatory experience. On the basis of sequential analyses of blinded, pooled longitudinal data from the Phase IIB trial, the IRC suggested methods for selecting CSS items from among the symptoms rated by patients as most severe at baseline, during the trial, and at the safety follow-up, when most patients were experiencing disease progression. The recommended final CSS included pain, fatigue (tiredness), shortness of breath, muscle weakness, and coughing. Muscle weakness and coughing emerged from the qualitative interviews of MPM patients15; the remaining CSS items were core MDASI symptoms14.
The psychometric properties of the CSS were examined using pooled, blinded data from the Phase IIB trial. The CSS was found to have good internal consistency and test–retest reliability. Principal component analysis conducted at 7 assessment timepoints strongly supported the unidimensional construct of symptom burden for the CSS. As expected, the CSS was also highly correlated (0.92–0.94) with the full MDASI-MPM symptom set from which it was derived and with the LCSS-Meso (0.79–0.87). The CSS was sensitive in detecting change—in this case, worsening symptom burden between baseline and the safety follow-up due to disease progression.
To the extent possible, regulatory agencies wish to evaluate disease-related symptom change as distinct from symptomatic change due to treatment toxicities1. Accordingly, the FDA recommends that, in PRO data submitted for review, disease-related symptoms should be separated from the toxicities of therapies (symptomatic adverse events)1. This is an active area of discussion in oncology trial design, and some symptoms can be both disease-related and treatment-related. Tracking change in symptom severity and daily functioning from pretreatment baseline to end of trial has been suggested as a way to approach this issue.
In its June 2021 draft guidance4, the FDA recommended that a core set of patient-reported symptoms representing disease and functional impacts as well as symptomatic toxicities be included in cancer clinical trials. They also proposed that disease-related symptoms can be measured individually or within a symptom score with other disease-related symptoms relevant to the cancer under study, and that a multisymptom measure that asks patients to rate each symptom at its worst during a specified time interval could be used.
The CSS is being proposed as an efficacy outcome representing disease-related symptom burden; it is not intended as a measure of symptomatic treatment toxicities. This is in contrast to the National Cancer Institute's Patient-Reported Outcomes Version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) measurement system, whose intended use is to describe the overall safety and tolerability of a therapeutic compound, regimen, or device6. The PRO-CTCAE was created primarily to assess symptomatic treatment toxicities, not to evaluate the potential modification of disease-related symptoms characteristic of a specific patient sample32.
The use of a composite score as a co-primary or secondary endpoint is not without precedent. The FDA approved the inclusion of a composite symptom score in labeling claims for ruxolitinib for the treatment of myelofibrosis10. This process employed a composite “total symptom score” comprising 6 symptom items mutually agreed upon by the drug sponsor and the FDA as characterizing myelofibrosis11,33. The label presents the percentage of patients who reported a 50% reduction in this total symptom score11,34. The Average Symptom Burden Index (ASBI) from the LCSS-Meso has been used to measure symptom burden as a construct in non-small cell lung cancer trials and to demonstrate treatment equivalence or, in some cases, superiority in symptom control35,36. The ASBI averages the ratings of 5 symptoms (pain, fatigue, shortness of breath, coughing, and lack of appetite). The LCSS-Meso and the CSS differ in recall periods (24 h for the CSS, 1 week for the LCSS-Meso) and response options (numeric for the CSS, visual analog for the LCSS-Meso).
The derivation of the CSS also parallels the development of the FDA-qualified Non-Small Cell Lung Cancer Symptom Assessment Questionnaire (NSCLC-SAQ)37,38, a collaborative effort by a consortium of pharmaceutical company outcomes experts, academic investigators, and regulatory agencies in accordance with FDA guidance on clinical outcome assessment qualification39. Because we had access to longitudinal Phase IIB trial data, we were able to examine change in the CSS over time.
Of note, the sponsor interacted with our group to develop the trial, including seeking input from investigators who had experience using the target drug in Phase I/II studies to learn how these site investigators expected the drug to affect disease-related symptom burden. Moreover, the sponsor and the academic team presented plans for PRO instrument acceptance as a clinical outcome assessment in meetings with the FDA, per the FDA’s suggestion of early review of instruments and assessment schedules during instrument development.
A major limitation of this study is the rapid progression of disease in our sample, restricting the determination of symptom status associated with disease improvement. The data used to develop the CSS were derived primarily from patients with stage III or IV epithelioid MPM. Using the CSS in trials with earlier-stage disease would require recalibration of the symptom items for patients with less disease burden. Of note, the 2 agents investigated in the Phase IIB trial (anetumab ravtansine versus vinorelbine) exhibited limited clinical activity. A second limitation was the lack of blinding for both patients and trial site investigators. Nonetheless, the PRO academic investigators and the IRC were blinded as to treatment allocation. Finally, MPM is a cancer that has been a focal point of litigation and substantial financial settlements to patients, possibly creating a bias toward inflating symptom burden reporting. In such instances, PRO assessments, including the CSS, may not be used as designed. Because litigation status was missing from the current dataset, we were unable to investigate this possibility.
Conclusion
We developed a CSS for the MDASI-MPM that was derived from both qualitative and quantitative data and refined on the basis of clinical opinion and psychometric analysis. Our procedures were informed by FDA guidance on the development of fit-for-purpose, PRO-based assessment instruments3,4. Convening a blinded independent review committee (the IRC) was an important feature in the development of the CSS, and having access to longitudinal Phase IIB trial data was invaluable for evaluating the psychometric properties of the composite score. This approach can be used to develop multi-item composite scores for other cancers (especially rapidly progressive disease) and treatments.
Various options are emerging for the treatment of relapsed epithelioid MPM—a previously unmet medical need. Using a brief, highly targeted assessment like the CSS may help to differentiate among novel treatments and to identify those that not only show efficacy but also reduce MPM-related symptom burden more effectively, thus leading to better treatment options and better quality of life for patients with MPM.
Data availability
Data are available on request from the corresponding author. Data are not publicly available due to privacy restrictions.
References
Kluetz, P. G. et al. Focusing on core patient-reported outcomes in cancer clinical trials: Symptomatic adverse events, physical function, and disease-related symptoms. Clin. Cancer. Res. 22, 1553–1558. https://doi.org/10.1158/1078-0432.CCR-15-2035 (2016).
Cleeland, C. S. et al. The symptom burden of cancer: Evidence for a core set of cancer-related and treatment-related symptoms from the Eastern Cooperative Oncology Group Symptom Outcomes and Practice Patterns study. Cancer. 119, 4333–4340. https://doi.org/10.1002/cncr.28376 (2013).
US Food and Drug Administration. Guidance for industry. Patient-reported outcome measures: use in medical product development to support labeling claims (2009). https://www.fda.gov/media/77832/download.
US Food and Drug Administration. Core patient-reported outcomes in cancer clinical trials. Guidance for industry (draft). (2021). https://www.fda.gov/media/149994/download.
US Food and Drug Administration. Patient-focused drug development: Collecting comprehensive and representative input. Guidance for industry, Food and Drug Administration staff, and other stakeholders (2018). https://www.fda.gov/media/139088/download.
Kluetz, P. G., Chingos, D. T., Basch, E. M. & Mitchell, S. A. Patient-reported outcomes in cancer clinical trials: Measuring symptomatic adverse events with the National Cancer Institute’s Patient-Reported Outcomes Version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE). Am. Soc. Clin. Oncol. Educ. Book. 35, 67–73. https://doi.org/10.1200/EDBK_159514 (2016).
Gnanasakthy, A., Mordin, M., Evans, E., Doward, L. & DeMuro, C. A review of patient-reported outcome labeling in the United States (2011–2015). Value. Health. 20, 420–429. https://doi.org/10.1016/j.jval.2016.10.006 (2017).
Gnanasakthy, A., Lewis, S., Clark, M., Mordin, M. & DeMuro, C. Potential of patient-reported outcomes as nonprimary endpoints in clinical trials. Health. Qual. Life. Outcomes. 11, 83. https://doi.org/10.1186/1477-7525-11-83 (2013).
Cleeland, C. S. et al. ASCPRO Multisymptom Task Force. Recommendations for including multiple symptoms as endpoints in cancer clinical trials: A report from the ASCPRO (Assessing the Symptoms of Cancer Using Patient-Reported Outcomes) Multisymptom Task Force. Cancer. 119, 411–420. https://doi.org/10.1002/cncr.27744 (2013).
Deisseroth, A. et al. U.S. Food and Drug Administration approval: Ruxolitinib for the treatment of patients with intermediate and high-risk myelofibrosis. Clin. Cancer. Res. 18, 3212–3217. https://doi.org/10.1158/1078-0432.CCR-12-0653 (2012).
Emanuel, R. M. et al. Myeloproliferative neoplasm (MPN) symptom assessment form total symptom score: Prospective international assessment of an abbreviated symptom burden scoring system among patients with MPNs. J. Clin. Oncol. 30, 4098–4103. https://doi.org/10.1200/JCO.2012.42.3863 (2012).
Mendoza, T. R. et al. Evaluation of the psychometric properties and minimally important difference of the MD Anderson Symptom Inventory for malignant pleural mesothelioma (MDASI-MPM). J. Patient. Rep. Outcomes. 3, 34. https://doi.org/10.1186/s41687-019-0122-5 (2019).
Kindler, H. L. et al. Anetumab ravtansine versus vinorelbine in patients with relapsed, mesothelin-positive malignant pleural mesothelioma (ARCS-M): A randomised, open-label phase 2 trial. Lancet. Oncol. 23, 540–552. https://doi.org/10.1016/S1470-2045(22)00061-4 (2022).
Cleeland, C. S. et al. Assessing symptom distress in cancer patients: The M.D. Anderson Symptom Inventory. Cancer. 89, 1634–1646. https://doi.org/10.1002/1097-0142(20001001)89:7%3c1634::aid-cncr29%3e3.0.co;2-v (2000).
Williams, L. A. et al. Modification of existing patient-reported outcome measures: Qualitative development of the MD Anderson Symptom Inventory for malignant pleural mesothelioma (MDASI-MPM). Qual. Life. Res. 27, 3229–3241. https://doi.org/10.1007/s11136-018-1982-5 (2018).
Piccinin, C. et al. Recommendations on the use of item libraries for patient-reported outcome measurement in oncology trials: Findings from an international, multidisciplinary working group. Lancet. Oncol. 24, e86–e95. https://doi.org/10.1016/S1470-2045(22)00654-4 (2023).
Bayer HealthCare Pharmaceuticals Inc. Integrated Clinical Study Protocol No. BAY 94-9343/15743: A randomized, open-label, active-controlled, Phase II study of intravenous anetumab ravtansine (BAY 94-9343) or vinorelbine in patients with advanced or metastatic malignant pleural mesothelioma overexpressing mesothelin and progressed on first line platinum/pemetrexed-based chemotherapy (2018). https://storage.googleapis.com/ctgl-large-docs/40/NCT02610140/Prot_000.pdf.
Serlin, R. C., Mendoza, T. R., Nakamura, Y., Edwards, K. R. & Cleeland, C. S. When is cancer pain mild, moderate or severe? Grading pain severity by its interference with function. Pain. 61, 277–284. https://doi.org/10.1016/0304-3959(94)00178-h (1995).
Anderson, K. O. Role of cutpoints: Why grade pain intensity? Pain. 113, 5–6. https://doi.org/10.1016/j.pain.2004.10.024 (2005).
Mendoza, T. R. et al. The rapid assessment of fatigue severity in cancer patients: Use of the Brief Fatigue Inventory. Cancer. 85, 1186–1196. https://doi.org/10.1002/(sici)1097-0142(19990301)85:5%3c1186::aid-cncr24%3e3.0.co;2-n (1999).
Hubbard, J. M., Grothey, A. F., McWilliams, R. R., Buckner, J. C. & Sloan, J. A. Physician perspective on incorporation of oncology patient quality-of-life, fatigue, and pain assessment into clinical practice. J. Oncol. Pract. 10, 248–253. https://doi.org/10.1200/JOP.2013.001276 (2014).
Cleeland, C. S. et al. Pain outcomes in patients with advanced breast cancer and bone metastases: Results from a randomized, double-blind study of denosumab and zoledronic acid. Cancer. 119, 832–838. https://doi.org/10.1002/cncr.27789 (2013).
Shi, Q., Mendoza, T. R. & Cleeland, C. S. Interpreting patient-reported outcome scores for clinical research and practice: Definition, determination, and application of cutpoints. Med. Care. 57, S8–S12. https://doi.org/10.1097/MLR.0000000000001062 (2019).
Hollen, P. J. et al. Measurement of quality of life in patients with lung cancer in multicenter trials of new therapies. Psychometric assessment of the Lung Cancer Symptom Scale. Cancer. 73, 2087–2098. https://doi.org/10.1002/1097-0142(19940415)73:8%3c2087::aid-cncr2820730813%3e3.0.co;2-x (1994).
Hollen, P. J., Gralla, R. J., Liepa, A. M., Symanowski, J. T. & Rusthoven, J. J. Measuring quality of life in patients with pleural mesothelioma using a modified version of the Lung Cancer Symptom Scale (LCSS): Psychometric properties of the LCSS-Meso. Support. Care. Cancer. 14, 11–21. https://doi.org/10.1007/s00520-005-0837-0 (2006).
Hollen, P. J., Gralla, R. J., Liepa, A. M., Symanowski, J. T. & Rusthoven, J. J. Adapting the Lung Cancer Symptom Scale (LCSS) to mesothelioma: Using the LCSS-Meso conceptual model for validation. Cancer. 101, 587–595. https://doi.org/10.1002/cncr.20315 (2004).
Liepa, A. M., Hollen, P. J., Gralla, R. J. & Rusthoven, J. J. Reliability and validity of modified Lung Cancer Symptom Scale (LCSS) in multinational sample of patients with pleural mesothelioma [abstract]. International Society for Quality of Life Research (ISOQOL) 8th Annual Conference, Amsterdam, The Netherlands, Nov 7–10, 2001. Qual. Life. Res. 10, 280. https://doi.org/10.1023/A:1016836728226 (2001).
Oken, M. M. et al. Toxicity and response criteria of the Eastern Cooperative Oncology Group. Am. J. Clin. Oncol. 5, 649–655 (1982).
Hollen, P. J., Gralla, R. J., Kris, M. G., Eberly, S. W. & Cox, C. Normative data and trends in quality of life from the Lung Cancer Symptom Scale (LCSS). Support. Care. Cancer. 7, 140–148. https://doi.org/10.1007/s005200050244 (1999).
Hanna, E. Y. et al. The symptom burden of treatment-naive patients with head and neck cancer. Cancer. 121, 766–773. https://doi.org/10.1002/cncr.29097 (2015).
Mendoza, T. R. et al. Assessment of baseline symptom burden in treatment-naive patients with lung cancer: An observational study. Support. Care. Cancer. 27, 3439–3447. https://doi.org/10.1007/s00520-018-4632-0 (2019).
Dueck, A. C. et al. National Cancer Institute PRO-CTCAE Study Group. Validity and reliability of the US National Cancer Institute’s patient-reported outcomes version of the common terminology criteria for adverse events (PRO-CTCAE). JAMA Oncol. 1, 1051–1059. https://doi.org/10.1001/jamaoncol.2015.2639 (2015).
Gwaltney, C. et al. Development of a harmonized patient-reported outcome questionnaire to assess myelofibrosis symptoms in clinical trials. Leuk. Res. 59, 26–31. https://doi.org/10.1016/j.leukres.2017.05.012 (2017).
Incyte Corporation. Jakafi (ruxolitinib) tablets—prescribing information (2023). https://www.jakafi.com/pdf/prescribing-information.pdf.
Reck, M. et al. Evaluation of health-related quality of life and symptoms in patients with advanced non-squamous non-small cell lung cancer treated with nivolumab or docetaxel in CheckMate 057. Eur. J. Cancer. 102, 23–30. https://doi.org/10.1016/j.ejca.2018.05.005 (2018).
Reck, M. et al. Impact of nivolumab versus docetaxel on health-related quality of life and symptoms in patients with advanced squamous non-small cell lung cancer: Results from the CheckMate 017 study. J. Thorac. Oncol. 13, 194–204. https://doi.org/10.1016/j.jtho.2017.10.029 (2018).
Bushnell, D. M. et al. Non-Small Cell Lung Cancer Symptom Assessment Questionnaire: Psychometric performance and regulatory qualification of a novel patient-reported symptom measure. Curr. Ther. Res. Clin. Exp. 95, 100642. https://doi.org/10.1016/j.curtheres.2021.100642 (2021).
McCarrier, K. P. et al. Patient-Reported Outcome Consortium & Non-Small Cell Lung Cancer Working Group. Qualitative development and content validity of the Non-small Cell Lung Cancer Symptom Assessment Questionnaire (NSCLC-SAQ), a patient-reported outcome instrument. Clin. Ther. 38, 794–810. https://doi.org/10.1016/j.clinthera.2016.03.012 (2016).
US Food and Drug Administration. Qualification process for drug development tools. Guidance for industry and FDA staff (2020). https://www.fda.gov/media/133511/download.
Acknowledgements
Jeanie F. Woodruff, BS, ELS contributed to the editing of the manuscript.
Funding
Open access funding provided by the National Institutes of Health. This work was supported by a collaborative research agreement between Bayer HealthCare and The University of Texas MD Anderson Cancer Center. Members of Bayer HealthCare were involved in the design and performance of the study and made suggestions in their capacity as authors, but publication of study results was not contingent on Bayer HealthCare’s approval or censorship of the manuscript.
Author information
Contributions
C.C.: Conceptualization, formal analysis, funding acquisition, investigation, methodology, validation, visualization, writing—original draft, writing—review & editing. K.K.: Conceptualization, funding acquisition, investigation, methodology, visualization, writing—original draft, writing—review & editing. B.C.: Conceptualization, funding acquisition, investigation, methodology, visualization, writing—review & editing. C.E.: Data curation, formal analysis, funding acquisition, investigation, methodology, project administration, writing—review & editing. J.S.: Conceptualization, data curation, formal analysis, investigation, methodology, validation, visualization, writing—original draft, writing—review & editing. C.G.: Conceptualization, data curation, methodology, writing—original draft, writing—review & editing. T.S.: Conceptualization, methodology, project administration, supervision, writing—original draft, writing—review & editing. J.S.: Conceptualization, formal analysis, methodology, writing—original draft, writing—review & editing. A.D.: Conceptualization, formal analysis, methodology, validation, visualization, writing—original draft. A.B.: Conceptualization, methodology, validation, writing—review & editing. X.S.W.: Conceptualization, formal analysis, methodology, writing—original draft, writing—review & editing. L.W.: Conceptualization, methodology, validation, writing—original draft, writing—review & editing. T.M.: Conceptualization, data curation, formal analysis, investigation, methodology, project administration, validation, visualization, writing—original draft, writing—review & editing.
Ethics declarations
Competing interests
The MD Anderson Symptom Inventory (MDASI) and its derivative versions are copyrighted and licensed by The University of Texas MD Anderson Cancer Center and Charles S. Cleeland. Charles S. Cleeland and Xin Shelley Wang have a financial interest in the MDASI and its derivative versions. Karen N. Keating, Brian Cuffel, Cem Elbi, Jonathan M. Siegel, and Christoph Gerlinger are employees of Bayer and hold stock in Bayer. Charles Cleeland, Xin Shelley Wang, Tito Mendoza, and Loretta Williams have received research funding from Bayer. The authors declare that they have no other competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Cleeland, C.S., Keating, K.N., Cuffel, B. et al. Developing a fit-for-purpose composite symptom score as a symptom burden endpoint for clinical trials in patients with malignant pleural mesothelioma. Sci Rep 14, 14839 (2024). https://doi.org/10.1038/s41598-024-62307-5
DOI: https://doi.org/10.1038/s41598-024-62307-5