Introduction

Diffuse large B-cell lymphoma (DLBCL) is the most common type of non-Hodgkin lymphoma (NHL) in Western countries, accounting for about one-third of new cases [1,2,3,4]. It predominantly affects the elderly, with a median age at diagnosis of 70 years [5], and is notable for its morphological, molecular, and clinical heterogeneity [6]. This heterogeneity significantly influences survival outcomes, which vary according to several well-established prognostic indices [7,8,9,10,11].

The introduction of the R-CHOP regimen (i.e., rituximab combined with cyclophosphamide, doxorubicin, vincristine, and prednisone) in the early 2000s has led to a substantial increase in relative survival, irrespective of age and disease stage [5, 12,13,14,15,16]. Currently, R-CHOP is the preferred first-line treatment approach for most patients with DLBCL [17]. However, other treatment approaches might outperform R-CHOP for patients with MYC rearrangement, double- or triple-hit lymphoma, an International Prognostic Index (IPI) of 3 to 5, and those with the activated B-cell subtype [18,19,20]. In some countries, but not in the Netherlands, a modified regimen of R-CHOP (pola-R-CHP), in which vincristine is replaced with polatuzumab vedotin, is used in patients with intermediate-risk or high-risk DLBCL [20].

During the last two decades, adjustments to this treatment regimen have been made, including reduction of the dose intensity (i.e., from every two weeks to every three weeks) [21,22,23,24,25] and of the number of cycles (e.g., from eight to six cycles of R-CHOP) [21, 23, 26, 27], as well as the introduction of dose-reduced R-CHOP (i.e., R-miniCHOP) for elderly, often frail patients [28]. These adjustments were based on data from clinical trials and population-based studies. As a result, six cycles of R-CHOP administered every 21 days (6x R-CHOP21) are considered the standard of care for most patients with advanced-stage DLBCL in the Netherlands [23, 27, 29,30,31,32].

It remains unclear whether two additional cycles of rituximab (2 R) should be applied after 6x R-CHOP21 (6x R-CHOP21 + 2 R). The MInT trial showed that 6x R-CHOP21 was very effective, although this trial only included patients aged 18-60 years with no or one risk factor as per the age-adjusted IPI, a population that does not represent the DLBCL population at large [24]. Furthermore, the PETAL trial, published in 2018, showed that 6x R-CHOP + 2 R, compared to 6x R-CHOP, did not significantly improve event-free survival (EFS) or overall survival (OS) in patients with a negative interim positron emission tomography (PET) scan after two initial cycles of R-CHOP [33]. As of 2021, based on these findings, the Dutch treatment guidelines for medically fit DLBCL patients with stage II-IV disease recommend 6x R-CHOP21 for patients with a negative interim PET and 6x R-CHOP21 + 2 R for those with a positive interim PET [34].

Since there is no randomized comparison between 6x R-CHOP21 and 6x R-CHOP21 + 2 R without the information provided by interim PET scans, this nationwide, population-based study aimed to assess the comparative effectiveness of these two treatment options in patients diagnosed with advanced-stage DLBCL in the Netherlands in an era where interim PET-guided treatment decisions were not standard practice. This study allows us to explore the effectiveness of these treatment options in a real-world setting, thereby offering a unique perspective on managing advanced-stage DLBCL across different risk profiles as per the IPI.

Methods

Data source

This study utilized data from the nationwide Netherlands Cancer Registry (NCR), established in 1989 and managed by the Netherlands Comprehensive Cancer Organisation (IKNL). The NCR covers over 95% of all newly diagnosed malignancies in the Netherlands [35]. It compiles incident cases reported by all Dutch pathology laboratories through the Nationwide Network and Registry of Histopathology and Cytopathology and the National Registry of Hospital Discharges, the latter documenting inpatient and outpatient discharges. After case notification, specialized registrars of the NCR extracted basic data elements through retrospective medical records review within 9 to 12 months following a patient’s diagnosis, which included dates of birth and diagnosis, sex, disease stage, topography and morphology of tumors, primary therapy, and the diagnosing and treating hospital. Tumor topography and morphology were coded according to the International Classification of Diseases for Oncology (ICD-O) standards. Information on patients’ vital status (i.e., alive, dead, or emigration) was obtained via annual linkage with the Nationwide Population Registries Network, which holds this information on all residents in the Netherlands.

In an effort to enrich the NCR, incident cases of all hematological malignancies diagnosed from January 1, 2014, are recorded in the NCR with additional and more detailed information, such as the specific type of primary therapy a patient received and the best response to therapy; the latter is based on the physician’s assessment and medical judgment within their clinical practice. However, follow-up of treatment beyond 9 to 12 months after diagnosis among patients diagnosed as of 2014 is not routinely ascertained in the NCR. Therefore, for the current study, trained registrars of the NCR revisited the sites for additional follow-up activities through retrospective medical records review. As a result, we have a median follow-up of 4 years to estimate EFS after first-line treatment.

According to the Central Committee on Research involving Human Subjects, this type of observational, noninterventional study does not require approval from an ethics committee in the Netherlands. The Privacy Review Board of the NCR approved using anonymous data for this study.

Study population

This study included a cohort from the NCR comprising adult (≥18 years) patients diagnosed with DLBCL without primary central nervous system involvement between January 1, 2014, and December 31, 2018. We identified DLBCL patients using specific topography and morphology codes of the ICD-O, as previously described [5]. We excluded cases in which DLBCL had transformed from an indolent NHL and those with disease stage I or unknown disease stage. Eligibility was restricted to patients who received either 6x R-CHOP21 or 6x R-CHOP21 + 2 R. The selection time frame of our study cohort aligns with the availability of comprehensive details on prognostic factors, treatment regimens, and disease trajectories (e.g., progression) in the NCR. We determined the median follow-up time by accounting for censoring using reverse Kaplan-Meier survival curves.
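The reverse Kaplan-Meier idea for median follow-up can be illustrated with a minimal Python sketch. This is not the study's R code; the function name and toy data are hypothetical. The roles of death and censoring are swapped, so the resulting curve describes the time until follow-up ends rather than the time until death.

```python
def reverse_km_median_followup(times, died):
    """Median follow-up by the reverse Kaplan-Meier method.

    times: follow-up time per patient (any consistent unit)
    died:  True if the patient died, False if the patient was censored
    Censoring is treated as the "event" and death as censoring.
    Returns the first time at which the reversed curve drops to 0.5 or
    below, or None if that level is never reached.
    """
    surv = 1.0
    for t in sorted(set(times)):
        # Everyone whose follow-up lasts at least until t is still at risk.
        at_risk = sum(1 for x in times if x >= t)
        # Reversed events: patients censored (not dead) exactly at t.
        censored_here = sum(1 for x, d in zip(times, died) if x == t and not d)
        if censored_here:
            surv *= 1.0 - censored_here / at_risk
        if surv <= 0.5:
            return t
    return None


# Toy data (hypothetical): five patients, one death at year 2.
median_fu = reverse_km_median_followup([1, 2, 3, 4, 5],
                                       [False, True, False, False, False])
print(median_fu)
```

In this toy example the patient who died at year 2 does not pull the median follow-up down, which is the motivation for using the reversed curve instead of the plain median of observed times.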

Outcomes

We evaluated the effectiveness of the two treatment regimens through the endpoints EFS and OS. EFS was measured from the end of treatment (EOT) until the occurrence of progression, relapse, initiation of second-line treatment, the end of follow-up, or death, whichever occurred first. OS was measured from EOT to all-cause death or the end of follow-up. To align with the study objective of comparing the effectiveness of the two treatment regimens for eligible patients, EOT was defined as a landmark for both treatment groups at 42 days after completion of 6x R-CHOP21 (Supplemental Fig. 1) [36, 37]. To achieve this, patients in both treatment groups experiencing events within 42 days after completing 6x R-CHOP21 were excluded. We performed a sensitivity analysis with the landmark set to 90 days after completion of 6x R-CHOP21.
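A minimal Python sketch of the landmark step, under stated assumptions, may help to make the time origin concrete. The dictionary keys, the function name, and the rule that follow-up ending on or before the landmark day leads to exclusion are illustrative assumptions, not the registry's actual data model.

```python
def apply_landmark(patients, landmark_days=42):
    """Keep patients who are still event-free and under follow-up at the landmark.

    patients: list of dicts with hypothetical keys
      'days'  - days from completion of cycle 6 to event or censoring
      'event' - True if an event occurred, False if censored
    Returns new dicts whose 'days' are counted from the landmark.
    """
    kept = []
    for p in patients:
        # Assumption: follow-up ending on or before the landmark means exclusion.
        if p["days"] <= landmark_days:
            continue
        kept.append({"days": p["days"] - landmark_days, "event": p["event"]})
    return kept


cohort = [
    {"days": 30, "event": True},    # early event: excluded
    {"days": 100, "event": True},   # event after the landmark: kept
    {"days": 500, "event": False},  # censored after the landmark: kept
]
landmarked = apply_landmark(cohort)               # 42-day landmark
sensitivity = apply_landmark(cohort, landmark_days=90)
print([p["days"] for p in landmarked])
```

Because both treatment groups share the same landmark, survival time is counted from the same origin regardless of whether the two extra rituximab cycles were given.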

Propensity score model

Given the observational nature of this study and the inherent non-randomized treatment assignment, each patient has a different probability of receiving treatment (i.e., propensity score). Therefore, we weighted patients using inverse propensity scores to mitigate confounding by balancing patient characteristics across treatment groups. Conditional on the propensity score, treatment assignment was assumed to be random; that is, independent of patient characteristics [38]. We used stabilized propensity score weights to adjust for extreme weights due to propensity scores that were close to zero [39]. We used multivariable logistic regression to model the probability of receiving two additional cycles of rituximab after 6x R-CHOP21 according to the following characteristics at diagnosis: sex, the individual parameters of the IPI (i.e., age, Ann Arbor stage, serum lactate dehydrogenase (LDH), Eastern Cooperative Oncology Group performance status, and extranodal involvement), prior malignancy diagnosis, region of treatment, treatment at an academic center, and socioeconomic status (SES). SES was estimated by ranking neighborhoods using the aggregate-level value of houses and household income and was categorized into low (deciles 1-3), medium (deciles 4-6), or high (deciles 8-10). We quantified the association between these characteristics and the probability of receiving 6x R-CHOP21 + 2 R rather than 6x R-CHOP21 using odds ratios (ORs) and their corresponding 95% confidence intervals (CIs). Missing values of patient characteristics were handled by multiple imputation with 50 imputations [40].
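As an illustration of the weighting step, the sketch below computes stabilized inverse propensity weights from propensity scores that are assumed to be already estimated (for example, by logistic regression). It uses one common definition of stabilization, in which the marginal probability of the received treatment is divided by the patient-specific probability; whether this matches the exact definition used in the study's code is an assumption. All names and numbers are hypothetical.

```python
def stabilized_ip_weights(treated, propensity):
    """Stabilized inverse propensity weights.

    treated:    1 if the patient received the additional cycles, else 0
    propensity: estimated probability of receiving the additional cycles
    Weight = P(received treatment) / P(received treatment | covariates).
    """
    p_treated = sum(treated) / len(treated)  # marginal treatment probability
    weights = []
    for t, ps in zip(treated, propensity):
        if t:
            weights.append(p_treated / ps)
        else:
            weights.append((1.0 - p_treated) / (1.0 - ps))
    return weights


# Toy data (hypothetical): two treated and two untreated patients.
toy_weights = stabilized_ip_weights([1, 1, 0, 0], [0.8, 0.5, 0.5, 0.2])
print(toy_weights)
```

Compared with unstabilized weights (1/ps or 1/(1 − ps)), the numerator keeps the weights centred around one, which limits the influence of patients with propensity scores near zero or one.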

We assessed the overlap in propensity weights distributions, i.e., whether there is sufficient representation of individuals across all levels of the treatment variable, to ensure that the treatment effects can be estimated [41]. We judged adjustment for imbalance of patient characteristics between the two treatment groups successful if the standardized mean difference of each characteristic included in the propensity score model was below 0.1.
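The balance criterion can be made concrete with a small Python sketch of a weighted standardized mean difference. The formula shown (absolute difference in weighted means divided by the square root of the average of the two weighted variances) is one widely used variant and is an assumption about the exact definition applied in the study; the function names and data are hypothetical.

```python
import math


def weighted_mean_var(values, weights):
    """Weighted mean and (population-style) weighted variance."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def standardized_mean_difference(x_a, w_a, x_b, w_b):
    """Absolute weighted SMD of one covariate between groups A and B."""
    mean_a, var_a = weighted_mean_var(x_a, w_a)
    mean_b, var_b = weighted_mean_var(x_b, w_b)
    pooled_sd = math.sqrt((var_a + var_b) / 2.0)
    if pooled_sd == 0.0:
        # No variation in either group: balanced only if the means coincide.
        return 0.0 if mean_a == mean_b else float("inf")
    return abs(mean_a - mean_b) / pooled_sd


# Toy indicator covariate (hypothetical), e.g. elevated LDH yes/no.
ldh_a, ldh_b = [1, 1, 1, 0], [1, 0, 0, 0]
smd_crude = standardized_mean_difference(ldh_a, [1, 1, 1, 1], ldh_b, [1, 1, 1, 1])
smd_weighted = standardized_mean_difference(ldh_a, [1, 1, 1, 1], ldh_b, [3, 1, 1, 1])
print(round(smd_crude, 3), round(smd_weighted, 3))
```

In the toy data, up-weighting the one group B patient with elevated LDH moves the weighted mean of group B towards that of group A, so the SMD shrinks; a covariate would count as balanced once this value falls below 0.1.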

Measures of comparative effectiveness

EFS and OS were visualized using Kaplan-Meier curves that display the survival probability at each time point. To ensure optimal 5-year estimates, we restricted the follow-up to five years, censoring any events occurring more than five years after the end of treatment. Since patient characteristics can differ systematically between the two treatments, the Kaplan-Meier curves were weighted using stabilized inverse propensity weights. To formally evaluate overall treatment effectiveness, we used the log-rank test of the difference in the weighted Kaplan-Meier curves stratified by treatment.
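A weighted Kaplan-Meier curve with administrative censoring at a fixed horizon can be sketched in a few lines of Python. This is an illustrative implementation under simple assumptions (weights are treated as fractional patient counts, ties are handled per distinct event time); it is not the study's R code, and all names and data are hypothetical.

```python
def censor_at(times, events, horizon):
    """Administratively censor follow-up at `horizon` (e.g. 5 years)."""
    new_times = [min(t, horizon) for t in times]
    new_events = [bool(e) and t <= horizon for t, e in zip(times, events)]
    return new_times, new_events


def weighted_km(times, events, weights):
    """Weighted Kaplan-Meier estimate.

    Returns a list of (event_time, survival_probability) steps, where each
    patient contributes `weight` instead of 1 to the numbers at risk and
    the numbers of events.
    """
    curve, surv = [], 1.0
    for t in sorted({x for x, e in zip(times, events) if e}):
        at_risk = sum(w for x, w in zip(times, weights) if x >= t)
        n_events = sum(w for x, e, w in zip(times, events, weights) if x == t and e)
        surv *= 1.0 - n_events / at_risk
        curve.append((t, surv))
    return curve


# Toy data (hypothetical): years since the landmark, event flags, weights.
t5, e5 = censor_at([1, 2, 3, 6], [True, False, True, True], horizon=5)
curve_unweighted = weighted_km(t5, e5, [1, 1, 1, 1])
curve_weighted = weighted_km(t5, e5, [2, 1, 1, 1])
print(curve_unweighted, curve_weighted)
```

The event recorded at year 6 is turned into a censored observation at year 5, so it no longer produces a step in the curve.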

To quantify the relative treatment effect, we used a univariable Cox proportional hazards model weighted with stabilized inverse propensity weights to calculate a hazard ratio (HR) of treatment with 6x R-CHOP21 + 2 R versus 6x R-CHOP21 for EFS and OS.
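To illustrate what a weighted univariable Cox model estimates, the following Python sketch maximizes the weighted partial likelihood for a single 0/1 treatment indicator with Newton-Raphson steps. It is a teaching sketch under stated assumptions (Breslow-type handling of ties, a binary covariate, no variance estimate), not the study's implementation; in practice a robust (sandwich) variance is commonly used for confidence intervals when weights are applied. Names and data are hypothetical.

```python
import math


def weighted_cox_hr(times, events, treated, weights, max_iter=50, tol=1e-10):
    """Hazard ratio for a 0/1 covariate from a weighted Cox partial likelihood."""
    beta = 0.0
    n = len(times)
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for i in range(n):
            if not events[i]:
                continue
            # Weighted sums over the risk set at the event time of patient i.
            s0 = s1 = 0.0
            for j in range(n):
                if times[j] >= times[i]:
                    r = weights[j] * math.exp(beta * treated[j])
                    s0 += r
                    s1 += r * treated[j]
            mean_x = s1 / s0
            score += weights[i] * (treated[i] - mean_x)
            # For a 0/1 covariate the risk-set variance is mean * (1 - mean).
            info += weights[i] * mean_x * (1.0 - mean_x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return math.exp(beta)


# Toy data (hypothetical): treated patients tend to have later events.
hr_toy = weighted_cox_hr(
    times=[1, 2, 4, 6, 3, 5, 7, 8],
    events=[1, 1, 1, 1, 1, 1, 1, 1],
    treated=[0, 0, 0, 0, 1, 1, 1, 1],
    weights=[1.0] * 8,
)
print(hr_toy)
```

A hazard ratio below one in the toy data corresponds to later events in the treated group, mirroring the direction of the HR for 6x R-CHOP21 + 2 R versus 6x R-CHOP21.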

Treatment effects should be measured on the absolute scale to support clinical decision-making [42]. The absolute risk difference (ARD), i.e., the difference between the weighted Kaplan-Meier estimates of the two treatments at a certain time point, was used to measure the heterogeneity of the absolute treatment effect. Because the ARD depends heavily on the time point chosen, especially when the hazards in the treatment groups are not proportional (i.e., the HR varies over time or the Kaplan-Meier curves even cross), we also used the difference in restricted mean survival time (ΔRMST) to quantify the absolute treatment effect [43]. The ΔRMST represents the difference in life expectancy between the two treatment regimens within the chosen time horizon and was calculated as the difference in the area under the treatment-stratified Kaplan-Meier curves weighted using stabilized inverse propensity scores. For example, if the ΔRMST for treatment A versus B was 0.5 years in a 5-year time horizon, patients treated with treatment A lived, on average, six months longer than patients treated with treatment B during the initial five years of follow-up.
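Both absolute measures can be read directly off two step-function survival curves, as the following Python sketch shows. The curves are given as lists of (time, survival probability) steps; the numbers are invented for illustration and the function names are hypothetical.

```python
def survival_at(curve, t):
    """Survival probability at time t for a step curve [(time, surv), ...]."""
    surv = 1.0
    for time, s in curve:
        if time <= t:
            surv = s
        else:
            break
    return surv


def rmst(curve, tau):
    """Restricted mean survival time: area under the step curve up to tau."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for time, s in curve:
        if time >= tau:
            break
        area += prev_s * (time - prev_t)
        prev_t, prev_s = time, s
    return area + prev_s * (tau - prev_t)


# Invented weighted Kaplan-Meier steps (time in years) for treatments A and B.
curve_a = [(1.0, 0.9), (3.0, 0.8)]
curve_b = [(1.0, 0.8), (2.0, 0.6)]
tau = 5.0

ard = survival_at(curve_a, tau) - survival_at(curve_b, tau)
delta_rmst = rmst(curve_a, tau) - rmst(curve_b, tau)
print(ard, delta_rmst)
```

In this invented example the 5-year ARD is 0.2 (20 percentage points) and the ΔRMST is 0.8 years; the ARD uses only the curve heights at year 5, whereas the ΔRMST accumulates the gap between the curves over the whole horizon.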

Risk-stratified analysis

The overall treatment effect on the absolute scale may not apply to all patients [42, 44]. Therefore, we assessed the effects of the treatment regimens on EFS and OS across IPI risk groups [45]. We assessed relative treatment effect heterogeneity by testing the interaction between treatment assignment and IPI in a Cox proportional hazards model.

All statistical analyses were performed using R statistical software version 4.3.1, and the code was made available at https://github.com/CHMMaas/PaperDLBCL. Statistical significance was defined as p-values below 0.05.

Results

Patient characteristics

In total, 7058 adult patients with untreated DLBCL were diagnosed in the Netherlands between January 1, 2014, and December 31, 2018 (Fig. 1). Patients who were excluded because they received no therapy or a treatment regimen other than 6x R-CHOP21, 6x R-CHOP21 + 2 R, or 8x R-CHOP21 generally had worse physical conditions and achieved a much lower complete remission rate than included patients (Supplemental Table 1). Patients excluded for receiving 8x R-CHOP21 had similar characteristics to included patients (Supplemental Table 1). After applying our inclusion criteria, we included 1577 (22%) patients, of whom 672 (43%) were treated with 6x R-CHOP21 and 905 (57%) were treated with 6x R-CHOP21 + 2 R (Table 1). The median EFS time was 4.44 (IQR: 3.84–5.32) years, and the median OS time was 4.44 (IQR: 3.84–5.29) years for all patients.

Fig. 1: Flow chart of patient selection.

*Among the 3162 patients who were excluded because they received neither 6x R-CHOP21 nor 6x R-CHOP21 + 2 R, 657 patients did not receive therapy, 1159 patients received 8x R-CHOP21, and 1346 received other types of therapy (Supplemental Table 1 provides more information on the characteristics of these patients). DLBCL, diffuse large B-cell lymphoma; NCR, Netherlands Cancer Registry; EBV, Epstein-Barr virus; 6x R-CHOP21, 6 cycles of rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone administered every 21 days; 6x R-CHOP21 + 2 R, 6 cycles of rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone administered every 21 days and two subsequent cycles of rituximab.

Table 1 Demographic and clinical characteristics of the included patients at baseline.

Our analysis cohort comprised mainly males (56%), with a median age of 71 years (interquartile range [IQR]: 63–77 years; Table 1). The majority of patients (87%) achieved complete remission at EOT, with no significant difference between treatment groups (p-value = 0.66; Table 1). Compared with patients treated with 6x R-CHOP21, patients treated with 6x R-CHOP21 + 2 R were older (85% versus 74% older than 60 years), had a more advanced disease stage, more often had elevated serum LDH (56% versus 51%), and more often had at least one extranodal site (33% versus 26%); regional differences in treatment allocation were also noted (Table 1). Sex, performance status, prevalence of at least one prior malignancy, treatment at an academic center, and SES were comparable between the two treatment regimens (Table 1).

Propensity score

Treatment assignment was strongly associated with age, disease stage, and treatment region (Table S1). There was sufficient overlap in the propensity weights distribution, demonstrating that all individuals are well-represented in both treatment groups (Fig. S1). A post-weighting balance assessment revealed that the standardized mean difference for all included covariates was reduced to below 0.1, demonstrating successful adjustment for baseline covariate imbalances between the two treatment groups (Fig. S2).

Event-free survival

EFS did not significantly differ between patients treated with 6x R-CHOP21 + 2 R and those receiving 6x R-CHOP21 (p-value of weighted log-rank test = 0.23; HR = 0.89, 95% CI, 0.72–1.09; Fig. 2A). The 5-year ARD was 4.2% (95% CI, −3.6%–11.9%) and recipients of 6x R-CHOP21 + 2 R were expected to have events 0.14 years later than those receiving 6x R-CHOP21 (95% CI, −0.04–0.33) over a 5-year period (Fig. 2A).

Fig. 2: Survival curves and risk tables stratified by treatment (6x R-CHOP21 versus 6x R-CHOP21 + 2R) for patients with advanced-stage diffuse large B-cell lymphoma in the Netherlands.

Panel A shows event-free survival and panel B illustrates overall survival. The Kaplan-Meier curves and log-rank test were weighted using stabilized inverse propensity scores, but the risk tables below display the crude number of patients at risk at each time point. Stabilized inverse propensity score weights were averaged over the 50 imputations. EFS, event-free survival; OS, overall survival; HR, hazard ratio; ARD, absolute risk difference; ΔRMST, difference in restricted mean survival time; 95% CI, 95% confidence interval; -2R, 6x R-CHOP21; +2R, 6x R-CHOP21 + 2 R.

EFS at 5 years was lower for patients with higher IPI, regardless of the treatment regimen used (Fig. 3A). The relative treatment effect varied across different IPI risk categories, but not significantly (p = 0.26; Fig. 3B). The addition of two extra cycles of rituximab was associated with larger absolute benefits in EFS when IPI was higher (Fig. 3C, D). The largest effect from additional rituximab cycles was observed in patients with an IPI of 4-5, where the 5-year ARD was 16.8% (95% CI, −0.4%–34.1%; Fig. 3C) and the ΔRMST was 0.47 years (95% CI, 0.05–0.90; Fig. 3D) over a 5-year period.

Fig. 3: Stratified analysis of event-free survival according to the risk group based on the International Prognostic Index.

The graphs show (A) event rates using Kaplan-Meier curves weighted with stabilized inverse propensity scores, with risk tables below displaying the crude number of patients at risk at each time point, (B) hazard ratios with a p-value resulting from testing the interaction between IPI risk and treatment regimen, (C) absolute risk differences (ARD), and (D) differences in restricted mean survival time (ΔRMST) for 6x R-CHOP21 and 6x R-CHOP21 + 2 R for EFS. Overall results are depicted by the horizontal dotted line; 6x R-CHOP21 + 2 R showed effectiveness in preventing events. IPI scores and stabilized inverse propensity score weights were averaged over the 50 imputations. N, sample size; EFS, event-free survival; ARD, absolute risk difference; ΔRMST, difference in restricted mean survival time; IPI, International Prognostic Index.

Overall survival

OS did not significantly differ between patients treated with 6x R-CHOP21 + 2 R and those receiving 6x R-CHOP21 (p = 0.53; HR = 0.93, 95% CI, 0.73–1.18; Fig. 2B). The 5-year ARD was 1.3% (95% CI, −6.3%–9.0%), and patients who received two additional cycles of rituximab potentially lived 0.11 years longer (95% CI, −0.05–0.27) over a 5-year period compared to those who did not.

OS at 5 years was lower among patients with higher IPI, irrespective of the treatment regimen received (Fig. 4A). The relative treatment effect varied across different IPI risk categories, but not significantly (p-value = 0.21; Fig. 4B). The absolute benefit of adding two cycles of rituximab increased with higher IPI (Fig. 4C, D). For patients with an IPI of 4-5, the addition of two cycles of rituximab was associated with the largest increase in survival, although the confidence intervals included zero (5-year ARD: 12.1%, 95% CI, −5.4%–29.6% and 5-year ΔRMST: 0.27 years, 95% CI, −0.12–0.67; Fig. 4C, D). The above results were similar when performing the analysis with a 90-day landmark (results not shown here).

Fig. 4: Stratified analysis of overall survival according to the risk group based on the International Prognostic Index.

The graphs show (A) event rates using Kaplan-Meier curves weighted with stabilized inverse propensity scores, with risk tables below displaying the crude number of patients at risk at each time point, (B) hazard ratios with a p-value resulting from testing the interaction between IPI risk and treatment regimen, (C) absolute risk differences (ARD), and (D) differences in restricted mean survival time (ΔRMST) for 6x R-CHOP21 and 6x R-CHOP21 + 2 R for OS. Overall results are depicted by the horizontal dotted line; 6x R-CHOP21 + 2 R showed effectiveness in preventing all-cause deaths. IPI scores and stabilized inverse propensity score weights were averaged over the 50 imputations. N, sample size; OS, overall survival; ARD, absolute risk difference; ΔRMST, difference in restricted mean survival time; IPI, International Prognostic Index.

Discussion

This propensity-weighted analysis using nationwide, population-based cancer registry data from the Netherlands did not find significant differences in EFS and OS between patients with advanced-stage DLBCL treated with 6x R-CHOP21 or 6x R-CHOP21 + 2 R. However, our findings suggest improved survival outcomes in high-risk patients (i.e., IPI scores of 4-5) treated with 6x R-CHOP21 + 2 R. Collectively, our population-based study, conducted in the absence of interim PET scan treatment guidance, provides a unique insight into the real-world effectiveness of these treatment options across different risk strata, thus enhancing our understanding of DLBCL management in the pre-interim PET era and of how these findings can be used in an era where treatment decisions can be guided using interim PET scans.

The PETAL trial demonstrated that treatment intensification did not significantly improve EFS and OS for patients with a negative interim-PET scan [33]. Although the PETAL trial did not address the administration of 6x R-CHOP + 2 R to interim PET-positive patients, it highlights the ongoing debate on treatment intensification strategies in this subgroup [33]. Based on a broader consensus among Dutch hematologists due to the lack of definite data, the 2021 Dutch treatment guidelines for DLBCL recommend 6x R-CHOP21 for interim PET-negative patients and 6x R-CHOP21 + 2 R for interim PET-positive patients. Notably, our analysis, which examines the period before the 2021 Dutch guidelines were implemented, revealed considerable regional variations in treatment practices in the Netherlands when adding two additional cycles of rituximab after 6x R-CHOP21. Future studies could investigate whether integrating interim PET scan results leads to more consistent treatment practices across different regions in the Netherlands.

Our findings hint at a potential benefit for patients with high-risk disease (i.e., IPI scores of 4-5), likely due to disease aggressiveness. This observation potentially aligns with recent insights from Wang et al., who explored the biological mechanisms of DLBCL aggressiveness across different IPI risk groups [6]. Their comprehensive analysis demonstrated distinct molecular and microenvironmental profiles, particularly in high-risk categories, which may explain the variability in treatment responses and survival outcomes. Specifically, they identified that MCD- and ST2-like subtypes and alterations in the lymphoma microenvironment were associated with higher IPI and poorer clinical outcomes, thereby supporting the rationale for intensified treatment in these patients. However, applying interim PET scans could potentially modify these outcomes by guiding the intensified treatment. In contrast, a randomized phase III trial of the HOVON and the Nordic Lymphoma Group (HOVON-84) showed that early rituximab intensification during R-CHOP does not improve outcomes in patients with untreated DLBCL, irrespective of IPI score [46]. Therefore, further research is needed to validate our findings and to explore whether incorporating molecular markers, as identified by Wang et al., into the treatment decision-making process could enhance survival outcomes in high-risk patients (i.e., IPI scores of 4-5). Unfortunately, due to the lack of standardized registration of histological and molecular subtypes during our study period, we could not assess the impact of these or other subtypes (e.g., cell of origin and double- or triple-hit lymphoma). Lastly, there seems to be limited benefit in survival outcomes in patients with low IPI scores (i.e., scores of 0-3) treated with 6x R-CHOP21 + 2 R. Therefore, when deciding to treat low-risk patients with two additional cycles of rituximab, it is crucial to carefully evaluate the potential cost increase and treatment-related side effects. Moreover, considering that treatment intensification may yield only minimal improvements in quality of life, a balanced assessment of benefits and burdens is needed when choosing more extended treatment regimens.

A strength of this study is the use of a comprehensive, long-running nationwide cancer registry, which provided extensive data on treatment and patient characteristics. Additionally, in the absence of a randomized comparison that is not guided by interim PET scans, this is the first study to systematically compare 6x R-CHOP21 and 6x R-CHOP21 + 2 R, employing various methodologies to assess treatment effectiveness using observational data. The clinical implications of our findings are substantial, offering valuable insights for both patients and clinicians, with potential to influence policy.

Our study also has limitations that warrant caution in interpretation. First, the inclusion of patients was affected by setting the landmark, resulting in the investigation of a patient population with a relatively better prognosis. This approach aligned with our study’s goal of assessing effectiveness between the two regimens, because if a patient’s health is insufficient to withstand two additional cycles of rituximab, there is no need for deliberation between the treatment regimens of 6x R-CHOP21 and 6x R-CHOP21 + 2 R. Additionally, we defined the landmark to be exactly 42 days after completing 6x R-CHOP21, but there was a possible discrepancy between the actual duration of chemoimmunotherapy cycles and the standard 21-day cycle. Second, despite our efforts to mitigate the influence of unmeasured confounding variables (e.g., lack of information on double- or triple-hit lymphomas across most of the registry) and to account for the impact of short-term events through landmark analysis and propensity score weighting, these could not be completely eliminated [37]. For example, a slight divergence in Kaplan-Meier survival curves for high-IPI (i.e., scores of 4-5) patients shortly after treatment completion suggests that outcomes may be influenced by factors not recorded in the NCR, such as comorbidities, toxicities, early versus late responders, and other local treatment practices that extend beyond regional treatment practices. We attempted to account for comorbidities and regional treatment practices by including previous malignancies, SES, and an indicator for treatment region in the propensity score model; nevertheless, we could not capture them fully. Furthermore, we needed to impute the performance status for a substantial number of patients. Collectively, these limitations highlight the need for cautious interpretation of our findings and validation through further research.

In conclusion, our study provides valuable insights into the treatment outcomes of patients with advanced-stage DLBCL in the era before the routine use of interim PET scans. While no significant differences were observed in EFS and OS between patients treated with 6x R-CHOP21 and those treated with 6x R-CHOP21 + 2 R, there was an indication that patients with high IPI might benefit more from 6x R-CHOP21 + 2 R. These findings underscore the potential for augmented treatment approaches in DLBCL, particularly for those with a higher prognostic risk. However, given the limitations related to unmeasured confounders, future population-based research should focus on validating our study findings in the context of interim PET-guided treatment, which could potentially enhance the precision of therapeutic strategies and improve outcomes for patients with DLBCL.