Background

Lung cancer is the leading cause of cancer-related mortality worldwide, accounting for approximately 18% of all cancer-related deaths [1]. About 80% of lung cancers are non-small cell lung cancer (NSCLC) [2]. Radiation therapy combined with chemotherapy and immunotherapy is one of the most effective treatments for locally advanced NSCLC [3, 4]. In radiotherapy, the doses to organs at risk (OARs) must be reduced to limit radiation-induced toxicities while delivering a conformal dose to the tumor. Locally advanced NSCLC is particularly challenging to treat because of the proximity of the tumor to various OARs, including the lungs, esophagus, heart, and spinal cord.

Proton therapy offers a distinct advantage with its superior dose coverage to the tumor, concurrently minimizing radiation exposure to normal tissues both proximal and distal to the tumor. This advantage stems from the pronounced dose falloff attributed to the Bragg peak phenomenon. Proton therapy can potentially improve the therapeutic ratio in patients with locally advanced NSCLC compared to conventional photon-based radiotherapy [5, 6]. Moreover, proton therapy with pencil beam scanning technology, which allows the use of a narrow proton beam (beamlet) and modulates the weight of individual beamlets, outperforms conventional passive-scattering proton therapy in terms of the conformality of the dose distribution and clinical outcomes [7,8,9].

However, the beamlet's sharp dose falloff and high flexibility render proton therapy more sensitive to uncertainties in beam range and patient setup and to intra/inter-fractional anatomical changes. These uncertainties severely degrade the quality of the dose distribution during proton therapy [10, 11]. In lung cancer treatment, proton therapy therefore remains controversial because of the interference between dynamic beamlet delivery and breathing motion (the interplay effect), which displaces individual beamlets from their planned positions and leads to underdosage of the target and overdosage of normal tissues [12,13,14].

Consequently, several clinical approaches have been proposed to mitigate the effects of breathing motion, including breath-holding, gating, tumor tracking, re-scanning, four-dimensional (4D) treatment planning [15], and robust optimization [16]. To quantitatively evaluate the delivered dose in the presence of breathing motion, the 4D accumulated dose calculated using 4D-CT images and deformable image registration (DIR) is frequently used [15]. Although well accepted in clinical practice, this method is inadequate for estimating the impact of the interplay effect, which additionally requires modeling the relationship between the time dependence of the delivery system and the breathing motion of the patient. The dose calculated in this way is the 4D dynamic accumulated dose (4DDD). Because a 4DDD tool is not available in most commercial treatment planning systems (TPS), several researchers have investigated the influence of the interplay effect with 4DDD calculated using in-house software or RayStation (RaySearch Laboratories, Stockholm, Sweden) Python scripting [17, 18].

Recently, VQA TPS (Hitachi, Ltd., Tokyo, Japan) introduced a new tool to calculate the 4DDD with a user-specified breathing motion set for proton therapy. This 4DDD tool conveniently enables more detailed and realistic evaluations of the interplay effect than existing tools. Furthermore, few studies have investigated the interplay effect in proton therapy combining gating and 2D layered re-scanning techniques [19,20,21,22]. Therefore, to provide clinical insights for treatment decision-making in patients with locally advanced NSCLC undergoing proton therapy, this study used the 4DDD tool to compare the dosimetric characteristics and robustness against the interplay effect of two proton therapy planning techniques, single-field uniform dose (SFUD) and robustly optimized intensity-modulated proton therapy (IMPT), both combined with breathing motion mitigation techniques.

Methods

Patient data

A 4D-lung dataset from the publicly available "The Cancer Imaging Archive" [23, 24] was used in this retrospective study. This dataset consisted of the 4D-CT images of 20 patients with locally advanced NSCLC treated with concurrent chemoradiotherapy. 4D-CT was performed using a 16-slice CT scanner (Brilliance Big Bore, Philips Medical Systems, Andover, MA, USA) with 10 breathing phases (CT0–90%, where CT0% represents the end-of-inspiration phase and CT50% represents the end-of-expiration phase) and a slice thickness of 3 mm. A physician delineated the Gross Tumor Volume (GTV) on all 4D-CT images before registration, as documented in The Cancer Imaging Archive. After importing all datasets into MIM Maestro version 7.2.10 (MIM Software Inc., Cleveland, OH, USA), OARs such as the lungs, heart, esophagus, and spinal cord were delineated on CT50%. The breathing amplitude of the tumor was defined as the 3D maximum distance between the GTV centroids for ten 4D-CT phases. Five enrolled patients were excluded because of a lack of tumor motion (< 3 mm). Subsequent treatment planning was implemented for the remaining 15 patients. Table 1 summarizes the tumor characteristics and breathing amplitudes. The GTV volume on CT50% exhibited a range of 6.2 to 162.3 cc, with a median value of 34.7 cc. The tumor amplitude was between 3.0 and 12.0 mm, with a median amplitude of 5.8 mm.
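As a minimal illustration of the amplitude definition above, the following Python sketch computes the largest pairwise 3D distance between phase-specific GTV centroids and applies the 3-mm exclusion threshold. The function names and the centroid coordinates are hypothetical and are not taken from the dataset.

```python
import math
from itertools import combinations


def breathing_amplitude(centroids_mm):
    """Largest 3D distance (mm) between any two phase-specific GTV centroids."""
    return max(math.dist(a, b) for a, b in combinations(centroids_mm, 2))


def is_excluded(centroids_mm, threshold_mm=3.0):
    """True when the tumor motion is below the 3-mm threshold used in this study."""
    return breathing_amplitude(centroids_mm) < threshold_mm


# Hypothetical centroids (x, y, z) in mm; a real case would list all ten phases.
example_centroids = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (1.0, 1.0, 0.0)]
example_amplitude = breathing_amplitude(example_centroids)
```

Because every pair of phases is compared, the result does not depend on which phase is taken as the reference.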

Table 1 Patient characteristics: tumor location, target volumes, tumor amplitude (maximum distance between GTVs at two phases), distance between GTV and OARs

Proton therapy planning

Treatment planning was performed using the VQA TPS commissioned for the proton beam delivery system, PROBEAT-M1 (Hitachi Ltd., Tokyo, Japan), which provided spot scanning with a synchrotron. The beam energy was within the 70.2–228.7 MeV range, corresponding to a penetration value of 4–32 g/cm² and a spot size of 7.0–2.5 mm in air at the isocenter. The maximum field size was 30 × 40 cm² at the isocenter.

For each patient, three-field coplanar spot-scanning proton therapy plans with respiratory gating and re-scanning techniques were created at CT50%. Three gantry angles with 45° spacing between each beam were selected based on the patient's anatomy. A short path length to the target was preferred while avoiding the contralateral lung and spinal cord [25].

The gating technique involved synchronized radiation delivery with the patient's breathing signals only within a pre-defined gating window [26, 27]. In this study, the gating windows were defined as the number of phases around the end-of-expiration phase: CT40–60% phases (3-phase) and CT30–70% phases (5-phase). Accordingly, two internal GTVs (iGTV), the union of CT40–60% phase GTVs or CT30–70% phase GTVs, were created as target structures, and the volumes are listed in Table 1. The relative ratios of iGTV volume to GTV ranged from 1.03–1.41 (median: 1.15) and 1.12–1.52 (median: 1.30) for CT40–60% and CT30–70%, respectively. The lymph nodes were not included as targets in this study. The re-scanning technique is the most straightforward method for mitigating the interplay effects in proton therapy. It involves dividing the dose at every spot by the number of re-scans [19, 28]. In this study, layered re-scanning was applied, in which each isoenergy layer was individually re-scanned, and the number of re-scans was uniformly set to four.
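The target and re-scanning definitions above can be sketched in a few lines of Python. This is an illustrative sketch only: GTVs are represented as sets of voxel indices, the voxel coordinates are hypothetical, and the function names are not part of any TPS interface.

```python
def gating_phases(n_phases, center=50, step=10):
    """Phases (in percent) of a gating window centered on end-of-expiration."""
    half = n_phases // 2
    return [center + step * k for k in range(-half, half + 1)]


def build_igtv(gtv_by_phase, phases):
    """iGTV as the union of the phase-specific GTV voxel sets."""
    return set().union(*(gtv_by_phase[p] for p in phases))


def split_for_rescanning(spot_mu, n_rescans=4):
    """Divide the MU of one spot equally among the re-scans of its layer."""
    return [spot_mu / n_rescans] * n_rescans


# Hypothetical voxel sets for the 3-phase window (CT40-60%).
gtv_by_phase = {
    40: {(1, 1, 1), (1, 1, 2)},
    50: {(1, 1, 2), (1, 1, 3)},
    60: {(1, 1, 3)},
}
igtv_3phase = build_igtv(gtv_by_phase, gating_phases(3))
volume_ratio = len(igtv_3phase) / len(gtv_by_phase[50])  # iGTV relative to the CT50% GTV
```

With equal voxel volumes, the ratio of voxel counts corresponds to the relative iGTV-to-GTV volume ratio reported above.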

The treatment plans were optimized using the single-field uniform dose (SFUD) and IMPT techniques. In SFUD, each field delivers a uniform dose distribution to the target volume. A beam-specific planning target volume was created using a 5-mm lateral iGTV expansion perpendicular to the beam axis, and the distal and proximal margins in the beam direction were determined considering a 3.5% proton range uncertainty [29]. Doses in the IMPT plans were optimized for each field. Robust optimization according to the worst-case minimax approach, with a 5-mm setup uncertainty in the left–right, superior-inferior, and anterior–posterior directions (in a total of six scenarios) and a 3.5% range uncertainty, was used for the IMPT [25].
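To make the uncertainty settings concrete, the sketch below enumerates the six setup-shift scenarios and the two range-scaling factors, and shows the worst-case (minimax) idea of scoring a plan by its poorest scenario. How the TPS combines setup and range scenarios internally is not described here, so the two sets are kept separate; the objective function passed in is a hypothetical placeholder.

```python
def setup_shift_scenarios(magnitude_mm=5.0):
    """Six shifts of +/- magnitude along the LR, SI, and AP axes (mm)."""
    scenarios = []
    for axis in range(3):
        for sign in (1.0, -1.0):
            shift = [0.0, 0.0, 0.0]
            shift[axis] = sign * magnitude_mm
            scenarios.append(tuple(shift))
    return scenarios


def range_scaling_factors(range_uncertainty=0.035):
    """Under- and over-range factors for a 3.5% range uncertainty."""
    return (1.0 - range_uncertainty, 1.0 + range_uncertainty)


def worst_case_objective(objective, scenarios):
    """Minimax scoring: a plan is judged by its worst (largest) scenario cost."""
    return max(objective(s) for s in scenarios)


shifts = setup_shift_scenarios()
# Hypothetical cost that grows with the left-right displacement only.
example_cost = worst_case_objective(lambda s: abs(s[0]), shifts)
```

An optimizer following this approach would adjust the spot weights to minimize the value returned by `worst_case_objective`.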

The prescribed dose was 70 GyRBE (Gy weighted by a relative biological effectiveness [RBE] of 1.1) in 35 fractions. The iGTV D95% in the nominal scenario was normalized to the prescribed dose in all the IMPT plans. Notably, no specific constraints were used for the OARs in this study. A pencil beam algorithm with a triple Gaussian kernel model [30] and a grid resolution of 2 × 2 × 2 mm³ was used for the dose calculations, and the resulting dose was termed the static dose (SD). The triple Gaussian kernel model, comprising a primary component (multiple Coulomb scattering) and a halo component (inelastic, nonelastic, and elastic nuclear reactions), enhances the precision of dose calculations in spot scanning techniques. A previous study demonstrated that the model reproduced Monte Carlo simulation results well in complex geometries, such as the lungs [30].

Simulation of the interplay effect

The 4DDD tool, which determines the positional relationship of the delivered spot on each breathing phase CT and calculates the dose distributions for every breathing phase based on the SD plan, was used to calculate the 4DDD, considering the interplay effect.

The time structure of the spot delivery was defined using machine- and patient-specific parameters. The machine-specific parameters were as follows: acceleration and deceleration times of the protons, approximately 1.0 s; preparation time for irradiation, 70 ms; maximum spill length, 4.4 s; maximum flattop length, 5.0 s; and maximum waiting time for the gating signal, 100 ms in the synchrotron operation. The moving time between the spots and spot delivery time depends on the irradiated position and monitor unit (MU), with maximum and minimum spot weights of 0.0388 and 0.0052 MU, respectively, in the scanning beam nozzle operation. Figure 1 shows a schematic of the proton machine operation. The breathing cycle was set to 4.0 s as a patient-specific parameter.

Fig. 1

Time-dependent beam spot delivery pattern of the proton therapy system. This figure shows the synchrotron operation pattern and the irradiation timing according to the patient's breathing pattern. Beam spots were irradiated when the gating signal was switched on. The rest and move are statically determined times occurring before and after irradiation and the moving time to the next spot delivery position, respectively

Furthermore, before starting this study, the impact of the starting phase on 4DDD was investigated by randomly varying the starting phase in every fraction (i.e., the fraction effect) in the three patients with the largest breathing amplitude. The dose differences between the results with varying starting phases and those with fixed starting phases were limited. These data are provided in the Supplementary Material (Table S1). Consequently, the starting phase of each field irradiation was fixed at CT0% of the 10 breathing phases in this study.
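The relationship between delivery time, breathing phase, and gating signal can be illustrated with a short Python sketch. It assumes a regular 4.0-s breathing cycle divided into ten equal phase bins, with each bin starting at its nominal phase (CT0% spanning 0–0.4 s, and so on); this binning convention and the function names are assumptions made for illustration and may differ from the implementation in the 4DDD tool.

```python
def phase_at(time_s, period_s=4.0, n_phases=10, start_phase_percent=0):
    """Breathing phase (0, 10, ..., 90 %) at a given time after beam start."""
    step = 100 // n_phases
    elapsed_bins = int((time_s % period_s) / period_s * n_phases)
    index = (elapsed_bins + start_phase_percent // step) % n_phases
    return index * step


def gate_is_open(time_s, window=(40, 50, 60), **kwargs):
    """True when the current phase lies inside the gating window."""
    return phase_at(time_s, **kwargs) in window
```

Under these assumptions, the 3-phase window is open for 1.2 s of every 4.0-s cycle (from 1.6 s to 2.8 s), and the 5-phase window for 2.0 s.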

The 4DDD tool in VQA enables the spot MU to be separated and assigned to the corresponding and subsequent phases according to the time structure to reduce the dose calculation errors caused by the discrete movement of the target between phases, as previously studied [31, 32]. The dose distribution was calculated after each spot was assigned to a breathing-phase CT scan. Subsequently, each phase-specific dose distribution was accumulated to CT50% from the other 4D-CTs using deformation vector fields created by the DIR in MIM to obtain the 4DDD [33, 34]. This study used an intensity-based freeform deformation algorithm with default parameters for DIR before 4D dose accumulation. Previous studies have validated and described the DIR performance of MIM [35, 36].
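One plausible way to divide the MU of a spot between the phase in which its delivery starts and the subsequent phase is to weight each phase by the fraction of the delivery time that falls within it. The Python sketch below follows that interpretation; it is an assumption made for illustration, not the documented algorithm of the VQA tool, and the spot timings used in the test are hypothetical.

```python
import math


def split_spot_mu(spot_mu, t_start_s, duration_s, period_s=4.0, n_phases=10):
    """Share the MU of one spot among phases in proportion to time overlap."""
    bin_len = period_s / n_phases
    index = math.floor(t_start_s / bin_len)
    if duration_s <= 0:
        # Instantaneous spot: everything goes to the current phase.
        return {(index % n_phases) * 10: spot_mu}
    shares = {}
    t = t_start_s
    t_end = t_start_s + duration_s
    while t < t_end:
        edge = (index + 1) * bin_len  # end of the current phase bin
        seg_end = min(max(edge, t), t_end)
        if seg_end > t:
            phase = (index % n_phases) * 10
            shares[phase] = shares.get(phase, 0.0) + spot_mu * (seg_end - t) / duration_s
        t = seg_end
        index += 1
    return shares
```

The dose of each phase-specific share would then be computed on the corresponding phase CT and warped to CT50% with the deformation vector fields before summation.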

Evaluation and statistical analysis

To quantify the plan quality, the dose volume histogram (DVH) indices, including the near-minimum dose D98% [GyRBE] (dose at 98% of the target volume), D95% [GyRBE] (dose at 95% of the target volume), near-maximum dose D2% [GyRBE] (dose at 2% of the target volume), target coverage V98% (volume receiving 98% of the prescription dose), and V100% (volume receiving 100% of the prescription dose), was calculated for each iGTV (3-phase and 5-phase) on the SD and 4DDD in both the SFUD and robust IMPT plans.
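For readers less familiar with DVH notation, the indices above can be sketched as follows. The sketch assumes equal-volume voxels and a nearest-rank definition without interpolation; a TPS may interpolate the cumulative DVH and give slightly different values. The dose lists used in the test are hypothetical.

```python
import math


def dose_at_volume(doses, volume_percent):
    """D_x%: minimum dose received by the hottest x% of the voxels."""
    ordered = sorted(doses, reverse=True)
    rank = max(1, math.ceil(len(ordered) * volume_percent / 100.0))
    return ordered[rank - 1]


def volume_at_dose(doses, dose_percent, prescription):
    """V_x%: percentage of voxels receiving at least x% of the prescription."""
    threshold = prescription * dose_percent / 100.0
    return 100.0 * sum(d >= threshold for d in doses) / len(doses)
```

For example, `dose_at_volume(doses, 98)` gives the near-minimum dose D98% and `volume_at_dose(doses, 100, 70.0)` gives V100% for a 70-GyRBE prescription.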

The target homogeneity index (HI) [37] and conformation number (CN) [38] were calculated using Eqs. (1) and (2).

$$\text{HI}=\frac{{\text{D}}_{2\%}-{\text{D}}_{98\%}}{{\text{D}}_{50\%}},$$
(1)

where D50% is the dose at 50% of the target volume, while

$$\text{CN}=\frac{{\text{TV}}_{\text{RI}}}{\text{TV}}\times \frac{{\text{TV}}_{\text{RI}}}{{\text{V}}_{\text{RI}}},$$
(2)

where TV is the target volume, \(\text{TV}_{\text{RI}}\) is the target volume covered by the 95% isodose, and \(\text{V}_{\text{RI}}\) is the volume covered by the 95% isodose.

Values close to zero and one represent ideal values for HI and CN, respectively.
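Equations (1) and (2) translate directly into code. The short Python sketch below uses hypothetical input values purely to show the arithmetic; the function names are illustrative.

```python
def homogeneity_index(d2, d98, d50):
    """Eq. (1): HI = (D2% - D98%) / D50%; 0 is ideal."""
    return (d2 - d98) / d50


def conformation_number(tv, tv_ri, v_ri):
    """Eq. (2): CN = (TV_RI / TV) * (TV_RI / V_RI); 1 is ideal."""
    return (tv_ri / tv) * (tv_ri / v_ri)


# Hypothetical values: doses in GyRBE, volumes in cc.
example_hi = homogeneity_index(d2=72.0, d98=68.0, d50=70.0)
example_cn = conformation_number(tv=100.0, tv_ri=95.0, v_ri=110.0)
```

The first factor of CN penalizes target volume missed by the reference isodose, and the second penalizes reference-isodose volume that spills outside the target.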

For the OARs, the following parameters relevant to toxicity were evaluated: Dmean, V5GyRBE, and V20GyRBE for the lungs; Dmean and Dmax for the heart and esophagus; and Dmax for the spinal cord. Only the patients' data with each OAR (heart, esophagus, and spinal cord) within 2 cm of the target volume were used for a more relevant dosimetric evaluation of OAR doses [39].

The differences in the DVH indices (\(\Delta \text{I}\)) between SD and 4DDD were calculated using Eq. (3) to evaluate the interplay effect [18].

$$\Delta \text{I}\left(\%\right)=\frac{4\text{DDD}\left[\text{DVH}\right]-\text{SD}\left[\text{DVH}\right]}{\text{SD}\left[\text{DVH}\right]}\times 100\%,$$
(3)

where \(4\text{DDD} [\text{DVH}]\) and \(\text{SD} [\text{DVH}]\) are the specific DVH indices for the region of interest in the 4DDD and SD distributions, respectively.
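As a worked illustration of Eq. (3), the following Python sketch computes ΔI for one DVH index. The numerical values are hypothetical and serve only to show the sign convention.

```python
def relative_difference_percent(index_4ddd, index_sd):
    """Eq. (3): relative change of a DVH index from SD to 4DDD, in percent."""
    return (index_4ddd - index_sd) / index_sd * 100.0


# Hypothetical example: V100% falls from 100% (SD) to 95% (4DDD).
example_delta = relative_difference_percent(95.0, 100.0)
```

A negative ΔI means that the index is lower in the 4DDD than in the SD, which for coverage indices indicates deterioration caused by the interplay effect.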

Paired Student's t-tests were used to calculate the two-tailed p values between the SD and 4DDD, SFUD, and IMPT plans using MATLAB 2020b (MathWorks Inc., Natick, MA, USA). Differences in the evaluation index for each plan were considered statistically significant at a p value of < 0.05.
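The analysis in this study was performed in MATLAB; the Python sketch below only illustrates the underlying calculation using the standard library. It computes the paired t statistic and obtains the two-tailed p value by numerically integrating the Student t density with Simpson's rule. The function names are illustrative, and the sample values in the test are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(sample_a, sample_b):
    """Paired t statistic and degrees of freedom for matched samples."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_value, n - 1


def _t_pdf(x, dof):
    """Probability density of the Student t distribution."""
    coeff = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coeff * (1 + x * x / dof) ** (-(dof + 1) / 2)


def two_tailed_p(t_value, dof, n_intervals=2000):
    """Two-tailed p value; n_intervals must be even for Simpson's rule."""
    upper = abs(t_value)
    h = upper / n_intervals
    total = _t_pdf(0.0, dof) + _t_pdf(upper, dof)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, dof)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_area)
```

With 15 patients, each comparison has 14 degrees of freedom, so a difference is significant at p < 0.05 when the absolute t statistic exceeds approximately 2.145.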

Results

Plan quality with or without interplay effect

Figure 2 shows representative dose distributions for the 3-phase SFUD and IMPT plans on CT40%, CT50%, and CT60% and for the 4DDD. In the 4DDD calculation algorithm, the MU of a spot delivered at a given time is divided between adjacent-phase CTs using weighting factors that account for the remaining breathing time of each phase. Because of this specification, the dose on the center-phase CT (CT50%) tended to be higher than that on the other phases. In contrast, the dose on CT60% (the end of the gating window) was lower because fewer spots may be delivered in this phase than in the others, owing to the combination of the synchrotron operation pattern and the breathing motion pattern. Figure 3 shows a comparison between SD and 4DDD for the 5-phase plans.

Fig. 2

Representative dose distributions. Each figure presents dose distribution on CT40%, CT50%, CT60%, and 4DDD for a patient's 3-phase SFUD and IMPT plans (P101). The red line shows the GTV on each phase CT

Fig. 3

Representative SD and 4DDD in a patient's 5-phase SFUD and IMPT plans (P101). The red line shows GTV on the CT50%

The upper panel of Table 2 lists the mean values and standard deviations of the DVH indices for the SD and 4DDD for the 3-phase SFUD and IMPT plans. The p values show the results of the statistical tests between the SD and 4DDD. In both techniques, 4DDD had significantly lower D95%, D98%, and V100% and higher HI than SD. In addition, compared with SD, 4DDD had a higher CN in SFUD and a lower V98% in IMPT. The lower panel in Table 2 lists the corresponding values for the 5-phase SFUD and IMPT plans. In both techniques, 4DDD had significantly lower D95%, D98%, V98%, and V100% and higher D2% and HI than SD. In addition, 4DDD had a higher CN than SD in SFUD. For the OARs in both SFUD and IMPT, the only statistically significant difference between SD and 4DDD was in the esophageal Dmean.

Table 2 DVH indices for SD and 4DDD in SFUD and IMPT

Evaluation of plan robustness against interplay effect

Table 3 lists the differences in the DVH indices (ΔI) for the 3- and 5-phase SFUD and IMPT plans. The p values show the results of the statistical tests for the SFUD and IMPT plans. In the 3-phase SFUD plans, the deterioration rates were significantly higher for V100%, HI, and CN, with smaller absolute deviations for V98% than for the IMPT plans. Significantly higher deterioration was observed for V100% and HI in the 5-phase SFUD than in the 5-phase IMPT. Notably, in both the 3- and 5-phase plans, there was no statistically significant difference in the deterioration rates of the OAR indices between SFUD and IMPT.

Table 3 Relative differences in the DVH indices between SD and 4DDD (ΔI) for SFUD and IMPT

Discussion

This study investigated the dosimetric characteristics and robustness of two types of proton therapy planning techniques—SFUD and robustly optimized IMPT—with gating and re-scanning techniques for the interplay effect using the 4DDD tool. First, the dose differences between SD and 4DDD were statistically compared for the SFUD and robustly optimized IMPT plans with 3-phase gating. The significance test between SD and 4DDD demonstrated that some evaluation indices for the target were significantly affected by the interplay effect in the SFUD and robustly optimized IMPT plans, even though mitigation techniques combining 3-phase gating and re-scanning were used. However, the mean differences in the indices were within the range of 0.1–0.2 GyRBE for D2%, D95%, and D98%, and 0.0% and 0.1% for V98% in the SFUD and IMPT plans, respectively. Furthermore, both the 4DDD plans met the criteria of V98% and V100% > 95%, which are considered acceptable in clinical practice. These results indicate that the interplay effect may not always be clinically relevant.

As previously reported, the interplay effect does not significantly affect the dose distribution accumulated over multiple fractions, because inhomogeneous doses in the target average out, although it degrades single-fraction or stereotactic body proton therapy dose distributions [12, 40,41,42,43,44]. In addition, the effectiveness of proton spot scanning combined with gating and re-scanning for moving lung tumors has been demonstrated in previous studies [19,20,21,22, 45]. Grassberger et al. reported that a gating technique with a 30% window is sufficient to reduce the interplay effect [19]. However, when the gating window was extended to 50% in the present study, neither technique could compensate for the interplay effect, and the target coverage of both 4DDD plans became unacceptable. The V98% value remained at almost 100%, whereas the V100% value decreased to 89.3% and 94.0% for the SFUD and IMPT plans, respectively.

This study also examined which optimization technique can better compensate for the interplay effect by evaluating the relative differences between SD and 4DDD (ΔI). IMPT plans employing robust optimization demonstrated enhanced capability in preserving target coverage with the prescription dose, particularly under a broader gating window. This superiority is evident in the significantly higher robustness of V100% compared with SFUD in both the 3- and 5-phase plans. The deterioration rate for HI was also lower in IMPT, and the HI values for 4DDD were much higher in SFUD. This matters because a previous study [46] highlighted the importance of achieving a homogeneous dose distribution within the target volume.

Similarly, the effectiveness of robust optimization for proton lung treatment has been reported in several studies [25, 47, 48]. SFUD can maintain sufficient robustness against interplay effects in some patients. Therefore, a dosimetric investigation is necessary to evaluate which optimization technique is more effective in patients with locally advanced NSCLC. More attention should be paid to determining the ideal optimization technique for clinical practice.

Regarding the OAR doses, there were no statistically significant differences between the SD and 4DDD, except for the esophagus in the 5-phase plans; this might be because the esophagus is highly mobile and often resides close to the target [49]. Thus, the interplay effect might have increased the dose. In addition, because the mean and maximum absolute dose differences were less than 0.2 GyRBE and 0.9 GyRBE, respectively, for both techniques, the OAR doses in the SFUD and IMPT plans might not be significantly affected by the interplay effect. Moreover, when comparing the OAR doses of both techniques, the IMPT plan could considerably reduce the doses because robust optimization tends to shrink the irradiation volume more than the conservative beam-specific margin concept, consistent with several previous studies [49,50,51]. It is worth noting that dose constraints for the OARs were not applied in the optimization process in this study.

This study had some limitations. First, the patient-wise optimal combination of the gating window and re-scanning number, which directly affect the treatment time, was not investigated, even though the magnitude of the interplay effect is patient-specific [52, 53]. For a patient with a small breathing amplitude, for example, widening the gating window could improve time efficiency while maintaining clinically acceptable results. Second, although more specific treatment parameters may be identified using amplitude-binning techniques [54] instead of the phase binning of 4D-CT used in the current study, the evaluation of plan robustness was performed only for the interplay effect and not for other uncertainties, such as setup error, phase shift, and anatomical changes. In treatments employing gating and re-scanning techniques, these uncertainties may influence the dose distribution more than the interplay effect. Therefore, a comprehensive evaluation is essential to determine the treatment technique that is robust against such dosimetric uncertainties combined with the interplay effect. Third, the results of 4DDD computations depend on the patient- and machine-derived parameters entered into the newly introduced 4DDD tool. This study set the patient's breathing period at a fixed value of 4.0 s; however, the validity of the simulation could change with the breathing motion pattern. Thus, the patient-specific breathing period obtained from 4D-CT acquisition should be used to simulate the 4DDD in clinical practice. Finally, this study utilized only a single 4D-CT image set to calculate the 4DDD, which precludes investigation of the effects of interfractional changes, interfractional and intercyclic breathing pattern variations, or irregular breathing. Moreover, the non-Monte Carlo dose calculation algorithm, the DIR used in the 4D dose accumulation, and the lack of beam-on/off latency modeling in the 4DDD tool may introduce some uncertainty into the results. Dose evaluations considering these effects are necessary for a more precise assessment of the plan's robustness against uncertainties.

Conclusions

This study investigated the interplay effect in two types of proton therapy planning techniques with gating and re-scanning for locally advanced NSCLC using the newly developed 4DDD tool introduced by the VQA TPS. The interplay effect had a limited impact on both techniques with a gating window of three phases; however, the target coverage was considerably compromised when the gating window was widened. Furthermore, this study demonstrates that robustly optimized IMPT can better compensate for the interplay effect with respect to target coverage and homogeneity than SFUD.