Background

Cancer is a leading cause of death worldwide, accounting for nearly 10 million deaths in 2020 alone [1]. The burden of cancer is expected to rise in the coming years, with an estimated 28.4 million new cancer cases predicted by 2040 [1]. Type 2 diabetes (T2D) is also a significant public health concern globally, with an estimated 463 million people living with the condition in 2019 [2]. There is a growing body of evidence that suggests an increased risk of several types of cancer in individuals with T2D compared to the general population [3].

Impaired glucose tolerance (IGT) is a pre-diabetic state in which blood glucose levels are higher than normal but not high enough to be classified as T2D [4]. To date, several studies have examined the association of prediabetes or IGT with the risk of cancer [5] and of cancer death [6,7,8,9,10], with mixed results. A meta-analysis reported that individuals with prediabetes or IGT had a significantly higher risk of developing cancer than those with normal glucose levels, with IGT associated with an overall 25% increase in cancer risk [11]. More recently, a study in a Chinese general population also found that individuals with IGT had a significantly increased risk of cancer [12].

However, no study has specifically investigated the association between the development of T2D and the risk of cancer in individuals with IGT, or examined whether the onset of T2D mediates the association between IGT and cancer [13, 14]. Our study therefore aims to fill this gap by using a novel landmark analysis with tapered matching within a large IGT cohort to investigate the association between the onset of T2D and the risk of common adult cancers in New Zealand (NZ).

Methods

Data setting

The Diabetes Care Support Service (DCSS) was created in 1991 to improve diabetes care in West, East and South Auckland, NZ via general practice audits [18]. The DCSS also collected data on those with IGT. For this study, we identified a cohort of patients aged 18 years and above with IGT by linking the de-identified DCSS database with national cancer and death registration, hospitalization, pharmaceutical claim, and socioeconomic status data.

IGT was diagnosed using the 2-h glucose of 7.8–11 mmol/L on an oral glucose tolerance test (OGTT) [15]. The final dataset included demographics, clinical measurements (smoking, blood pressure (BP), body mass index (BMI), HbA1c, and lipids), and treatment (e.g., antihypertensive, statin, antiplatelet, and/or anticoagulant treatment). We validated the data through internal quality control policies and audits [16,17,18]. To cross-validate the prescription data in DCSS, we used pharmaceutical claims data from 2006 onwards as National Health Index numbers were not universal before then. We included data for all patients from their first DCSS enrolment date until their last enrolment on 31/7/2018. The North Health Ethics Committee approved the DCSS for research purposes in 1992, and then as an ongoing audit in 1996 (92/006). The ethics review was waived on March 25, 2019. We used anonymized data for this analysis, and signed consent was provided by an authorized signatory for each general practice. This manuscript adheres to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.

Exposure

We identified patients with IGT and classified them based on exposure to T2D. Exposure was defined as newly diagnosed T2D recorded in any linked dataset. A landmark analysis was conducted to examine the effect of T2D onset on the risk of cancer. This approach selects a fixed time after cohort entry (the landmark) from which the survival analysis starts. Only patients with IGT who were alive at the landmark date were included, and exposure status was assigned according to whether T2D onset occurred before the landmark date. Exposure was evaluated only during the exposure window, between the index date and the landmark time point, and the outcome was assessed from the landmark time point onwards. Five landmark time points were determined a priori, at 1, 2, 3, 4, and 5 years after the cohort enrolment date. The landmark analysis is illustrated in Supplemental Fig. 1.
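The exposure assignment described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's Stata code: the function names, the argument names, and the boundary conventions (events falling exactly on the landmark date are treated as occurring before it) are assumptions made for the example.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def landmark_status(enrol: date, landmark_years: int,
                    t2d: Optional[date] = None,
                    cancer: Optional[date] = None,
                    death: Optional[date] = None) -> Optional[str]:
    """Classify one patient at a landmark; None means excluded from that analysis."""
    landmark = add_years(enrol, landmark_years)
    # Exclude patients who died or already had the outcome by the landmark
    # (assumed convention: an event on the landmark date counts as "before").
    if death is not None and death <= landmark:
        return None
    if cancer is not None and cancer <= landmark:
        return None
    # Exposure counts only if T2D onset falls inside the exposure window.
    if t2d is not None and enrol <= t2d <= landmark:
        return "exposed"
    # T2D diagnosed after the landmark does not change the assigned group.
    return "unexposed"
```

Under these assumptions, a patient enrolled on 1 January 2005 who develops T2D in June 2006 is unexposed in the 1-year landmark analysis and exposed in the 2-year analysis, which is why the same patient can contribute to different groups at different landmarks.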

Fig. 1 Adjusted hazard ratio for 5- and 10-year risk of cancer at 1-, 2-, 3-, 4-, and 5-year landmark (final tapered matched models)

Outcome

The study assessed incident primary cancers as the outcome of interest. Incident cancer was defined as the first coded case of cancer recorded in the linked datasets, occurring during the follow-up period since the landmark time point, to mitigate potential information bias. Participants with IGT were followed up until an outcome of interest occurred or until December 31, 2019, for those without any outcome of interest. Outcomes were identified using primary International Classification of Diseases, Ninth Revision (ICD-9) and ICD-10 codes.

Covariates

The potential confounding factors, including patient demographic characteristics (age, sex), lifestyle factors (smoking status), clinical measurements (BMI, BP, HbA1c, lipids, and estimated glomerular filtration rate (eGFR)), and treatments such as antihypertensive, anticoagulant, and lipid-lowering drugs at baseline, were considered as covariates in the analysis. The NZDep2013 Index of Deprivation, which provides an index of multiple deprivation (IMD) score for each meshblock in NZ based on the distribution of the first principal component scores, was used to define the socioeconomic status of participants [19]. The IMD score ranges from 1 to 10, with lower scores indicating less deprivation. To ensure sufficient statistical power, the IMD was categorized into five groups: IMD-1 (least deprived: NZDep2013 scores of 1–2), IMD-2, IMD-3, and IMD-4 (scores of 3–4, 5–6, and 7–8, respectively), and IMD-5 (most deprived: scores of 9–10). These categories were consistent with prior deprivation measures.
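The five-group categorization above amounts to pairing adjacent NZDep2013 scores. A minimal Python sketch follows; the function name `imd_group` is hypothetical and not part of the study's code.

```python
def imd_group(nzdep_score: int) -> str:
    """Map an NZDep2013 score (1-10) to the five IMD groups described in the text."""
    if not 1 <= nzdep_score <= 10:
        raise ValueError("NZDep2013 score must be between 1 and 10")
    # Scores 1-2 -> IMD-1, 3-4 -> IMD-2, 5-6 -> IMD-3, 7-8 -> IMD-4, 9-10 -> IMD-5.
    return f"IMD-{(nzdep_score + 1) // 2}"
```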

Statistical analysis

We employed tapered matching techniques to address confounding [20]. This approach evaluated the impact of T2D onset on the risks of cancers between focal (exposed: IGT with T2D onset during the exposure time window) and control (IGT without T2D onset during the exposure time window) groups using entropy balancing. This method involved gradually matching the control cohort to the focal cohorts using additional covariates and observing how the matched cohort changed with respect to hazard ratios (HRs) and unmatched covariates.

To minimize model dependence and the possibility of irresolvable imbalances between comparative groups, we used coarsened exact matching (CEM) to limit the comparison of patients in comparative groups to areas of common support before tapered matching and balancing [21, 22]. For each of the five years of landmark analysis, ten matching steps were performed, and patients with IGT in the comparative groups who were matched on the tenth step were retained (Supplemental Fig. 2–6).
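The idea behind restricting to areas of common support with CEM can be sketched as follows. This is a simplified illustration under stated assumptions: the covariates (age, sex, BMI), the cut-points, and the dictionary keys are hypothetical choices for the example and are not the coarsening used in the study.

```python
from collections import defaultdict


def coarsen(patient: dict) -> tuple:
    """Map a patient to a coarse stratum; the cut-points are illustrative only."""
    age_band = min(patient["age"] // 10, 8)  # 10-year age bands, 80+ pooled
    bmi = patient["bmi"]
    bmi_band = 0 if bmi < 25 else 1 if bmi < 30 else 2
    return (age_band, patient["sex"], bmi_band)


def common_support(patients: list) -> list:
    """Keep only patients whose stratum contains both exposed and unexposed members."""
    groups_in_stratum = defaultdict(set)
    for p in patients:
        groups_in_stratum[coarsen(p)].add(p["exposed"])
    return [p for p in patients if groups_in_stratum[coarsen(p)] == {True, False}]
```

Patients in strata that contain only exposed or only unexposed members are dropped, so later balancing steps never have to extrapolate to covariate regions where one group is absent.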

Fig. 2 Stratified adjusted hazard ratio for 5- and 10-year risk of cancer at 5-year landmark (final tapered matched models)

We restricted the analysis to participants with areas of common support and used entropy balancing to minimize differences in matching variable distribution between comparison groups. Entropy balancing involves the maximum entropy reweighting of the unexposed group by directly incorporating covariate balance into the weight function, in which the matched sample is reweighted in each matching step to achieve key target moments such as mean, variance, and skewness. All pre-processing was conducted without reference to outcomes.
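To make the reweighting idea concrete, the sketch below balances the mean of a single covariate. It is a deliberately reduced illustration, not the procedure used in the study: the analysis described above balances several covariates and higher moments, whereas this example handles one covariate and the first moment only. With uniform base weights, the maximum-entropy weights take the form w_i proportional to exp(lambda * x_i), and lambda is found here by bisection. The function name and arguments are hypothetical.

```python
import math


def entropy_balance_mean(x_control: list, target_mean: float) -> list:
    """Return weights proportional to exp(lam * x_i) whose weighted mean equals target_mean."""
    if not min(x_control) < target_mean < max(x_control):
        raise ValueError("target mean must lie strictly inside the control range")

    def weights(lam: float) -> list:
        shift = max(lam * x for x in x_control)  # subtract the max to avoid overflow
        raw = [math.exp(lam * x - shift) for x in x_control]
        total = sum(raw)
        return [r / total for r in raw]

    def weighted_mean(lam: float) -> float:
        return sum(w * x for w, x in zip(weights(lam), x_control))

    # The weighted mean increases monotonically with lam, so bracket the root.
    lo, hi = -1.0, 1.0
    while weighted_mean(lo) > target_mean:
        lo *= 2
    while weighted_mean(hi) < target_mean:
        hi *= 2
    for _ in range(200):  # fixed number of bisection steps
        mid = (lo + hi) / 2
        if weighted_mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return weights((lo + hi) / 2)
```

For example, if the unexposed group has BMI values of 24, 27, 30, 33, and 36 and the exposed group has a mean BMI of 32, the returned weights up-weight the heavier unexposed patients until the weighted unexposed mean is also 32.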

We applied weighted Cox proportional hazards regression, incorporating the matching weights estimated by entropy balancing at each matching step, and accounted for the competing risk of all-cause death (except deaths due to incident cancer). The analysis estimated the relative risk of cancer between comparison groups. Missing data were minimal; multiple imputation with chained equations was performed, and estimates from six imputed datasets were combined using Rubin's rules. Subgroup analyses were performed by sex, age group, ethnicity (New Zealand European (NZE) vs. non-NZE), deprivation status, smoking status, obesity, and levels of clinical measurements (systolic blood pressure (SBP), total cholesterol (TC), low-density lipoprotein cholesterol (LDL), and eGFR). A test of interaction was used to investigate whether the impact of T2D onset on the risk of cancer differed across subgroups. Analyses were conducted using Stata/MP version 17.0 (StataCorp LLC), and statistical significance was set at P < 0.05 (two-tailed).
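For readers unfamiliar with how estimates from multiply imputed datasets are combined, Rubin's rules can be sketched as below. The function name is hypothetical and any input values used with it are illustrative; in a setting like this one, the inputs would be the log hazard ratio and its variance from each imputed dataset.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates: list, variances: list) -> tuple:
    """Pool per-imputation estimates and variances using Rubin's rules.

    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)
```

The `(1 + 1 / m)` factor inflates the between-imputation variance to reflect the use of a finite number of imputations, here m = 6.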

Results

A total of 26,794 patients with IGT from the DCSS (1994–2018) were initially included in the study. Participants with a history of the outcome, death, or loss to follow-up between the enrolment date and the landmark time point were excluded. Through 10 matching steps, matched cohorts of patients with and without the onset of T2D were created for the 1-, 2-, 3-, 4-, and 5-year landmark analyses (Supplemental Fig. 2–6). The numbers of participants in the exposed vs. unexposed groups were 112 vs. 1,435, 254 vs. 2,872, 385 vs. 3,818, 477 vs. 3,922, and 511 vs. 3,336, respectively.

Supplemental Table 1 and Table 1 display the characteristics of individuals with IGT with and without the onset of T2D before and after matching. After tapered matching, particularly entropy matching, no significant differences were detected in the variables included in the matching process between patients with IGT with and without the onset of T2D for all landmark analyses (Table 1), indicating successful matching.

Table 1 Comparison of patients with and without the onset of type 2 diabetes among patients with impaired glucose tolerance in the final entropy balancing matched cohorts

Table 2 presents the results of the landmark analyses, which show that the 5-year risk of cancer decreased over time for both the exposure and non-exposure groups. In the exposure group, the 5-year risk decreased from 22.92 (95% confidence interval: 11.44–41.00) per 1,000 person-years at the 1-year landmark analysis to 3.19 (1.38–6.29) at the 5-year landmark analysis. Similarly, in the non-exposure group, the 5-year risk decreased from 10.55 (8.10–13.49) to 2.56 (1.84–3.46) over the same period.

Table 2 5-year and 10-year rates of any cancer compared between people with impaired glucose tolerance with and without the onset of type 2 diabetes after coarsened exact matching for the 1- to 5-year landmark analyses

For the 10-year risk of cancer, the results were similar, with both the exposure and non-exposure groups showing a decreasing risk over time. Specifically, the 10-year risk in the exposure group decreased from 21.24 (11.31–36.32) per 1,000 person-years at the 1-year landmark analysis to 7.51 (4.95–10.91) at the 5-year landmark analysis. In the non-exposure group, the 10-year risk decreased from 11.51 (9.16–14.26) to 4.46 (3.61–5.44) over the same period.

After final step-10 matching, the final adjusted hazard ratios (HRs) for the 5-year risk of cancer comparing individuals with and without the onset of T2D decreased over the length of the landmark periods from 1.51 (0.68–3.32) at the 1-year landmark analysis to 1.16 (0.61–2.20) at the 5-year landmark analysis (see Fig. 1 and Supplemental Fig. 7). For the 10-year risk of cancer, the final adjusted HRs comparing individuals with and without the onset of T2D increased over the length of the landmark periods from 1.21 (0.76–1.92) at the 1-year landmark analysis to 1.35 (1.09–1.68) at the 5-year landmark analysis (see Fig. 1 and Supplemental Fig. 8).
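As a reading aid, a reported hazard ratio and its 95% confidence interval can be converted back to an approximate standard error, z statistic, and two-sided P value. The sketch below assumes the interval is a symmetric Wald interval on the log scale, which the text does not state explicitly, and the results are approximate because the published values are rounded. The function name is hypothetical.

```python
import math


def wald_from_ci(hr: float, lower: float, upper: float,
                 z_crit: float = 1.959964) -> tuple:
    """Recover SE(log HR), the z statistic, and the two-sided P value from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p
```

Applied to the values reported above, the 10-year HR at the 5-year landmark, 1.35 (1.09–1.68), gives z of roughly 2.7 and P below 0.05, whereas the 5-year HR at the same landmark, 1.16 (0.61–2.20), gives z of roughly 0.45 and P well above 0.05, consistent with the interval including 1.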

In the stratified 5-year landmark analysis, no significant association between the onset of T2D and the 5-year risk of cancer was found (see Fig. 2). For the 10-year cancer risk, the adjusted HR was significantly higher in patients aged less than 57 years, those of non-NZE ethnicity, the most deprived, smokers, obese individuals, and those with higher SBP, HbA1c, and TC levels and lower eGFR (see Fig. 2).

Discussion

The present study investigated the association between the development of T2D within 1 to 5 years and the 5- and 10-year risk of cancer using landmark analysis in a population with impaired glucose tolerance in Auckland, NZ. The results showed that the final adjusted hazard ratios (HRs) for the 10-year risk of cancer, comparing individuals with and without the onset of T2D, increased from 1.21 at the 1-year landmark analysis to 1.35 at the 5-year landmark analysis. Our stratified analysis further highlights the importance of cancer prevention measures in the IGT population, such as smoking cessation, lifestyle modification, and screening strategies, especially in individuals who are male, younger than 57 years, from non-NZE ethnicities, and more deprived. These findings have important implications for the management of IGT patients and emphasize the need for targeted cancer prevention interventions in this population.

Previous studies have investigated the association between IGT and the risk of cancer in the Swedish general population, and the association between IGT and the risk of cancer death in the Japanese general population [10], the Mauritian general population [9], the Finnish general population [8], the Italian general population [6], and the US general population [7], with varied estimates. A recent meta-analysis showed that individuals with IGT have a higher risk of cancer than those with normal glycaemic status (RR: 1.25, 95% CI: 1.02–1.53) [11]. However, the association between newly developed T2D and the risk of cancer among individuals with IGT has received little attention. To our knowledge, the present study is the first to investigate this association, and it suggests that individuals with IGT who develop T2D have a significantly higher risk of cancer, particularly over the long term (10-year risk), even after accounting for common confounding factors and the competing risk of death from other causes (excluding cancer-related death). Our findings suggest that the onset of T2D may play a role in the development of cancer in the population with IGT.

A potential interpretation of the observed increased risk among individuals with IGT who developed T2D is that T2D and cancer share common risk factors, such as obesity, socioeconomic deprivation, physical inactivity, and poor diet, which may increase the risk of both conditions [3, 23]. In the current study, BMI and socioeconomic deprivation have been largely ruled out by the tapered matching models. Information on ethanol consumption, a major factor in carcinogenesis that is linked to other unadjusted factors, was not available in the dataset. However, ethanol consumption is strongly associated with socioeconomic status and smoking status, which were balanced in the analysis and could serve as surrogates for it. Data on physical activity and diet/nutrition were also unavailable in the current study, and further studies are needed to test their impact on the association. Another possible explanation is that chronic hyperglycaemia and insulin resistance, which are significant characteristics among individuals with IGT who develop T2D, may promote cancer development by altering cellular metabolism and promoting the proliferation of cancer cells [24,25,26]. High levels of glucose and insulin can increase the activity of various growth factors, such as insulin-like growth factor-1 (IGF-1), which can stimulate the growth and survival of cancer cells [27, 28]. In addition, chronic inflammation, observed in the development of T2D, may also play a role in cancer development by promoting DNA damage and impairing the immune response to cancer cells [29, 30].

In the current study, we observed different trends in the adjusted HRs for the 5-year and 10-year cancer risks associated with the onset of T2D. The 5-year risk of cancer showed a decrease over the length of the landmark periods, while the 10-year risk of cancer showed an increase. One possible explanation for these differences is the varying impact of confounding factors over different follow-up durations. Shorter follow-up periods (e.g., 5 years) might not fully capture the long-term biological and environmental influences on cancer development associated with T2D. In contrast, longer follow-up periods (e.g., 10 years) may allow for the cumulative effects of T2D-related metabolic changes, such as chronic hyperglycaemia and insulin resistance, which can promote carcinogenesis over time. Moreover, the lack of statistical significance for the 5-year risk compared to the 10-year risk may be due to the shorter follow-up duration, which limits the number of cancer cases observed within this period. This could lead to wider confidence intervals and lower statistical power to detect significant differences. In contrast, the 10-year follow-up period provides a longer observation window, potentially increasing the number of incident cancer cases and thus enhancing the statistical power to detect significant associations. Future research should aim to collect detailed longitudinal data on lifestyle factors such as diet and physical activity and investigate their long-term interactions with T2D and cancer risk. This could help illustrate the mechanisms underlying the observed temporal differences and improve our understanding of cancer risk dynamics in individuals with T2D.

The main aim of the current study is to understand the association between the onset of T2D and the cancer risk within specific time-windows (5-year and 10-year in the current study), rather than time-to-event. The latter is beyond the scope of this study and could be affected by the recorded event time in the system, potentially introducing information bias. Future studies with accurate records of outcome time are warranted to predict the time to cancer following the development of T2D in the IGT population.

The findings of this study have significant clinical and public health implications. First, our results suggest that individuals with IGT who develop T2D have an increased risk of developing cancer, especially in the first 10 years following the onset of diabetes. Therefore, healthcare providers should be aware of this increased risk and take steps to screen for cancer and provide appropriate counselling to their patients [31]. Additionally, interventions to prevent the onset of T2D may also help to prevent the development of cancer in the IGT population [3]. From a public health perspective, our findings highlight the need for targeted screening strategies for cancer in individuals with IGT, especially those who develop T2D. This is particularly important for individuals from deprived or minority ethnic backgrounds, who were found to be at increased risk of developing cancer in our study. In addition to screening, efforts to promote lifestyle changes, such as smoking cessation and increasing physical activity, may also help to reduce the risk of cancer in this population [32, 33]. By identifying individuals at increased risk of cancer and providing appropriate screening and counselling, healthcare providers and public health officials can take steps to reduce the burden of cancer in this population [34].

Identifying immortal time bias is more challenging than identifying confounding by indication. For instance, the association between T2D onset and cancer risk in individuals with IGT could be distorted when comparing the survival of patients with and without T2D onset. If the original index date were replaced with the T2D onset time, the focal population would by definition have survived from the index date to the T2D onset date, whereas the matched controls could still have been at an early stage after their index date when follow-up started. Although the index date was matched between patients with and without incident T2D, this did not ensure that the time from the index date to the date of T2D onset was comparable between the two groups. Patients who developed T2D were still more prone to a false survival advantage (being alive at the future cancer diagnosis) because they had to survive until the onset of T2D to be assigned as cases. Hence, without addressing immortal time bias by design (e.g., through the use of a landmark analysis), a biased estimate would unavoidably occur.

Our study possesses multiple strengths. Firstly, it is the largest multi-ethnic cohort with IGT studied in NZ and one of the largest globally used to investigate the association between the onset of T2D and 5- and 10-year cancer risk. The cohort encompassed all patients from the participating general practices and was linked to large, nationally representative databases to prospectively follow patients and record all new cancer cases. The accuracy of clinical recording and diagnoses was validated for outcomes defined by ICD codes, which exhibit high precision. Secondly, the use of landmark analysis at 1- to 5-year time points offered a robust methodology to address immortal time bias. Another key strength was the use of an innovative tapered matching technique to create "quasi-trial" comparison cohorts of patients with IGT with and without an onset of T2D, in order to compare 5- and 10-year cancer risk and assess how distinct confounding factors contributed to the risk of cancer. Despite these strengths, limitations exist, including the lack of national representativeness of the sample and participating general practices in NZ. In addition, information on certain risk factors for cancer, such as diet, physical activity, and genetic variants, was not available, and future studies should take these risk factors into account. Future studies with randomised controlled trials or more detailed observational data could further validate our results.

The fundamental challenge of pooling all cancer types lies in the variability of risk factors specific to each type of cancer. While it is well known that risk factors can vary widely between cancer types, subdividing the outcomes into individual cancer types would result in very few cases per category, thereby diminishing the statistical power and efficiency of the analysis. The primary aim of this study was to explore the association between the onset of T2D and the overall risk of cancer. By considering the development of T2D as a common risk factor for multiple cancers, we provide a broader perspective that can inform population-level screening strategies and public health interventions. The findings from our study highlight the importance of implementing preventative measures, such as additional emphasis on smoking cessation programs in primary care, specifically within the IGT population to mitigate the increased cancer risk associated with the onset of T2D. Future studies should consider more detailed causal inference methods to explore specific cancer types once sufficient data are available.

Conclusion

In conclusion, our study contributes new evidence to the association between the onset of T2D and cancer risk in individuals with IGT, emphasizing the need for increased awareness and targeted interventions to reduce this risk. Clinicians and public health practitioners should consider incorporating lifestyle modifications as part of cancer prevention and management strategies for individuals with IGT. Furthermore, future research should focus on confirming these findings and exploring the underlying mechanisms of the observed associations.