Introduction

Colorectal cancer (CRC) is the second leading cause of cancer death in the United States after lung and bronchus cancer, with more than 50,000 reported deaths in 2024. An estimated 150,000 new cases were diagnosed over the same period, with a slightly higher share among males than females1. Increased colorectal cancer screening and improved treatment strategies in the United States have contributed to a decline in overall incidence and mortality rates; however, a worrisome increase in the incidence of early-onset colorectal cancer (EOCRC), defined as CRC diagnosed in patients aged < 50 years, has been observed despite the decline in the overall incidence of CRC2,3,4. EOCRC presents with more restricted pathological features and symptoms than later-onset colorectal cancer (LOCRC); accordingly, patients are less likely to seek immediate medical care5. To our knowledge, the current upward trend has not been fully explained by the known environmental and genetic risk factors, and the precise epidemiology of EOCRC remains poorly understood. However, the current belief is that the epidemiological profile of EOCRC patients is distinct from that of LOCRC patients6. Owing to the rising incidence of EOCRC, research investigating its clinical outcomes and prognostic factors has recently surged. Previous studies have suggested potential sex-based differences in survival outcomes among CRC patients7. However, the findings of these studies were often conflicting, possibly because their loose inclusion criteria hindered drawing specific conclusions about this distinct population of CRC patients. Our study aims to investigate sex-specific differences in survival among EOCRC patients and to compare, separately for males and females, the predictors of survival in the United States using population-level data.

Methods

Study design

We conducted this retrospective, observational cohort study in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) standards8.

Data source

This investigation was carried out using the Surveillance, Epidemiology, and End Results (SEER) program9. We retrieved the data of cancer patients from the SEER 17 registries dataset, which covers roughly 26.5% of the US population, using the SEER*Stat software (version 8.4.3; https://seer.cancer.gov/seerstat/)10. Because this study uses anonymized SEER data, approval from an institutional review board was not required.

Population selection

Patients’ eligibility was determined based on the following inclusion criteria: (1) patients diagnosed with colorectal cancer (identified using the “site and morphology” recode based on the ICD-O-3/WHO 2008 definitions); (2) patients with microscopically confirmed malignant CRC; (3) patients with a known age at diagnosis between 20 and 49 years; and (4) patients diagnosed between 2000 and 2017 (the period with available SEER staging data). The exclusion criteria were as follows: (1) patients with incomplete survival times; (2) patients with an unknown cause of death; and (3) patients identified through death certificate or autopsy only. The flowchart illustrating the selection of EOCRC patients is depicted in Fig. 1.

Fig. 1

Flowchart of the patient screening process.
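The selection criteria listed above can be expressed as a simple record filter. The following is a minimal sketch only: the `Record` fields and the reporting-source labels are hypothetical names chosen for illustration and are not SEER*Stat variable names, and the actual selection in this study was performed within SEER*Stat.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical, simplified patient record (illustrative field names only)."""
    site_is_colorectal: bool          # site and morphology recode indicates CRC
    microscopically_confirmed: bool   # malignant CRC confirmed microscopically
    age_at_diagnosis: Optional[int]   # None when age is unknown
    year_of_diagnosis: int
    survival_months: Optional[int]    # None when survival time is incomplete
    cause_of_death_known: bool
    reporting_source: str             # e.g. "hospital", "death certificate only"


def is_eligible(r: Record) -> bool:
    """Apply the four inclusion and three exclusion criteria described in the text."""
    included = (
        r.site_is_colorectal
        and r.microscopically_confirmed
        and r.age_at_diagnosis is not None
        and 20 <= r.age_at_diagnosis <= 49
        and 2000 <= r.year_of_diagnosis <= 2017
    )
    excluded = (
        r.survival_months is None
        or not r.cause_of_death_known
        or r.reporting_source in {"death certificate only", "autopsy only"}
    )
    return included and not excluded
```

Keeping inclusion and exclusion as two separate expressions mirrors the structure of the flowchart, so the count removed at each step can be tallied independently.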

Study variables

The main variable examined in this study was the patients’ sex, classified by the SEER database as either male or female. Additionally, we collected the following variables for each patient included in our analysis: age at diagnosis, race, marital status at diagnosis, median household income, residential area, primary site of cancer, tumor histology (grouped according to the SEER histology broad groupings variable into adenocarcinomas (codes 8140–8389), cystic and mucinous tumors (codes 8440–8499), and all remaining histologies collectively as “Others”), tumor grade, cancer stage according to the SEER summary stage recode, and types of treatment modalities received. To avoid possible selection bias, all patients diagnosed with EOCRC who fit the inclusion criteria were included in the study regardless of missing baseline variables.

Study outcomes

The outcomes of interest for this study were overall survival (OS), defined as the duration between the time of CRC diagnosis and death from any cause or the date of last follow-up; cancer-specific survival (CSS), defined as the duration between the time of CRC diagnosis and death attributed to the index CRC; and noncancer-specific survival (NCSS), defined as the duration between the time of CRC diagnosis and death attributed to causes other than the patient’s index cancer. We determined the causes of death using the vital status recode, cause-specific death classification, and other cause-of-death classification provided by SEER.
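To make the three endpoints concrete, the sketch below derives an event indicator for each outcome from a patient’s vital status and cause of death. It assumes the conventional cause-specific approach, in which deaths from a competing cause are treated as censored at the time of death; the text does not state the censoring rule explicitly, and the string labels used here are hypothetical rather than SEER codes.

```python
from typing import Dict


def event_indicators(vital_status: str, death_cause: str) -> Dict[str, int]:
    """Return 1 (event) or 0 (censored) for OS, CSS, and NCSS.

    vital_status: "alive" or "dead" (hypothetical labels).
    death_cause:  "index_crc", "other", or "" when the patient is alive.
    """
    dead = vital_status == "dead"
    return {
        # OS: death from any cause counts as an event.
        "OS": int(dead),
        # CSS: only deaths attributed to the index CRC count as events.
        "CSS": int(dead and death_cause == "index_crc"),
        # NCSS: only deaths attributed to other causes count as events.
        "NCSS": int(dead and death_cause == "other"),
    }
```

Under this coding, a patient who dies of a non-CRC cause contributes an event to OS and NCSS but is censored in the CSS analysis.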

Statistical analysis

Categorical variables for males and females were described as frequencies and percentages and compared using Pearson’s chi-squared test. We used Kaplan–Meier survival curves to investigate the crude survival of different patient cohorts, and differences in survival were compared using the log-rank test. Furthermore, univariable and multivariable Cox proportional hazards models were used to calculate hazard ratios (HRs) and their respective 95% confidence intervals (CIs), characterizing the impact of different variables on survival outcomes. Additionally, we used propensity score matching (PSM) to match male and female patients on the baseline variables, in order to reduce possible biases and enhance the robustness of the study’s results. The matched cohort was then examined using the same survival analysis pipeline. Survival analyses were carried out using Jamovi software (version 2.3.28; https://www.jamovi.org/), while PSM analysis was conducted using R software (version 3.6.3; https://cran.r-project.org/). All tests were two-tailed, and a significance level of 0.05 was used.
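For readers less familiar with the crude survival methods named above, the following is a minimal, dependency-free sketch of the Kaplan–Meier estimator and the two-group log-rank test. It is an illustration of the standard textbook formulas, not the implementation used by Jamovi, and the function names are our own.

```python
import math
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather everyone whose follow-up ends at time t (events and censored).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def logrank_test(times_a, events_a, times_b, events_b) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-squared statistic, two-sided P value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])` steps down only at the three event times (1, 3, and 4), while the censored observations at times 2 and 5 reduce the number at risk without changing the curve.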

Results

Baseline characteristics

We retrieved and analyzed the data of 58,667 patients diagnosed with EOCRC from the SEER dataset. Out of the total cohort, 27,662 were females and 31,005 were males. We observed significant differences in the distribution of the two groups across all of the included baseline characteristics. Notably, males were diagnosed at an older age compared to females, and significant differences in race between the two groups were also observed. Males were more likely to be diagnosed with rectal cancer (42.2% vs 36.7%) while females had a higher percentage of left-sided colon disease compared to males (31.1% vs 26.6%). Additionally, differences in cancer stages were also observed with more males diagnosed with regional disease (40% vs 37.2%) and more females diagnosed with localized disease (34.8% vs 32%). Males were more likely to receive radiotherapy and chemotherapy, and less likely to have surgery compared to their female counterparts. Table 1 summarizes the baseline characteristics of patients with EOCRC.

Table 1 Baseline characteristics of patients diagnosed with early-onset colorectal cancer (EOCRC) from the SEER database.

Differences in survival between males and females

The 1-year, 3-year, and 5-year OS probabilities were 89.1%, 73.8%, and 65.7% for males and 91.3%, 76.9%, and 69.9% for females, respectively. The corresponding CSS probabilities were 89.9%, 75.3%, and 67.9% for males and 91.9%, 78%, and 71.5% for females. In terms of NCSS, the 1-year, 3-year, and 5-year survival probabilities were 99.14%, 98%, and 96.79% for males and 99.41%, 98.59%, and 97.78% for females. Kaplan–Meier survival estimates and log-rank tests demonstrated worse survival of males compared to females with regard to OS, CSS, and NCSS (P < 0.0001) (Fig. 2). In the multivariable Cox regression model, male sex was associated with worse OS (HR = 1.19, 95% CI 1.16–1.22, P < 0.001), CSS (HR = 1.15, 95% CI 1.11–1.28, P < 0.001), and NCSS (HR = 1.56, 95% CI 1.44–1.69, P < 0.001) (Table 2 and Supplementary Table 1).

Fig. 2

Kaplan–Meier curves of the general SEER cohort. Overall survival (A), cancer-specific survival (B), and noncancer-specific survival (C) in female and male early-onset colorectal cancer patients.

Table 2 Cox regression models for sex in the general cohort and the post-propensity score matching cohort.

Using PSM, we matched males and females in a 1:1 ratio to control for possible biases introduced by the study’s retrospective design. The PSM-matched cohort included 26,103 patient pairs, with no significant differences between the two groups (Supplementary Table 2). Survival in the matched cohort closely mirrored that of the general cohort, with males showing worse outcomes (Fig. 3). Compared to females, male sex continued to be associated with worse OS (HR = 1.18, 95% CI 1.15–1.21, P < 0.001), CSS (HR = 1.14, 95% CI 1.10–1.17, P < 0.001), and NCSS (HR = 1.58, 95% CI 1.45–1.71, P < 0.001) (Table 2 and Supplementary Table 3). These results are consistent with the conclusions drawn from the general cohort.
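As an illustration of how 1:1 matching on a propensity score can work, the sketch below pairs each male with the nearest unmatched female. It is a hedged example only: the text does not specify the matching algorithm or caliper used in R, so the greedy nearest-neighbour strategy, the `caliper` value, and the function name are assumptions, and the propensity scores are taken as given (in practice they would be estimated, for example, by logistic regression of sex on the baseline variables).

```python
from typing import List, Sequence, Tuple


def match_one_to_one(
    ps_male: Sequence[float],
    ps_female: Sequence[float],
    caliper: float = 0.05,  # assumed maximum allowed score difference
) -> List[Tuple[int, int]]:
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns (male_index, female_index) pairs; males with no female within
    the caliper remain unmatched and are dropped from the matched cohort.
    """
    available = set(range(len(ps_female)))
    pairs = []
    for i, score in enumerate(ps_male):
        if not available:
            break
        # Closest unmatched female; ties are broken by the lower index.
        j = min(available, key=lambda k: (abs(ps_female[k] - score), k))
        if abs(ps_female[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs
```

Matching without replacement means that each female can be used at most once, which is why a matched cohort can be smaller than either original group.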

Fig. 3

Kaplan–Meier curves of the post-PSM cohort. Overall survival (A), cancer-specific survival (B), and noncancer-specific survival (C) in female and male early-onset colorectal cancer patients.

We conducted a subgroup analysis across the strata of all baseline variables, comparing the survival of males and females with subsequent adjustment using multivariable models. Regarding OS, males had statistically significantly worse survival than females in all subgroups except for the American Indian/Alaska Native subgroup (P = 0.43), the < $35k median household income subgroup (P = 0.779), and the transverse colon subgroup (P = 0.67). Males also had statistically significantly worse CSS than females in all subgroups except for the Asian or Pacific Islander subgroup (P = 0.138), the American Indian/Alaska Native subgroup (P = 0.714), the < $35k median household income subgroup (P = 0.92), the transverse colon subgroup (P = 0.682), and the grade IV tumor subgroup (P = 0.056). Additionally, male sex was associated with worse NCSS in all subgroups except for the 20–29 age subgroup (P = 0.863), the American Indian/Alaska Native subgroup (P = 0.219), the < $35k median household income subgroup (P = 0.409), the transverse colon subgroup (P = 0.054), the cystic/mucinous histology subgroup (P = 0.106), and the grade III and grade IV tumor subgroups (P = 0.087 and P = 0.197, respectively) (Supplementary Table 4).

Predictors of survival in males and females

In multivariable analysis, the statistically significant factors affecting OS in male EOCRC patients were older age at diagnosis (40–49 years old vs 20–29 years old, HR = 1.18, P < 0.001), black race (vs white, HR = 1.26, P < 0.001), single and unmarried status (vs married, HR = 1.34, P < 0.001, and HR = 1.36, P < 0.001, respectively), a higher median income (> $75k vs < $35k, HR = 0.76, P = 0.001), tumor grades II, III, and IV (vs grade I, HR = 1.4, P < 0.001, HR = 2.18, P < 0.001, and HR = 2.5, P < 0.001, respectively), cystic/mucinous histology (vs adenocarcinoma, HR = 1.24, P < 0.001), regional and distant stages (vs localized stage, HR = 2.42, P < 0.001, and HR = 10.98, P < 0.001, respectively), and receiving radiotherapy, chemotherapy, and surgery (vs no/unknown status, HR = 1.09, P < 0.001, HR = 0.9, P < 0.001, and HR = 0.4, P < 0.001, respectively) (Table 3).

Factors significantly affecting CSS in males generally followed the same pattern as the predictors of OS, except for age, which was not associated with CSS (40–49 years old vs 20–29 years old, HR = 1.05, P = 0.253). Factors affecting NCSS were also comparable to those affecting OS, with some distinctions. A higher median income, regional stage, and tumor histology did not affect NCSS (> $75k vs < $35k, P = 0.119, regional vs localized, P = 0.123, and cystic/mucinous vs adenocarcinoma, P = 0.783, respectively). Moreover, rectal cancer was associated with NCSS (vs right-sided colon cancer, HR = 0.78, P = 0.002), and grade IV was the only tumor grade associated with NCSS (vs grade I, HR = 1.65, P = 0.017) (Table 3).

Table 3 Cox regression models for survival in male early-onset colorectal cancer (EOCRC) patients.

In females, the statistically significant factors affecting OS were older age at diagnosis (40–49 years old vs 20–29 years old, HR = 1.13, P < 0.001), black and Asian or Pacific Islander race (vs white, HR = 1.32, P < 0.001, and HR = 1.09, P = 0.013, respectively), single and unmarried status (vs married, HR = 1.29, P < 0.001, and HR = 1.23, P < 0.001, respectively), a higher median income ($50k–$75k vs < $35k, HR = 0.82, P = 0.036, and > $75k vs < $35k, HR = 0.77, P = 0.006), residence (nonmetropolitan vs metropolitan, HR = 1.15, P < 0.001), transverse colon and rectal tumors (vs right-sided colon tumors, HR = 1.1, P = 0.041, and HR = 0.9, P = 0.001, respectively), tumor grades II, III, and IV (vs grade I, HR = 1.63, P < 0.001, HR = 2.59, P < 0.001, and HR = 2.72, P < 0.001, respectively), cystic/mucinous histology (vs adenocarcinoma, HR = 1.15, P < 0.001), regional and distant stages (vs localized stage, HR = 3.04, P < 0.001, and HR = 15.11, P < 0.001, respectively), and receiving radiotherapy and surgery (vs no/unknown status, HR = 1.09, P < 0.001, and HR = 0.4, P < 0.001, respectively) (Table 4).

Factors significantly affecting CSS in females generally followed the same pattern as the predictors of OS, except for age at diagnosis and median income, which were not associated with CSS. In contrast to OS, NCSS was not related to area of residence, tumor grade (except for grade III), or radiotherapy. However, chemotherapy was associated with better NCSS (vs no/unknown status, HR = 0.62, P < 0.001) (Table 4).

Table 4 Cox regression models for survival in female early-onset colorectal cancer (EOCRC) patients.

Discussion

The evidence on sex differences in CRC survival is conflicting. Some studies suggested that females have better survival than males11,12,13,14, whereas others did not show significant survival differences15,16. Several studies have assessed sex-related differences in prognosis and survival among older and later-onset CRC patients14,17,18,19; in contrast, the current research focuses on sex differences in survival among EOCRC patients and on sex-specific predictors of survival in recent years in the United States. We found that men with EOCRC have worse survival outcomes than women. Kaplan–Meier survival estimates and log-rank tests showed worse survival of males compared to females in terms of OS, CSS, and NCSS (P < 0.0001). Additionally, multivariable Cox regression models confirmed the association between male sex and worse survival outcomes, and the analysis of matched cohorts showed similar results. These findings were consistent across most subgroups and cancer stages.

Majek et al. conducted a population-based analysis of 164,996 CRC patients in Germany. They found that age-adjusted 5-year survival was higher in females than in males (64.5% vs 61.9%, P < 0.0001). Notably, the survival advantage in women was highest in patients < 45 years old. In a multivariable analysis, women continued to show a survival advantage over men, even after adjusting for CRC stage and subsite, in patients < 56 years old, but not in older patients14. Similarly, Yang et al. conducted a meta-analysis of studies reporting survival differences between male and female CRC patients. It showed that females had significantly longer OS (HR: 0.87; 95% CI 0.85–0.89) and CSS (HR: 0.92; 95% CI 0.89–0.95) than males18. Interestingly, our study revealed that male sex was adversely associated with OS, CSS, and NCSS. Therefore, we suggest that the poor prognosis is not only cancer-related but may also be non-cancer-related. These findings are supported by the work of Samawi et al., who reached a comparable conclusion regarding OS in early-stage CRC patients. In a multivariable analysis, men had worse OS (HR: 1.38; 95% CI 1.15–1.64) and recurrence-free survival (HR: 1.40; 95% CI 1.18–1.67) compared to women. However, when the researchers disregarded non-cancer causes of death, CRC outcomes appeared similar in both sexes. Additionally, they did not find significant sex-related differences in CSS19.

In contrast, a cross-sectional study conducted in the UK revealed that males had slightly better 1-year survival than females, while 5-year survival appeared similar between the sexes17. Although we found consistently worse OS, CSS, and NCSS for male sex across all cancer stages, White et al. showed inconsistent results. They demonstrated that 1-year survival was similar in both sexes for patients diagnosed at stages I and II, while females had a survival advantage in stages III and IV. Moreover, they reported comparable 5-year survival for males and females diagnosed at stages I, III, and IV, yet females had better survival for stage II17. However, these contradictory results may be due to White et al. adjusting the data for age only, without adjustment for demographic factors such as ethnicity and socioeconomic status, which may be connected to CRC detection and outcomes, including survival20,21,22,23. A recently published population-based study using the SEER database revealed that metastatic EOCRC patients had longer survival than metastatic late-onset CRC patients (P < 0.0001). In line with our findings, Ren et al. showed that females had better survival than males among metastatic EOCRC patients (P < 0.001). However, they did not find a significant difference in metastatic late-onset CRC (P = 0.57). They also concluded that sex-related differences in metastatic CRC survival depend on patients’ age4.

CRC incidence and mortality rates are reported to be higher in men than in women24. Molecular and genetic factors, sex hormones, and lifestyle may contribute to the more favorable survival in females than in males25,26. The increased vulnerability of males can be partially attributed to worse health choices, such as higher rates of smoking27 and heavier alcohol consumption28 compared to females. Moreover, men are more inclined to consume a fatty diet and processed meat29. They also tend to develop visceral obesity30, which is considered a potential risk factor for CRC. A meta-analysis showed that CRC risk increases by 7% for each 2 kg/m2 increase in body mass index (BMI) and by 4% for each 2 cm increase in waist circumference31, which is consistent with the findings of Bassett et al.32. The accumulation of these risk factors in males might explain the worse NCSS observed in our study, owing to the increased risk of non-cancer mortality due to worse health choices.

Li et al. also proposed that sex differences can be attributed to male-specific genes that are carried on the Y chromosome and can be a determinant of CRC hallmarks and outcomes. They generated a murine CRC model engineered with an inducible transgene encoding a mutant KRAS oncogene (KRAS*) and conditional null alleles of the Apc and Trp53 tumor suppressors. They found higher metastasis rates and shorter survival in males compared to females. Furthermore, molecular and transcriptomic analysis revealed that KRAS* mediated activation of the Signal Transducer and Activator of Transcription 4 (STAT4) transcription factor, leading to upregulation of the histone demethylase gene Lysine Demethylase 5D (KDM5D), which is encoded on the Y chromosome. In turn, these transcriptomic changes repress genes regulating cell–cell junction integrity and CD8+ T cell anti-tumor function. Interestingly, KDM5D deletion from cancer cells may decrease cancer invasiveness and enhance CD8+ T cell killing activity; hence, it can be a promising therapeutic approach for male CRC patients expressing mutant KRAS33. Adding to the possible role of the immune system in EOCRC, a study by Ugai et al. examined immune cell profiles in CRC patients. In comprehensive immunologic analyses following surgical resection, EOCRC was characterized by lower levels of tumor-infiltrating lymphocytes and of intratumoral, periglandular, and peritumoral lymphocytic reaction. These findings underscore the importance of immune cell profile analysis based on age at diagnosis to better understand the unique pathogenesis of CRC in young adults34. Furthermore, some studies identified sex as a considerable determinant of immunity35. For instance, estrogen has been identified as a regulator of the immune microenvironment of liver metastasis36; meanwhile, Schalper et al. found that estradiol (E2) increases programmed death-ligand 1 (PD-L1) expression in breast and endometrial cancer, thereby allowing cancer to escape immunosurveillance37. In this context, the IMMUNOREACT 5 trial investigated the difference in the rectal mucosa immune microenvironment between the sexes. The investigators observed that male patients have more mutations of the SYNE1 and RYR2 oncogenes, associated with lower expression of genes mediating T-cell activation. On the other hand, healthy female mucosa had more Th1 and cytotoxic T cells, suggesting a possibly better immune response against tumor cells38.

Lin et al. showed that a high testosterone level was associated with a decreased risk of CRC in men (RR 0.62; 95% CI 0.40–0.96). However, there is an inverse association between the estradiol-to-testosterone ratio and CRC in postmenopausal women39. This could possibly explain the findings of Ren et al. that EOCRC patients survived eight months longer than LOCRC patients4, supporting the view that estrogen may have a protective effect in CRC. Furthermore, researchers found that estrogen protects against microsatellite instability (MSI) and gene methylation in colon tumors. Therefore, females are less likely than males to develop MSI+ colon cancer at a younger age during their reproductive period, while being at higher risk of developing unstable tumors at older ages owing to the reduction of estrogen levels40. Still, the benefits of hormonal replacement therapy, particularly in postmenopausal women, may be offset by the risks of breast cancer, cardiovascular diseases, and thromboembolism41. Estrogen receptor 1 (ESR1) is primarily expressed in breast cancer and promotes metastasis. Unlike ESR1, estrogen receptor 2 (ESR2) is expressed in CRC and is associated with tumor suppression. Liu et al. demonstrated that the WAP Four-Disulfide Core Domain 3 (WFDC3) gene regulates ESR2 which, in turn, represses Transforming Growth Factor Beta Receptor 1 (TGFBR1) and inhibits CRC metastasis42. Hence, targeting the ESR2 pathway can be a promising therapeutic approach42. However, further investigations are needed to understand the underlying mechanism of this pathway and to develop new therapeutic targets.

Previously reported evidence may explain our findings of worse survival in male patients across most subgroups, owing to behavioral, genetic, and immunological factors, as well as sex hormones. Our analysis confirmed that sex is an independent prognostic factor in EOCRC. Sex differences may reflect a fundamental difference between the sexes in terms of pathogenesis and response to treatment. We therefore recommend further controlled clinical trials that assess responses to different interventions in males and females separately. Moreover, evaluating biological biomarkers may direct treatment decisions and even predict survival. Overall, we believe that understanding the epidemiological and molecular basis of sex differences in EOCRC will enable targeted and precision medicine, thereby reducing the emerging burden of EOCRC on communities. Current therapeutic guidelines recommend the use of aggressive treatment regimens for EOCRC patients43,44. A recent multi-center analysis showed that EOCRC patients benefit at least as much as, or even more than, older-onset CRC patients from CRC-directed treatment modalities45.

To our knowledge, the current research is the first to investigate sex differences in EOCRC survival with PSM analysis and sex-specific predictors of survival using US population-level data. Our analysis revealed that radiotherapy predicts better survival in both sexes. Other demographic and patient characteristics were almost comparable, with some exceptions. However, we think more research is needed to confirm the reproducibility of the results. There is also a need for a more comprehensive sex-specific prediction model that estimates survival according to the molecular and hormonal basis of the disease. Admittedly, our study has limitations, such as possible biases introduced by its retrospective nature. Our analysis did not take into consideration certain patient information, such as detailed medical history and comorbidities, social status, and impactful health habits, as these data were not available in the SEER database. Additionally, in the current study, we defined EOCRC as CRC diagnosed between the ages of 20 and 49, which may be inconsistent with some literature that used the age of 15 as a lower cut-off. Furthermore, we used the SEER staging variable, which is available for most of the patients in the database, to limit missing data and maintain a large sample size, thereby bypassing possible inconsistencies between different American Joint Committee on Cancer (AJCC) staging editions for patients in the SEER database. Future studies should account for AJCC staging, which is more widely accepted clinically.

Conclusions

To conclude, we found significant differences in baseline variables between females and males at the time of EOCRC diagnosis. Additionally, male sex was associated with worse OS, CSS, and NCSS in both the general cohort and the post-PSM dataset. Furthermore, this analysis found that the majority of prognostic factors impacting survival outcomes of males and females diagnosed with EOCRC are comparable except for some minor differences.