Background

Alcohol consumption has consistently been recognized as a risk factor for breast cancer [1, 2]. Previous epidemiologic studies of the association between alcohol consumption and breast cancer risk in the United States have largely been conducted in persons of European ancestry. An analysis of 5108 cases of invasive breast cancer from the African American Breast Cancer Epidemiology and Risk (AMBER) Consortium indicated that African American women who reported current drinking of ≥ 14 drinks per week had higher odds of invasive breast cancer compared with light drinkers (> 0 to < 4 drinks per week) [adjusted OR (AOR) (95% CI) = 1.33 (1.07–1.64)], while those who reported drinking ≥ 7 drinks per week had higher odds of human epidermal growth factor receptor 2 negative (HER2-) breast cancer [AOR (95% CI) = 1.36 (1.09–1.70)] [3].

There are many possible biological mechanisms for the association of alcohol consumption and breast cancer, including ethanol metabolism, increased levels of circulating estrogen, cellular proliferation, impact on DNA repair, and interference with the absorption and metabolism of nutrients such as folate and carotenoids [2, 4]. Alcohol metabolism pathways involve two key enzymes: alcohol dehydrogenase (ADH), which oxidizes alcohol to acetaldehyde—a reaction that produces a carcinogen as well as reactive oxygen species [5] that can damage DNA; and aldehyde dehydrogenase (ALDH), which converts acetaldehyde to acetic acid [6]. Other enzymes involved in ethanol metabolism include those in the microsomal ethanol-oxidizing system, Cytochrome P450 2E1 (CYP2E1) and catalase (CAT). CYP2E1 only plays a role in ethanol metabolism for heavy drinkers, while CAT metabolizes only a small proportion of alcohol consumed [6]. ADH is encoded by at least 7 genes in humans, which are clustered together on the long arm of chromosome 4, and, except for ADH7, are highly expressed in the liver [7]. ALDH enzymes are encoded by at least 19 genes spread across 12 chromosomes, five of which are highly expressed in liver tissue (ALDH1L1, ALDH2, ALDH4A1, ALDH6A1, and ALDH8A1) [8], though the two primary enzymes involved in acetaldehyde metabolism during ethanol oxidation are ALDH1 (encoded by ALDH1A1 on chromosome 9) and ALDH2 (encoded by ALDH2 on chromosome 12). CYP2E1 and CAT are also highly expressed in liver tissue.

Candidate studies of genes involved in alcohol metabolism have yielded mixed results regarding the association of single nucleotide polymorphisms (SNPs) and risk of breast cancer and the interaction of SNPs and alcohol consumption. Earlier genetic studies found that rs1229984 in ADH1B [9] and rs698 in ADH1C [10] modify the alcohol association with breast cancer, particularly in pre-menopausal women [10]. Studies of Korean women found a significant interaction for breast cancer risk between rs2031920 (CYP2E1*5) in CYP2E1 and rs671 in ALDH2 and alcohol intake [11, 12]. A study on post-menopausal women from the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial found significant interactions between rs1229984-GG in ADH1B and all levels of alcohol intake and risk of breast cancer [13]. Several other large studies, however, indicate that SNPs in ADH1B, ADH1C, and CYP2E1 do not modify associations of alcohol intake with breast cancer risk [14,15,16,17].

Genome-wide association studies (GWAS) have identified a number of low-risk alleles for breast cancer, though none have directly implicated genes related to alcohol metabolism [18], and generally did not account for alcohol consumption [19,20,21,22]. Like many epidemiological studies of breast cancer, most GWAS and candidate gene studies to date have predominantly included individuals of European or Asian ancestry [9, 13, 23,24,25,26,27,28,29,30,31,32]. To fill this important gap in knowledge regarding potential disparities, we examined the association of ethanol metabolism pathway genetic variants and SNP-alcohol consumption interactions with odds of breast cancer (overall and among cancers with different hormone receptor status) using data from the African American Breast Cancer Epidemiology and Risk (AMBER) Consortium.

Methods

The AMBER Consortium is a collaboration among the largest etiologic studies of breast cancer in U.S. Black women. For the present analysis, data were drawn from two case–control studies: the Carolina Breast Cancer Study (CBCS) [33] and the Women’s Circle of Health Study (WCHS) [34, 35]; and the prospective cohort Black Women’s Health Study (BWHS) [36, 37]. All study participants provided informed consent prior to participation, and all studies obtained IRB approval from their respective institutions. A total of 3663 cases and 4687 controls provided either blood or saliva for DNA analysis by the AMBER consortium.

For CBCS, population controls were identified through Division of Motor Vehicles (age < 65 years) or Health Care Financing Administration lists (age ≥ 65). For WCHS, population controls were recruited through random digit dialing and community events. The BWHS was sampled as a nested case–control study from the parent cohort, with controls matched to breast cancer cases on 5-year age group, geographic location, and most recent questionnaire completed [38]. Cases were women diagnosed with incident invasive breast cancer or ductal carcinoma in situ (DCIS). Estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor-2 (HER2) oncogene expression status were determined using hospital or cancer registry pathology records.

Genotyping and quality control

Genotyping of DNA from participants in the BWHS, CBCS, and WCHS for the AMBER Consortium was performed in two phases. In our Phase 1 discovery (2014), genotyping of 6860 AMBER participants was completed by the Center for Inherited Disease Research (CIDR) at Johns Hopkins University using the Illumina Human Exome Beadchip v1.1. This array includes > 240,000 coding variants: > 4500 variants in genes influencing drug metabolism, > 4400 immune system function variants, > 4500 GWAS variants, > 700 eQTLs, as well as > 200,000 variants from the COSMIC catalog of somatic mutations in cancer. AMBER custom content included 160,440 SNPs in 433 genes in breast cancer relevant pathways (e.g., fibroblast growth factor receptor 2 (FGFR2), steroid hormone metabolism, and vitamin D). Of the 246,519 SNPs genotyped, 231,705 autosomal SNPs passed quality control (QC); SNPs were excluded for call rates < 0.98, Hardy–Weinberg equilibrium p < 1 × 10^−4, or > 2 discordant calls in duplicate samples. A total of 6828 AMBER participants passed QC, including 3130 cases and 3698 controls. Imputation of the genotype data to the 1000 Genomes Phase 1 reference panel was performed by the University of Washington using IMPUTE2 [39, 40]. Measured and imputed genotypes from three AMBER studies (CBCS, BWHS, and WCHS) were combined into a final Phase 1 discovery data set of up to 1847 women, including 597 cases and 1250 controls who reported being current drinkers and had phenotype data. Principal components were calculated using EIGENSOFT [41, 42] based on ~ 42,000 common SNPs. Principal components associated with case status (p < 0.1) after controlling for study, DNA source, and matching variables were included in our analyses [43].

In our Phase 2 validation, genome-wide genotyping of DNA samples from 4085 BWHS, CBCS, and WCHS participants was performed by CIDR using the Illumina Multi-Ethnic Global Array (MEGA) (2,036,060 SNPs) with addition of the panel of custom variants used in Phase 1. The MEGA and custom panel data were then combined for QC and analysis. After exclusion of duplicate samples, prevalent cases, and those missing questionnaire data, genotypes for 3999 AMBER participants passed initial CIDR QC and were sent to the Genetic Analysis Center (GAC) at the University of Washington for additional quality control. Cross-phase concordance was examined using data for 122 participants genotyped for 362,014 SNPs in both phases. The mean concordance across all overlapping samples was 0.9999. The mean concordance across all overlapping SNPs was 0.9999, with just 30 SNPs with > 10% discordance across phases, and 12 SNPs with > 90% discordance. Individuals with missing call rates > 0.03 were excluded from analyses. Genotyped SNPs with call rates < 0.98, Hardy–Weinberg equilibrium p < 1 × 10^−4, or > 1 discordant call in duplicate samples, as well as monomorphic SNPs and those with MAF < 0.01 across all samples, were excluded, leaving 1,182,802 genotyped SNPs. All 3999 Phase 2 samples were then imputed by the GAC using the 1000 Genomes Phase 3v4 cosmopolitan reference panel in IMPUTE2 for SNPs with a minor allele count (MAC) ≥ 2 in either African or European 1000 Genomes super populations, resulting in 34,296,243 imputed variants. For this analysis, measured and imputed genotypes from two AMBER studies (BWHS and WCHS) were combined into a final Phase 2 data set of up to 1042 women, including up to 118 invasive cases and 924 controls, who reported being current drinkers and had phenotype data. CBCS participants genotyped under Phase 2 were not included in the analysis due to a lack of genotyped controls.
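
The Phase 2 SNP-level exclusion rules described above can be sketched as a simple filter. This is only an illustration of the stated thresholds, not the pipeline used by CIDR or the GAC; the `SnpQc` record and the `passes_phase2_qc` function are hypothetical names, and the per-SNP statistics are assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpQc:
    """Per-SNP QC summary (hypothetical record; values assumed precomputed)."""
    call_rate: float                 # fraction of samples with a genotype call
    hwe_p: float                     # Hardy-Weinberg equilibrium test p-value
    duplicate_discordant_calls: int  # discordant calls among duplicate samples
    maf: float                       # minor allele frequency across all samples


def passes_phase2_qc(snp: SnpQc) -> bool:
    """Apply the Phase 2 exclusion thresholds stated in the text."""
    if snp.call_rate < 0.98:
        return False
    if snp.hwe_p < 1e-4:
        return False
    if snp.duplicate_discordant_calls > 1:
        return False
    if snp.maf < 0.01:  # also removes monomorphic SNPs (maf == 0)
        return False
    return True
```

For Phase 1, the text gives a looser duplicate-discordance rule (> 2 discordant calls) and does not mention a MAF filter, so the thresholds would need to be adjusted accordingly.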

Analytic sample

For analysis, we selected all genotyped and imputed variants in a ± 20 kb window of four known alcohol metabolism genes or clusters (ADH on chromosome 4, ALDH1 on chromosome 9, CYP2E1 on chromosome 10, and ALDH2 on chromosome 12), resulting in 18,792 variants in Phase 1 discovery and 46,519 variants in Phase 2 validation after QC exclusions, and up to 23,247 variants in the meta-analysis. Among the 3122 participants who reported being current drinkers at time of diagnosis and had genetic data from either phase, we excluded those with DCIS (n = 172) or for whom the nature of the tumor was unknown (n = 60) and the single CBCS control genotyped in Phase 2 who reported being a current drinker, leaving 2889 participants (715 cases and 2174 controls) for our Phase 1 + Phase 2 meta-analyses.
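
As a rough sketch of the regional selection step, the function below keeps variants that fall within a ± 20 kb flank of a gene or gene cluster. The `Region` and `Variant` types, the function names, and the coordinates in the example are all hypothetical placeholders; real gene boundaries depend on the genome build and annotation used, which the text does not specify.

```python
from typing import List, NamedTuple, Sequence

FLANK_BP = 20_000  # +/- 20 kb window, as stated in the text


class Region(NamedTuple):
    name: str
    chrom: str
    start: int  # gene or cluster start (bp)
    end: int    # gene or cluster end (bp)


class Variant(NamedTuple):
    variant_id: str
    chrom: str
    pos: int


def in_region(variant: Variant, region: Region, flank: int = FLANK_BP) -> bool:
    """True if the variant lies inside the region extended by `flank` on each side."""
    return (
        variant.chrom == region.chrom
        and region.start - flank <= variant.pos <= region.end + flank
    )


def select_variants(
    variants: Sequence[Variant], regions: Sequence[Region], flank: int = FLANK_BP
) -> List[Variant]:
    """Keep variants that fall in at least one flanked region, preserving order."""
    return [v for v in variants if any(in_region(v, r, flank) for r in regions)]


# Placeholder coordinates for illustration only; these are NOT real gene positions.
EXAMPLE_REGIONS = [Region("locus_A", "4", 1_000_000, 1_300_000)]
EXAMPLE_VARIANTS = [
    Variant("v1", "4", 985_000),    # 15 kb upstream of start: kept
    Variant("v2", "4", 975_000),    # 25 kb upstream of start: dropped
    Variant("v3", "4", 1_310_000),  # 10 kb downstream of end: kept
    Variant("v4", "9", 1_100_000),  # different chromosome: dropped
]
```

Calling `select_variants(EXAMPLE_VARIANTS, EXAMPLE_REGIONS)` on the placeholder data keeps `v1` and `v3` only.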

Hormone receptor status

In this analysis, we examined only invasive breast cancers, and further subdivided them by hormone receptor status: ER positive (393 cases), ER negative (275 cases), PR positive (321 cases), PR negative (345 cases), HER2 positive (114 cases), HER2 negative (458 cases), triple negative (163 cases), and non-triple negative (465 cases).

Alcohol intake assessment

Alcohol intake was self-reported for each study via questionnaire. BWHS assessed type of alcohol consumption (drinks per day, week, and month) at baseline, with amount consumed assessed in bi-annual follow-up surveys. CBCS assessed alcohol consumption amount by age range (< 25, 25 to 49, and ≥ 50) for each participant, and WCHS recorded amount of alcohol consumed (drinks per day) for each decade of life. To increase power for GxE analyses, we created a categorical variable distinguishing participants who consumed < 7 drinks per week from those who consumed ≥ 7 drinks per week, approximating alcohol exposure according to the NIAAA definition of moderate alcohol consumption for women [44]. Additional file 1: Table S1 shows the number of invasive cases and controls for each phase of the AMBER study, as well as the total number of cases and controls.

Statistical analyses

We first examined the association between alcohol consumption (< 7 vs. ≥ 7 drinks per week) on the odds of breast cancer for each hormone receptor status in the combined sample (Phase 1 + Phase 2), using a logistic regression model adjusted for 10-year age categories and study (Model 1), followed by a model additionally adjusted for age at menarche, body mass index (BMI), parity, smoking status (never, current, or former smoker), menopausal status (pre- or post-), level of education, and duration of oral contraceptive use (Model 2) [3].

Single variant association analyses were conducted separately in each phase assuming an additive genetic model using weighted generalized estimating equations (GEE) logistic regression as implemented in SUGEN to account for relatedness in the sample [45]. We estimated SNP main effects, SNP* ≥ 7 drinks per week interaction effects, and joint two degree of freedom (2df) (SNP + SNP*alcohol interaction) effects. The 2df test jointly tests the null hypothesis that both the SNP main effect and SNP* ≥ 7 drinks per week interaction effect are equal to zero [46,47,48,49,50]. Previous GxE GWAS demonstrated the utility of a joint genetic plus interaction test to identify variants where the environmental exposure may not have opposite directions of effect on the outcome, but does show a difference in magnitude of effect between exposure strata [47,48,49]. We ran case–control models among (1) all women (597 invasive cases and 1250 controls in Phase 1 and 118 invasive cases and 924 controls in Phase 2), and (2) each hormone receptor status and all controls. In Phase 1, models were adjusted for age group (by ~ 10-year intervals), study site, geographic region, DNA source, and principal components, and used a weighted variance estimator to account for relatedness among the ~ 250 2nd degree or closer relatives in Phase 1. Phase 2 models were adjusted for age group (by ~ 10-year intervals), study site, DNA source, and principal components. Phase-specific results were excluded if imputation quality was low (r2 < 0.4) or if the effective sample size among cases was small (effNcase < 10), calculated per SNP as (2*MAFcase)*(1-MAFcase)*Ncase*imputation quality. To increase power to detect potential interaction effects (Additional file 2), phase-specific results were then meta-analyzed using a fixed effects inverse variance weighted meta-analysis implemented in METAL [51].
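
The per-SNP filter, the joint 2df test, and the fixed effects meta-analysis described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SUGEN or METAL implementation: the function names are hypothetical, the joint test is written as a standard Wald test (the text does not say which test statistic SUGEN uses), and the meta-analysis uses the textbook inverse variance weighted formulas.

```python
import math
from typing import Sequence, Tuple


def effective_n_cases(maf_case: float, n_case: int, imputation_quality: float) -> float:
    """effNcase = (2 * MAFcase) * (1 - MAFcase) * Ncase * imputation quality."""
    return 2.0 * maf_case * (1.0 - maf_case) * n_case * imputation_quality


def keep_result(maf_case: float, n_case: int, imputation_quality: float,
                min_r2: float = 0.4, min_eff_n: float = 10.0) -> bool:
    """Keep a phase-specific result only if r2 >= 0.4 and effNcase >= 10."""
    eff_n = effective_n_cases(maf_case, n_case, imputation_quality)
    return imputation_quality >= min_r2 and eff_n >= min_eff_n


def joint_2df_wald(beta_snp: float, beta_int: float,
                   var_snp: float, var_int: float, cov: float) -> Tuple[float, float]:
    """Wald test of H0: beta_snp = beta_int = 0 (assumed form of the 2df test).

    Returns (statistic, p-value). For a chi-square with 2 df the survival
    function is exactly exp(-x / 2), so no statistics library is needed.
    """
    det = var_snp * var_int - cov ** 2
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    stat = (beta_snp ** 2 * var_int
            - 2.0 * beta_snp * beta_int * cov
            + beta_int ** 2 * var_snp) / det
    return stat, math.exp(-stat / 2.0)


def ivw_meta(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Fixed effects inverse variance weighted pooled estimate and standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def odds_ratio_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Convert a log-odds estimate to an OR with an approximate 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```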
To account for multiple comparisons, we determined the total number of independent variants in each locus using the linkage disequilibrium (LD) pruning function in Plink 1.9 with the following parameters: variant window size = 50, variant window shift = 5, and r2 = 0.1. We identified 334 independent variants in the ADH locus on chromosome 4, 253 in the ALDH1 locus on chromosome 9, 178 in the CYP2E1 locus on chromosome 10, and 209 in the ALDH2 locus on chromosome 12, and calculated the p-value threshold required for statistical significance as 0.05/number of independent variants per locus (chr 4: p < 1.5 × 10^−4, chr 9: p < 1.98 × 10^−4, chr 10: p < 2.81 × 10^−4, chr 12: p < 2.39 × 10^−4). Genetic effects of variants reaching locus-specific statistical significance in either the interaction or joint 2df test were analyzed separately for participants who reported drinking ≥ 7 drinks per week and those who reported drinking < 7 drinks per week.
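
The locus-specific thresholds follow directly from the independent-variant counts reported above. The short sketch below reproduces that arithmetic; the dictionary keys and the function name are illustrative labels, not identifiers from the study.

```python
# Independent-variant counts per locus after LD pruning, as reported in the text.
INDEPENDENT_VARIANTS = {
    "chr4_ADH": 334,
    "chr9_ALDH1": 253,
    "chr10_CYP2E1": 178,
    "chr12_ALDH2": 209,
}


def locus_threshold(n_independent: int, alpha: float = 0.05) -> float:
    """Significance threshold: alpha divided by the number of independent variants."""
    if n_independent <= 0:
        raise ValueError("n_independent must be positive")
    return alpha / n_independent


THRESHOLDS = {
    locus: locus_threshold(n) for locus, n in INDEPENDENT_VARIANTS.items()
}
```

The values in `THRESHOLDS` round to the thresholds stated in the text: about 1.50 × 10^−4, 1.98 × 10^−4, 2.81 × 10^−4, and 2.39 × 10^−4 for chromosomes 4, 9, 10, and 12, respectively.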

Results

In our sample of current drinkers, 21.0% of cases and 13.7% of controls reported consuming ≥ 7 drinks per week. Descriptive characteristics for participants from each study, including details of alcohol consumption, smoking status, use of oral contraceptives, age at menarche, parity, and menopausal status are provided in Additional file 1: Table S1.

Results of our analysis of the association between frequency of alcohol consumption among current drinkers and odds of breast cancer by hormone receptor status are shown in Additional file 1, Table 1. In both age-adjusted (Model 1) and multivariable models (Model 2), heavier alcohol consumption was associated with increased odds of any invasive breast cancer [multivariable OR (95% CI): 1.30 (1.00–1.68)], ER- breast cancer [OR (95% CI): 1.48 (1.01–2.14)], and PR- breast cancer [OR (95% CI): 1.50 (1.07–2.10)]; the comparable OR for ER+ breast cancer was 1.33 (0.98–1.82).

Table 1 Associations between alcohol consumption and breast cancer subtypes

In the genetic main effects meta-analysis, we identified multiple testing-corrected statistically significant per-allele associations (see Additional file 1: Table S3, for locus-specific significance thresholds) between rs79865122-C in the CYP2E1 locus on chromosome 10 and odds of ER- (OR (95% CI) = 0.21 (0.12, 0.40), pSNP = 9.91 × 10^−7), PR- (OR (95% CI) = 0.27 (0.15, 0.50), pSNP = 1.87 × 10^−5), and triple negative breast cancer (OR (95% CI) = 0.23 (0.11, 0.48), pSNP = 1.15 × 10^−4). This variant was also significantly associated with odds of ER- and PR- breast cancer using the joint test [ER- OR≥7dpw (95% CI) = 3.92 (0.28, 55.41), OR<7dpw (95% CI) = 0.24 (0.13, 0.44), pjoint = 3.74 × 10^−6; PR- OR≥7dpw (95% CI) = 1.77 (0.29, 10.87), OR<7dpw (95% CI) = 0.31 (0.17, 0.55), pjoint = 9.52 × 10^−5] (Additional file 1, Table 2). We also found a statistically significant SNP*alcohol consumption interaction between triple negative breast cancer and rs3858704-A in the ALDH2 locus on chromosome 12 [ORint (95% CI) = 6.28 (2.50, 15.73), pint = 8.97 × 10^−5]. Stratified analysis of the ALDH2 interaction revealed increased odds of triple negative breast cancer in women who consumed ≥ 7 drinks per week (OR (95% CI) = 4.41 (1.79, 10.86), p = 1.26 × 10^−3), and decreased odds in women who consumed < 7 drinks per week (OR (95% CI) = 0.57 (0.36, 0.89), p = 0.013). These associations were driven by Phase 1, as rs79865122 was filtered out of the Phase 2 results due to EffNcases < 2, and rs3858704 was not present in Phase 2. Using LD Link, we identified rs78085062 as in LD with rs3858704 in 1000 Genomes AFR + EUR populations (r2 = 0.986). This variant was also not present in our Phase 2 results, nor were 6 other variants with r2 > 0.7. Significant results for each phase are presented in Additional file 1: Table S2.

Table 2 Summary of association results for loci with locus-wide significance for interaction

We also examined known variants and proxies with r2 ≥ 0.8 in four alcohol metabolism gene regions (ADH, CYP2E1, ALDH1, and ALDH2) in our meta-analysis results for main genetic effects on invasive breast cancer. For each known variant extracted from the literature, we identified proxies in 1000 Genomes Phase 3 African and European populations using LD Link [52], and then extracted results for those known variants or their proxies from our meta-analysis of invasive cases. The variant or proxies with the lowest p-value are reported in Additional file 1, Table 3. For the ADH region on chromosome 4, we identified 14 known variants. Of those 14 known variants or proxies, rs2075633 was nominally significant in our meta-analysis results and directionally consistent in both phases [ORSNP (95% CI) = 1.37 (1.06, 16.85), pSNP = 0.016]. Known variants rs1229984, rs2066702, and rs698 in the ADH region were not present in AMBER, but we were able to identify proxies with r2 ≥ 0.8 in 1000 Genomes Phase 3 African ancestry populations for rs2066702 and rs698 in our data, though the associations were not statistically significant. In the ALDH1 region on chromosome 9, we identified 9 known variants previously examined for interaction with alcohol on mortality risk after breast cancer diagnosis [53]. None of these were statistically significant in our results. For the CYP2E1 region on chromosome 10, we identified three previously reported variants, but none were significantly associated with odds of invasive breast cancer in our sample. For rs2031920, this is consistent with some previous studies [16, 54]. One known variant in the ALDH2 region of chromosome 12, rs671, was not present in AMBER, and there were no proxies with r2 ≥ 0.8 in our data.

Table 3 Lookups of known alcohol risk variant (or proxies from 1000 Genomes Phase 3 AFR populations) associations with invasive breast cancer in the AMBER cohort

Discussion

Alcohol consumption has been associated with a moderately increased risk of breast cancer in women [2]. In addition, variants in genes involved in alcohol metabolism, such as ADH and ALDH, have been associated with an increased risk of cancer, including breast cancer, in some studies [2]. Previous investigations of alcohol gene and gene-exposure interaction have been primarily based on studies of individuals of European or Asian ancestry [9, 13, 23,24,25,26,27,28,29,30,31,32]. Our analysis of ethanol metabolism pathway genetic variants and SNP*alcohol consumption interactions and the odds of breast cancer was based on the African American Breast Cancer Epidemiology and Risk (AMBER) Consortium, a collaboration among four of the largest epidemiologic studies of breast cancer in African American women.

In the present analysis, we identified a positive association of consumption of ≥ 7 drinks per week with odds of all invasive breast cancer, as well as of ER- and PR- breast cancer. The interaction of rs3858704-A in the ALDH2 region of chromosome 12 with consumption of ≥ 7 drinks per week was significant in Phase 1, but neither that variant nor any proxies with r2 ≥ 0.7 were available in the Phase 2 data (Additional file 1: Table S2). We found a previously unreported association of rs79865122-C on chromosome 10 near CYP2E1 with odds of ER- and PR- breast cancer, including a statistically significant joint main plus interaction effect. CYP2E1 has been shown to have a lesser contribution to ethanol metabolism [55] than ADH or ALDH. At high ethanol concentrations, however, the ADH pathway becomes saturated, and activity of the microsomal ethanol oxidizing system (MEOS) pathway increases. As part of the MEOS, CYP2E1 metabolizes ethanol, and the process yields free radicals leading to oxidative stress [55]. In addition, elevated ethanol levels can interfere with the ability of CYP2E1 to metabolize other substrates, such as medications, resulting in reduced clearance and elevated drug concentrations [55]. Multiple polymorphisms of CYP2E1 have been identified, some of which are rare in populations of European ancestry, and some appear to have functional consequences on ethanol metabolism [11, 16]. One CYP2E1 restriction fragment length polymorphism (RFLP) (CYP2E1*1D) has been shown to have a higher prevalence in women of African ancestry, and a functional impact on in vivo metabolic activity in the presence of exposures known to increase expression of CYP2E1 [obesity or recent (within 72 h) alcohol consumption] [56]. We were unable to locate any previous publications including a specific analysis of alcohol metabolism gene and gene-alcohol consumption effects on breast cancer risk among Black women [9, 10, 14, 15, 17].

Our analysis focused on the four major human alcohol metabolism genes or gene clusters, including ADH, CYP2E1, ALDH1, and ALDH2. Variants in these genes have been previously investigated for influencing ethanol metabolism or modifying the effect of alcohol on breast cancer risk, including ADH1B*2 (rs1229984), ADH1B*3 (rs2066702), ADH1C*1 (rs698), CYP2E1*5 (rs2031920-T, rs3813867-C, and rs6413432-A), CYP2E1*6 (rs6413432-A) and ALDH2*2 (rs671) [53, 57]. Some of the variants show population frequency differences; for example, East Asian populations have a higher frequency of ADH1B*2 (rs1229984-T) and ALDH2*2 (rs671-A) than other populations [58]. The ADH1B*2 (rs1229984) and ALDH2*2 (rs671) variants are fixed or very low frequency (minor allele frequency (MAF) ≤ 1%) in 1000 Genomes Phase 3 populations of African ancestry, while the ADH1B*3 (rs2066702) and ADH1C*1 (rs698) variants are common (MAF 6.94–28.24%) in these groups. In our lookups of known variants with odds of invasive breast cancer in our meta-analysis results, we replicated the association at rs2075633 in the ADH region on chromosome 4, which was nominally significant in our meta-analysis results and directionally consistent in both phases [ORSNP (95% CI) = 1.37 (1.06, 16.85), pSNP = 0.016].

This study had several strengths including a focus on breast cancer risk among African American women, breast cancer hormone receptor status information, the use of the MEGA array in Phase 2, comprehensive coverage of genomic regions containing ethanol metabolism genes and gene clusters, and examination of gene-alcohol interactions using both interaction and joint main effects + interaction models.

The analysis also had some limitations. Although the Phase 2 chip included variants designed to capture the genetic variation of global populations [59], in contrast to earlier genotyping chips that were designed to capture genetic variation in European populations, only 118 cases and 924 controls were genotyped on the Phase 2 chip, as opposed to 597 cases and 1250 controls on the Phase 1 chip. The Illumina Human Exome Beadchip v1.1 was used for Phase 1 genotyping, resulting in fewer SNPs included in those analyses compared to Phase 2 (18,792 variants in Phase 1 versus 46,519 variants in Phase 2). In this analysis, we also focused primarily on genetic factors and have not placed this work in the context of multilevel determinants of breast cancer risk, including social determinants and structural racism [60,61,62,63]. Exposure status and information on other covariates were self-reported, which would increase the possibility of recall bias in the case–control studies, but not in the prospective BWHS. In addition, the number of women who reported consuming ≥ 7 drinks/week was limited (150 cases and 298 controls). This is consistent with reported exposure patterns, notably the lower frequency of heavy drinking in Black women compared to white women [64], which may have influenced our power in this study relative to similarly sized studies of white women (see Additional file 2). Finally, while we attempted to account for many other variables that may influence both breast cancer risk and alcohol consumption, including age at menarche, parity, menopausal status, BMI, smoking behavior, and use of oral contraceptives, others, such as diet and physical activity, were not included in our analysis.

Conclusions

In the present study, we examined the relationship of genetic variation in key ethanol metabolism genes with odds of breast cancer. We also evaluated interactions between alcohol intake and genetic variation on odds of breast cancer. As has been reported in a previous AMBER analysis [3], we identified a significant association between consumption of ≥ 7 drinks per week and ER- and PR- breast cancer in our minimally adjusted models, and the associations remained significant in our fully adjusted models. We found a statistically significant joint association of rs79865122 in the CYP2E1 region of chromosome 10 with ER- and PR- breast cancer, potentially pointing to genomic regions influencing the associations identified in our exposure-outcome models examining the impact of consumption of ≥ 7 drinks per week on odds of breast cancer. Although we used the largest available genetics resource relevant to breast cancer in Black women, additional research to further validate the interaction of genetic variants in alcohol metabolism genes and alcohol consumption on odds of breast cancer is warranted.