Abstract
Individuals with type 1 diabetes (T1D) carry a markedly increased risk of stroke, with distinct clinical and neuroimaging characteristics as compared to those without diabetes. Using whole-exome or whole-genome sequencing of 1,051 individuals with T1D, we aimed to find rare and low-frequency genomic variants associated with stroke in T1D. We analysed the genome comprehensively with single-variant analyses, gene aggregate analyses, and aggregate analyses on genomic windows, enhancers and promoters. In addition, we attempted replication in T1D using a genome-wide association study (N = 3,945) and direct genotyping (N = 3,263), and in the general population from the large-scale population-wide FinnGen project and UK Biobank summary statistics. We identified a rare missense variant on SREBF1 exome-wide significantly associated with stroke (rs114001633, p.Pro227Leu, p-value = 7.30 × 10–8), which replicated for hemorrhagic stroke in T1D. Using gene aggregate analysis, we identified exome-wide significant genes: ANK1 and LRRN1 displayed replication evidence in T1D, and LRRN1, HAS1 and UACA in the general population (UK Biobank). Furthermore, we performed sliding-window analyses and identified 14 genome-wide significant windows for stroke on 4q33-34.1, of which two replicated in T1D, and a suggestive genomic window on LINC01500, which replicated in T1D. Finally, we identified a suggestively stroke-associated TRPM2-AS promoter (p-value = 5.78 × 10–6) with borderline significant replication in T1D, which we validated with an in vitro cell-based assay. Due to the rarity of the identified genetic variants, future replication of the genomic regions presented here is required with sequencing of individuals with T1D. Nevertheless, we here report the first genome-wide analysis on stroke in individuals with diabetes.
Introduction
Stroke is a notable cause of mortality and long-term disability worldwide, with diabetes among the most important risk factors. The standardized incidence ratio is roughly sixfold among individuals with type 1 diabetes (T1D) compared to the general population1. Furthermore, 537 million adults live with diabetes today and the prevalence is rising2. Even though much of this trend is driven by an increase in obesity and insulin-resistant type 2 diabetes (T2D), the incidence of insulin-dependent T1D has increased as well3. T1D is a lifelong condition caused by an autoimmune reaction towards the pancreas and treated with daily insulin injections. The strokes themselves may be of hemorrhagic (20%) or ischemic (80%) origin and classified into even more specific subtypes. Interestingly, the two diabetes types affect stroke risk differentially: T1D increases the risk of both ischemic and hemorrhagic stroke4,5, while the risk imposed by T2D has been estimated to be more modest for hemorrhagic strokes5. Importantly, T1D predisposes individuals to cerebral small-vessel disease and strokes of microvascular origin6,7. Diabetes also causes other complications, of which diabetic kidney disease (DKD) and severe retinopathy predict cerebrovascular disease in T1D8. Understanding stroke pathophysiology in diabetes is important for improving treatment and quality of life for individuals with T1D.
Stroke heritability has been estimated to vary between 30 and 40% in the general population9. Stroke heritability varies greatly depending on the subtype, with the largest heritability estimates for large artery atherosclerotic stroke and lobar intracranial hemorrhage, and the lowest for small vessel disease9. To date, 126 common genomic loci have been associated with stroke or its subtypes with genome-wide significance10,11. Associations at many of the known common stroke loci overlap with other cardiovascular phenotypes, e.g., coronary artery disease (CAD)9. Our previous study suggested a heritable component of stroke in individuals with T1D as a history of maternal stroke was associated with hemorrhagic stroke in T1D12. However, very few studies have investigated genetic risk factors for stroke in diabetes13,14,15, and no genome-wide studies in individuals with diabetes yet exist. On the other hand, genetic studies on CAD in diabetes have identified a few diabetes-specific loci16,17, although still pending external replication, and have replicated three known general population CAD risk loci in diabetes: CDKN2B-AS1, PSRC1 and LPA15,16,18.
A substantial proportion of heritability remains unexplained for stroke9. Rare genetic variants with minor allele frequency (MAF) of ≤ 1% may significantly contribute to stroke heritability. In fact, some rare monogenic disorders have stroke as one of their manifestations9,10,19. In GWASs, the imputation accuracy of rare variants may be limited, and largely depends on the minor allele count (MAC) in the reference sample20. Rare variants can be reliably studied with next-generation sequencing-based techniques such as whole-genome sequencing (WGS) and whole-exome sequencing (WES). We have previously used WES to identify protein coding variants associated with lipid and apolipoprotein traits in T1D21. In the general population, novel stroke risk loci have been identified with WGS22. However, UK Biobank WES analysis for cardiometabolic traits did not discover exome-wide significant stroke risk genes23.
Historically, the Finnish population has been isolated and, thus, represents a unique genetic background with enrichment of low-frequency deleterious variants24, which may in part enable the discovery of rare disease-associated variants. Here we studied genetics of stroke and its subtypes with WGS and WES in Finnish individuals with T1D with multiple statistical approaches by focusing on rare and low-frequency genomic variants. We aimed both to find stroke-risk loci specific to individuals with T1D, and to identify risk loci generalizable to the non-diabetic population, since discovery of rare variants is more probable in a high-risk Finnish diabetic population. Finally, we performed cell-based in vitro experiments to further validate a discovered promoter region. Altogether, here we report the first genome-wide study on stroke genetics in diabetes.
Results
Study design
The study is part of the Finnish Diabetic Nephropathy (FinnDiane) Study; an ongoing nationwide multicenter study established to identify factors leading to diabetic complications25. We studied WGS in 571 and WES in 480 non-related and non-overlapping individuals with T1D, entailing 112 and 74 stroke cases, respectively (Table 1, Table S1, Fig. S1 and S2). We aimed to find rare and low-frequency genetic variants associated with stroke in T1D. Therefore, we performed single variant analyses across the genome (MAC ≥ 5), using fixed-effects meta-analysis for variants available in both data sets, with a minimal adjustment setting i.e., the calendar year of diabetes onset, sex and two first genomic data principal components, and repeated the analyses with an additional DKD adjustment (Fig. 1). We performed gene aggregate analyses (cumulative MAC, CMAC ≥ 5) with the minimal adjustment separately with protein-altering variants (PAVs) and protein-truncating variants (PTVs); and repeated the analyses with an additional DKD adjustment. Finally, we conducted minimally adjusted intergenic aggregate analyses within genomic windows by statistically up-weighting functionally important and rare variants; and within established enhancers and promoters by weighting variants according to their rarity. Furthermore, we performed stroke subtype association analyses for the lead findings.
Single variant analyses
We searched for genetic variants associated with stroke using non-overlapping WES and WGS data, and discovered a suggestively stroke-associated locus, 4q33-34.1, with the minimally adjusted model (4:170787127, p-value = 8.83 × 10–8, MAF = 3.7%, Table 2, Fig. 2). The variant was unavailable for replication in the T1D specific GWAS and in the FinnGen general population GWAS summary statistics. However, the variant with the third lowest p-value on 4q33-34.1 was available but did not replicate for stroke in T1D or in the general population (Table 2). As DKD is a common diabetic complication that has been reported to predict incident stroke in T1D8, we performed additional analyses adjusted for DKD, and discovered a rare missense variant on SREBF1 exome-wide significantly (p-value < 3 × 10–7) associated with stroke (rs114001633, p.Pro227Leu, p-value = 7.30 × 10–8, MAF = 0.26%) (Table 2, Fig. S3). Due to the rarity of the variant, we performed additional genotyping for replication, whereby the variant did not replicate for stroke (Table 2), but replicated for hemorrhagic stroke in T1D (p-value = 0.02, N = 3,263, Table S2). Since rs114001633 did not pass the MAC threshold in the hemorrhagic stroke sub-analysis of the discovery cohort (Table S2), further replication in additional individuals with T1D is needed to confirm the potential association with stroke, specifically with hemorrhagic stroke, in T1D.
Gene aggregate analyses
To improve statistical power for rare and low-frequency variants, we performed gene aggregate analyses. With the minimally adjusted models, low-frequency PAVs on ANK1 were associated with stroke (p-value = 2.23 × 10–6, CMAC = 247), even more strongly with ischemic stroke (p-value = 1.31 × 10–6, CMAC = 225) (Fig. 3A, Fig. S4, Tables 3, S3 and S4). Furthermore, nine genes were suggestively associated with stroke through rare or low-frequency PAVs (Fig. 3A). Of these, the aggregate of PAVs on TARBP2 was associated with ischemic stroke (p-value = 1.71 × 10–7, CMAC = 5, MAF ≤ 1%), and on CLEC4M with hemorrhagic stroke (p-value = 4.74 × 10–15, CMAC = 11, MAF ≤ 1%). Of note, rare PAVs on GCDH were suggestively associated with stroke (p-value = 3.26 × 10–5, CMAC = 6): GCDH loss-of-function variants have been previously associated with metabolic stroke and cerebral hemorrhage26.
With the models additionally adjusted for DKD, rare PAVs on LRRN1 were associated with stroke (p-value = 3.49 × 10–6, CMAC = 15), and suggestively with ischemic stroke (p-value = 8.69 × 10–6, CMAC = 12; Fig. 3B, Fig. S5, Tables 3, S3 and S5). Furthermore, eight genes were suggestively associated with stroke through rare or low-frequency PAVs (Fig. 3B). In the stroke subtype analysis for the lead genes, the aggregate of rare PAVs on MAP3K12 was associated with ischemic stroke (p-value = 1.72 × 10–7, CMAC = 17), and on MTRNR2L7 with hemorrhagic stroke (p-value = 2.24 × 10–6, CMAC = 6). MAP3K12 and TARBP2 are located close to each other on the genome, thus, they may represent the same association signal through linkage disequilibrium (LD) or modifier effects onto the causal gene (Fig. S6).
We then investigated the role of more severe PTVs, i.e. putative loss-of-function variants, for stroke. Low-frequency PTVs on ARPC5 were associated with stroke, while rare PTVs on HAS1 (i.e., hyaluronan synthase 1) were suggestively associated with stroke (Fig. 3A, Fig. S4, Tables 3 and S4). Furthermore, in the analysis for stroke subtypes, the aggregate of rare PTVs on HAS1 was associated with ischemic stroke (p-value = 7.39 × 10–7, CMAC = 7). With the additional DKD adjustment, rare PTVs on HAS1 (p-value = 3.11 × 10–5, CMAC = 7), rare PTVs on UACA (p-value = 6.77 × 10–5, CMAC = 6), and low-frequency PTVs on ARPC5 (p-value = 4.15 × 10–5, CMAC = 39), were associated with stroke (Fig. 3B, Fig. S5, Tables 3 and S5).
Replication of gene aggregate findings
We attempted T1D specific replication within the FinnDiane GWAS data, by including also five directly genotyped variants, both using the gene aggregate approach and by inspecting the exonic variants individually. Despite the uncertainty of genotype imputation and our limited statistical power for rare variants, ANK1 and LRRN1 showcased weak evidence of replication in T1D: Although ANK1 did not reach significance for stroke with SKAT-O (Tables 4 and S6), one of the available fifteen variants was associated with stroke (rs779805849, p-value = 0.017) (Table 3, Fig. 4), and two additional variants with hemorrhagic stroke (rs146416859 and rs61753679, p-value < 0.05) (Table S4). LRRN1 did not replicate for stroke in FinnDiane with rare PAVs (p-value = 0.50, Nvariant = 4) (Tables 4 and S7). However, when we extended the model to low-frequency PAVs (Tables 4 and S7), thus improved statistical power and imputation quality, LRRN1 replicated for ischemic stroke (p-value = 0.039, Nvariant = 6). UACA contained two rare PTVs associated with stroke, of which one replicated through genotyping (p-value = 0.0030, Tables 3 and S5). However, the variant was ultra-rare, and replication thus uncertain. We were unable to replicate HAS1 in T1D due to missing data; we directly genotyped one variant but found no rare allele carriers. ARPC5 did not replicate.
We further attempted replication in the general population by look-ups from two UK Biobank WES studies23,27 (Tables 4, S8 and S9). Importantly, HAS1 replicated for stroke with rare loss-of-function variants (MAF ≤ 1%: p-value = 0.035; ref. 27) and with ultra-rare deleterious variants (MAF ≤ 0.1%: p-value = 0.012; ref. 23), while UACA replicated with ultra-rare deleterious variants (MAF ≤ 0.01%: p-value = 0.035; ref. 27). Finally, LRRN1 replicated for stroke with an ultra-rare deleterious variant model (MAF ≤ 0.001%: p-value = 0.026; ref. 27), although not for ischemic stroke. ANK1 did not replicate with the deleterious missense variant model in the general population27.
Out of the suggestive genes, FOXO1, TARBP2, and MAP3K12 showed weak evidence of replication in T1D (Tables 4, S4-S7). One variant within FOXO1 replicated for hemorrhagic stroke with the minimal adjustment (p-value = 0.012), two within MAP3K12 for hemorrhagic stroke with the additional DKD adjustment (p-value = 0.013); and TARBP2 replicated for hemorrhagic stroke with the minimally adjusted SKAT-O (p-value = 2.59 × 10–4, MAF ≤ 1%). UK Biobank general population gene burden WES analysis look-ups supported stroke associations for UTS2, MAP3K12, and FOXO1 (Tables 4 and S9)23,27.
Known Mendelian stroke genes in T1D
Variants on Mendelian stroke risk genes may for instance cause small vessel disease or cerebral cavernous malformations, which can eventually lead to stroke9. We inspected the association of 17 autosomal genes previously linked to stroke through nonsynonymous variants (ABCC6, KRIT1, ADA2, COL3A1, COL4A1, COL4A2, COLGALT1, HTRA1, NOTCH3, RNF213, TREX1, CCM2, PDCD10, CTSA, APP, CST3, ITM2B)19 (Fig. S7). Rare PAVs on KRIT1 were associated with stroke (p-value = 0.018) and ischemic stroke (p-value = 0.0092). Furthermore, rare PAVs on ADA2 and on TREX1 were associated with hemorrhagic stroke (p-value = 0.027 and p-value = 0.010, respectively). Loss-of-function variants on KRIT1 cause vascular malformations, while ADA2 has been linked to autoinflammatory small vessel vasculitis and TREX1 to small vessel disease9,19.
Sliding-window analyses
To increase statistical power for low-frequency and rare variants on non-coding regulatory regions, we performed genome-wide sliding-window aggregate analyses with the minimal adjustment. We found further evidence for the 4q33-34.1 genomic region as we discovered fourteen windows within the region, with a genome-wide significant association between an aggregate of low-frequency variants and stroke (MAF ≤ 5%; Fig. 5A, Table S10). Importantly, two of these windows (4:170782001–170786000, p-value = 3.40 × 10–8, CMAC = 934; and 4:170784001–170788000, p-value = 1.10 × 10–8, CMAC = 1190) and ten individual variants within the 4q33-34.1 genomic region replicated for stroke in T1D (FinnDiane GWAS: p-value < 0.05; Table S11). To identify the most likely effector genes for the 4q33-34.1, we inspected variant expression quantitative trait loci (eQTL) from GTEx Portal and eQTLGen Consortium28, and functional genomics from the 3D Genome Browser29. 4q33-34.1 is located in the same topologically associating domain with distal promoters of GALNTL6, MFAP3L and AADAT in the frontal lobe and hippocampus (Fig. S8). In addition, promoter capture high-throughput chromosome conformation capture (PCHi-C) links could be identified for a few individual variants, e.g., for GALNTL6 in the hippocampus, and AADAT and MFAP3L in the dorsolateral prefrontal cortex.
When we inspected rare variants (MAF ≤ 1%), we discovered multiple suggestive windows, e.g., close to or within the CNTN1, CNTN4, LINC01500, and TGOLN2 genes (Fig. 5B, Table S10). In stroke subtype analysis, the CNTN1 window was genome-wide significantly associated with hemorrhagic stroke (12:40950001–40954000: p-value = 2.10 × 10–8, CMAC = 24). Interestingly, CNTN1 and CNTN4 are located on different chromosomes, but belong to the same contactin protein family; however, replication is pending. The suggestive window near LINC01500 (14:59004001–59008000: p-value = 2.53 × 10–7, CMAC = 19) replicated for stroke in T1D (FinnDiane GWAS: p-value = 0.015, CMAC = 56). Four variants within the window were available in the FinnGen general population GWAS, and one replicated for stroke (rs1281241634, p-value = 0.029) (Table S11). According to PCHi-C, the LINC01500 intronic window looped to the DACT1 promoter on the dorsolateral prefrontal cortex (Fig. S9). Finally, the TGOLN2 window replicated for hemorrhagic stroke in T1D (FinnDiane GWAS: p-value = 0.037).
Promoters and enhancers
As a more targeted aggregate approach to explore the non-coding genome, we studied rare and low-frequency variants on established regulatory regions using the minimal adjustment. We discovered three enhancers with suggestive stroke-associated enrichment of rare or low-frequency variants within intronic regions of TRPM3, LOC105378983, and BDNF, encoding brain-derived neurotrophic factor (Tables S12 and S13, Fig. S10). The BDNF enhancer was significant after multiple testing correction for ischemic stroke (p-value = 1.01 × 10–6, CMAC = 6). Regional aggregate replications were not possible in the T1D specific GWAS (Nvariant < 2), and individual variants were missing or did not replicate. PCHi-C linked the BDNF enhancer to its promoter on specific brain regions (Fig. S9).
We did not identify stroke-associated promoters after correction for multiple testing (p-value < 3 × 10–7, Fig. S11). The strongest associations were two TGOLN2 promoters (p-value = 5.60 × 10–6, CMAC = 9, MAF ≤ 1%), located on the previously mentioned TGOLN2 window, and a TRPM2-AS promoter (p-value = 5.78 × 10–6, CMAC = 33, MAF ≤ 1%; Tables S14 and S15). The aggregate of rare variants on the TRPM2-AS promoter nearly replicated for stroke in T1D (FinnDiane GWAS: p-value = 0.053). When we inspected variants individually, one out of nine available variants replicated in the general population for ischemic stroke (FinnGen GWAS: p-value = 0.038). In GTEx, rs762428 within the TRPM2-AS promoter was significantly associated with TRPM2 level in whole blood (NES = -0.63) and lungs (NES = -0.41, p < 0.001), and nominally in other tissues such as the hypothalamus (NES = -0.42). TRPM2 encodes a calcium-permeable and non-selective cation channel expressed mainly in the brain. The gene has been linked to ischemic stroke30, and belongs to the same protein subfamily as the above-mentioned TRPM3. TRPM2 inhibitors have been proposed as a drug target for central nervous system diseases31; thus, our results suggest that these inhibitors could also be beneficial for stroke in T1D, although further validation of the genetic associations is needed.
We performed luciferase promoter analysis of the stroke-associated sequence within the TRPM2-AS promoter region to experimentally confirm its promoter activity (Fig. 6). As we detected TRPM2-AS expression in HELA cells but not in HUVEC or HEK-293 cells using semi-quantitative RT-PCR, the luciferase analysis was performed in HELA cells, which indicated strong promoter activity. The most strongly stroke-associated variant, rs753589764, did not significantly affect luciferase activity under normal cell culture conditions (p-value = 0.27, 22 technical repeats). However, we cannot rule out a variant effect under cellular stress, e.g., oxidative stress, or in other cell lines, and therefore, further promoter experiments should be performed in future.
Discussion
Stroke heritability has been estimated to range between 30 and 40%, but the genomic loci identified thus far explain only a small fraction of heritability9. One potential explanation for the missing heritability is rare variants missed by GWAS. Therefore, we performed WES and WGS in a total of 1,051 Finnish individuals with T1D to discover rare and low-frequency variants associated with stroke and its major subtypes, either specific for T1D, or generalizable to the non-diabetic population. We identified multiple significant loci with evidence of replication, including protein altering or truncating variants on ANK1, HAS1, UACA, and LRRN1, as well as a 4q33-34.1 intergenic region.
With single variant analyses, we identified a missense variant on SREBF1 (rs114001633, p.Pro227Leu), which was exome-wide significantly associated with stroke, and further replicated for hemorrhagic stroke in T1D. As the variant was ultra-rare, and we had a relatively small number of hemorrhagic stroke cases, further replication is needed in T1D to confirm this finding. SREBF1 encodes a transcription factor involved in lipid metabolism and insulin signaling32.
Gene aggregate tests (SKAT-O) detected four genes within which PAVs (ANK1 and LRRN1) or PTVs (HAS1 and UACA) were associated with stroke with evidence of replication; LRRN1, HAS1, and UACA after adjustment for DKD. ANK1 did not replicate in T1D with the gene aggregate approach, however, one out of the fifteen available variants replicated for stroke in T1D (rs779805849, p.Val136Glu). Of note, SIFT and PolyPhen predicted many ANK1 variants as deleterious33,34. ANK1 encodes ankyrin-1, within which variants cause hereditary spherocytosis, an inherited disease that changes the shape of red blood cells35. Previous genome-wide association studies have linked the gene to T2D36, while another gene from the ankyrin protein family, ANK2, is a previously identified stroke risk locus37.
Rare PAVs on LRRN1 were associated with stroke. LRRN1 did not replicate with the corresponding model in T1D; however, with a model extended to low-frequency PAVs, LRRN1 replicated for ischemic stroke. Rare variant replication is problematic with GWAS data due to the uncertainty of the imputation, which may explain the need to increase the allele frequency threshold to observe a successful replication. Furthermore, LRRN1 was nominally associated with stroke in the general population through an aggregate of ultra-rare loss-of-function and deleterious missense variants27. LRRN1 encodes leucine rich repeat neuronal protein 1, with a brain-enriched expression profile.
HAS1 consistently replicated for stroke with rare loss-of-function and deleterious variant aggregate models in the general population23,27, while UACA replicated for stroke with one ultra-rare deleterious variant model27. HAS1 encodes an enzyme that produces hyaluronan and whose expression is induced by inflammation and glycemic stress38. Of note, an increased hyaluronan turnover has been suggested to follow ischemic stroke39. No additional HAS1 PTV carriers were identified among the T1D replication cohort; thus, a diabetes-specific replication is pending. Nevertheless, HAS1 PTVs may be of particular importance in T1D, as dysregulation of endothelial glycocalyx hyaluronan has been suggested to contribute to diabetic complications40. Finally, it must be noted that PTVs have not been functionally confirmed as loss-of-function, but the annotations are predictions; a PTV at the beginning of a gene is likely more severe than one at the end, and in fact, PTVs closer to the HAS1 transcription start site were more strongly associated with stroke.
To increase statistical power on regulatory regions, we performed statistical aggregate tests in genomic windows, enhancers and promoters41,42. Of note, we extended genomic window length from the default to increase statistical power, which however also reduced precision as the causal region might be narrower. We found fourteen genome-wide significant stroke-associated windows with low-frequency variants on 4q33-34.1, of which two replicated for stroke in T1D. According to eQTLs and PCHi-C interactions, 4q33-34.1 variants most likely target GALNTL6, MFAP3L or AADAT. We also discovered a suggestively stroke-associated window through rare variants within LINC01500, which replicated for stroke in T1D. According to PCHi-C, the LINC01500 window targets a promoter of DACT1. Finally, an aggregate of rare variants was suggestively associated with stroke on the TRPM2-AS promoter, which nearly replicated in T1D (p-value = 0.053). Importantly, transient receptor potential melastatin 2 (TRPM2) has been previously associated with ischemic stroke30,31. Our functional cell-based assay validated the TRPM2-AS region promoter activity. However, the most strongly stroke-associated variant, rs753589764, did not associate with TRPM2-AS promoter activity under normal cell culture conditions in HELA cells.
Limitations of the study include the limited statistical power due to moderate sample size at the discovery stage, replication of rare variants with imputed GWAS data, and non-conservative statistical estimates for the rarest variants due to case–control imbalance (≈1:6), especially for the stroke subtypes. We were able to improve the statistical power on exomes by meta-analyzing WES and WGS, and we performed the stroke-subtype specific analyses only for a limited number of suggestive findings to avoid spurious signals due to unstable statistical estimates. To further improve statistical power, we performed statistical aggregate tests on gene exons and on intergenic regions, i.e., enhancers, promoters, and genomic windows. Of note, we studied only transcribed enhancers, and thus, some enhancers could have been missed. We defined promoters with an arbitrarily selected 1,000 bp extension downstream of the transcription start site (TSS), which may not have always been optimal as the promoter lengths vary. Further limitations are the lack of sequencing-based replication data in individuals with T1D, and that we regarded nominal significance as replication (p-value < 0.05). However, we sought replication by combining available data sources, i.e., FinnGen (Finnish general population GWAS), UK Biobank (general population WES), and FinnDiane (GWAS and genotyping in Finnish individuals with T1D). Of note, stroke cases were younger and had a shorter diabetes duration than controls in the FinnDiane cohorts; the difference was the most extreme in the discovery cohorts, which may have led to unsuccessful replication for variants with an age or diabetes duration dependent effect. Importantly, gene burden variant selection criteria within UK Biobank WES23,27 did not perfectly match ours, especially with the low-frequency protein altering variant models, which may explain some unsuccessful gene aggregate replications.
Finally, while conducting the analyses in an isolated population has certain advantages for variant discovery, it also raises the question of generalizability of the findings to other populations. In addition to the replication attempted in the UK Biobank, further research is needed to validate our findings in non-Finnish individuals with T1D.
The strengths of this study include a well characterized cohort and comprehensively performed single variant and aggregate analyses both for the coding and non-coding regions of the genome. Stroke is a challenging phenotype to address with ICD codes and many loci associated with rare stroke phenotypes may go unnoticed even with large population-wide genetic studies. We performed analyses for well-defined stroke phenotypes verified by trained neurologists. Furthermore, as we conducted the analyses in specific high-risk individuals from an isolated population, thus with less genetic and phenotypic diversity, we had improved statistical opportunities to identify genetic risk loci.
In conclusion, we studied rare and low-frequency stroke-associated genetic variants with whole-exome or whole-genome sequencing in 1,051 individuals with T1D and report the first genome-wide study on stroke genetics in diabetes. The results highlight 4q33-34.1, SREBF1, and ANK1 for stroke in T1D; and HAS1, UACA, LRRN1, LINC01500, and TRPM2-AS promoter as stroke risk loci that likely generalize to the non-diabetic population. The presented results require future validation with next-generation sequencing in a larger cohort of individuals with T1D.
Methods
Materials
We studied WGS in 571 and WES in 480 non-related individuals with T1D, entailing 112 and 74 stroke cases, respectively (Table 1, Table S1, Fig. S1 and S2). Patients in WGS and WES were non-overlapping. The patient selection for both data sets was originally designed for DKD, such that half of the individuals had severe DKD, and half had no DKD (i.e., normal albumin excretion rate) despite a long duration of T1D21,43. Importantly, this resulted in stroke cases being younger and having shorter diabetes duration than controls, contrary to expectation. Individuals in the present study were diagnosed with T1D by their attending physician and had diabetes onset age < 40 and insulin initiated within one calendar year from the diabetes diagnosis. Stroke cases were identified for the participants from Finnish registries based on ICD codes until the end of 2017 (Table S16). The phenotypes were verified, and stroke cases classified into ischemic and hemorrhagic strokes by trained neurologists using medical files and brain imaging data. For individuals without data verified by neurologists available (NWGS = 27, NWES = 2), we considered only the registry data, excluded controls with intermediate stroke phenotypes (e.g., transient ischemic attack), and were unable to classify stroke cases into ischemic and hemorrhagic subtypes. Importantly, we required stroke to have occurred after T1D diagnosis, and controls to have > 35 years of age and > 20 years of diabetes duration. Next-generation sequencing data was processed to GRCh38 reference panel, and variants annotated with SNPEff v.5 software44 (Fig. S12). In variant QC, for autosomal variants, we required Hardy–Weinberg equilibrium (HWE) p-value > 10–10 and variant call rate > 98%; and for X chromosome variants, only variant call rate > 98%. The pipeline is described in Detailed Methods of the Supplementary Information.
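As an illustration of the variant QC rule described above, the following minimal Python sketch applies the call-rate filter to all variants and the HWE filter to autosomal variants only. It assumes that HWE p-values and call rates have already been computed upstream; the `Variant` record, its field names, and the `passes_qc` function are hypothetical and are not part of the study pipeline, which is described in the Supplementary Information.

```python
from dataclasses import dataclass

# Thresholds taken from the text; all names below are illustrative only.
HWE_P_MIN = 1e-10      # autosomal variants must have HWE p-value above this
CALL_RATE_MIN = 0.98   # all variants must have call rate above this


@dataclass
class Variant:
    chrom: str         # chromosome label, e.g. "4" or "X"
    pos: int           # 1-based position (GRCh38)
    hwe_p: float       # precomputed Hardy-Weinberg equilibrium p-value
    call_rate: float   # fraction of samples with a called genotype


def passes_qc(v: Variant) -> bool:
    """Return True if the variant passes the QC rule described in the text."""
    if v.call_rate <= CALL_RATE_MIN:
        return False
    # For X chromosome variants only the call-rate criterion is applied.
    if v.chrom in ("X", "chrX"):
        return True
    return v.hwe_p > HWE_P_MIN
```

For example, an X chromosome variant with a very low HWE p-value is retained as long as its call rate exceeds 98%, whereas the same values on an autosome would be filtered out.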
Within the FinnDiane study, we have GWAS data for almost the entire cohort, i.e., 6,458 individuals with T1D or their relatives. GWAS data has been previously processed to GRCh37 reference genome. However, we have now lifted the genotyping positions over to GRCh38, re-imputed the data to SISu v3 reference panel, and annotated with SNPEff v.5 software44 (Fig. S13). We attempted replication in individuals with T1D within the FinnDiane GWAS data, non-overlapping to sequencing data (N = 3,945, Table S17 and S18, Fig. S14), and restricted to high imputation quality variants (r2 > 0.80), and by directly genotyping twelve lead variants for replication (N = 3,263, Table S19, Fig. S15). Stroke cases were younger and had shorter diabetes duration than controls in the replication cohorts, comparably to the discovery cohorts, although with a less extreme difference. Of note, variant genotyping was performed with one Agena iPlex multiplexing assay at the Institute for Molecular Medicine Finland, Helsinki, Finland (Table S20), and the genotyping replication limited to individuals within GWAS data in order to perform relatedness adjustment. Stroke phenotype and control criteria within replication in T1D were defined similarly to the WES and WGS data.
Single variant analyses
We analyzed the genome with an additive inheritance model. For variants available in both WES and WGS data, we performed a score test with rvtests (version 20190205)45, followed by fixed-effect inverse variance based meta-analysis (Total MAC ≥ 5, and MAC ≥ 2 in WES and WGS) with metal (version 20110325)46. For variants available only in one data set we utilized exact Firth regression (MAC ≥ 5)45. Importantly, Firth logistic regression has been suggested to be the most conservative statistical test for joint rare variant analyses, especially with case–control imbalance, and the score test to have the highest statistical power for rare variant meta-analyses47. The additive single variant analyses were adjusted for the calendar year of diabetes onset, sex, and two first genomic data principal components (i.e., minimal adjustment setting), and additionally for DKD, which is one of the most important risk factors of stroke in T1D48. WGS and WES stroke controls are older and have longer T1D duration than cases (contrary to true stroke predisposition) because the patient selection for next-generation sequencing was optimized for DKD by considering T1D duration. Thus, in order to avoid statistical bias, we adjusted for the calendar year of diabetes onset; a major stroke risk factor correlated with age, T1D duration, and T1D treatment quality.
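The fixed-effect inverse variance scheme mentioned above can be sketched with the textbook formulas: each data set is weighted by the inverse of its squared standard error, and the pooled estimate is tested against a standard normal distribution. The Python sketch below is an illustration under these assumptions and does not reproduce the rvtests or metal implementations; the function name and the input format are hypothetical.

```python
import math


def inverse_variance_meta(estimates):
    """Pool per-data-set effects with fixed-effect inverse variance weights.

    estimates: list of (beta, se) pairs, one per data set (e.g. WES and WGS).
    Returns the pooled beta, its standard error, the Z score and a
    two-sided p-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p_value
```

With two data sets of equal precision, the pooled effect is the simple average and the pooled standard error shrinks by a factor of the square root of two, which is the gain in power that motivates meta-analyzing WES and WGS.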
Gene aggregate analyses
In order to improve statistical power for rare (MAF ≤ 1%) and low-frequency (MAF ≤ 5%) variants, we performed gene aggregate analyses with an optimal unified sequence kernel association test (SKAT-O) meta-analysis with MetaSKAT (version 0.81)49, separately within two distinct classes (Table S21): protein-altering variants (PAVs) and protein-truncating variants (PTVs), i.e., the more severe putative loss-of-function variants50. Importantly, the PAV class includes PTVs in addition to variants that alter the amino acid sequence. Of note, SKAT-O maximizes statistical power by optimally combining the sequence kernel association test and the burden test51. All variable sites (MAC ≥ 1) were accepted into the gene aggregate analysis, and the aggregate tests were required to include at least two variants (Nvariant ≥ 2), with a cumulative MAC (CMAC) across all included variants within the gene ≥ 5. We adjusted the analyses for the calendar year of diabetes onset, sex, and the first two genomic data principal components, and additionally for DKD. We did not report genes with all variants in perfect LD, and we inspected individual variant stroke-associations within the genes using the score test fixed-effects meta-analysis45,46. Multiple testing correction, based on the number of tested genes, resulted in significance thresholds of p-value < 4 × 10–6 for PAVs (MAF ≤ 1%: Ngene = 11,954; MAF ≤ 5%: Ngene = 13,069), p-value < 8 × 10–5 for PTVs with MAF ≤ 1% (Ngene = 663), and p-value < 6 × 10–5 for PTVs with MAF ≤ 5% (Ngene = 908). In addition, we investigated stroke-associations for 17 autosomal Mendelian stroke risk genes regardless of CMAC19, and were able to report associations for 13 of them.
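The relation between the number of tested genes and the significance thresholds can be checked with a short sketch. It assumes a Bonferroni correction at a family-wise level of 0.05, which the text does not state explicitly. The function and variable names are illustrative.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction.

    Assumption: alpha = 0.05 family-wise; the text only says the
    correction was based on the number of tested genes.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Gene counts as given in the text for each variant class and MAF limit.
GENE_COUNTS = {
    "PAV, MAF <= 1%": 11954,
    "PAV, MAF <= 5%": 13069,
    "PTV, MAF <= 1%": 663,
    "PTV, MAF <= 5%": 908,
}

thresholds = {label: bonferroni_threshold(n) for label, n in GENE_COUNTS.items()}

if __name__ == "__main__":
    for label, threshold in thresholds.items():
        print(f"{label}: p < {threshold:.1e}")
```

Under this assumption, the values round to one significant figure as 4 × 10–6 for both PAV settings, 8 × 10–5 for PTVs with MAF ≤ 1%, and 6 × 10–5 for PTVs with MAF ≤ 5%, in line with the thresholds given above.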
Sliding-window and regulatory region aggregate analyses with whole-genome sequencing
To increase statistical power for low-frequency and rare variants in intergenic regions, we performed functionally informed sliding-window analyses, i.e., aggregate analyses within 4,000 base pair (bp) regions separated by 2,000 bp (Nvariant ≥ 2, CMAC ≥ 5), with variants statistically weighted according to their rarity and functional importance using STAAR-O (STAAR R package 0.9.6)41,52. Functional importance was defined with Combined Annotation-Dependent Depletion (CADD) data52 using variant MAF (to up-weight rarer variants), the pre-computed CADD score, and the first annotation principal component from seven annotation classes (Fig. S16, Table S22), calculated following the guidelines41. Of note, the scores were utilized on the PHRED scale. We adjusted the analyses for the calendar year of diabetes onset, sex, and the first two genomic data principal components.
We studied established regulatory regions, i.e., enhancers and promoters (Nvariant ≥ 2, CMAC ≥ 5), as defined in FANTOM5 cap analysis of gene expression (CAGE) human data reprocessed to the GRCh38 reference genome42, with promoters defined as the transcription start site (TSS) extended to 1,000 bp, and with variants weighted by their rarity on the PHRED scale. The FANTOM5 atlases have been measured in multiple human primary cell lines, tissues, and cancer cell lines53,54. The regulatory regions were analyzed with STAAR R package 0.9.641, adjusting for the calendar year of diabetes onset, sex, and the first two genomic data principal components. With low-frequency variants, the multiple testing corrected significance thresholds were p-value < 2.9 × 10–7 for promoters (Nregion = 172,134) and p-value < 2.6 × 10–6 for enhancers (Nregion = 19,472). For rare variants, the thresholds were p-value < 3.5 × 10–7 (Nregion = 141,779) and p-value < 4.3 × 10–6 (Nregion = 11,665), respectively. We did not report regions with all variants in perfect LD.
Replication
Within the FinnDiane GWAS data, we attempted replication of high imputation quality genetic variants (r2 > 0.80) with the score test (rvtests version 20190205)45 and had good statistical power (> 80%) to detect a nominal association with an odds ratio (OR) ≥ 2.5 for additive low-frequency variants (MAF = 1%) (Fig. S17)55. However, for rare variants with MAF = 0.1% and OR < 9, we had only limited power to detect an association even with nominal significance (p-value < 0.05). Thus, we considered nominal significance as the replication threshold (p-value < 0.05). We attempted direct genotyping for replication for twelve variants, but minor allele carriers were observed for only seven of them (Table S20). We performed single variant analyses for the genotyped variants similarly with the score test, except for one LRRN1 variant, which was analyzed with linear regression and no relatedness adjustment (stats R package 4.2.1) due to the lack of alternative allele carriers among individuals with the required relatedness information. Most variants within the aggregate discoveries were rare or ultra-rare (MAF ≈ 0.1%), making replication with imputed genomic data problematic. Nevertheless, we attempted replication within the FinnDiane GWAS data (r2 > 0.80), also including the directly genotyped variants (SKAT-O, STAAR-O). We performed SKAT-O using GMMAT R package 1.3.2 by imputing missing genotype dosages to the mean56, while intergenic aggregate analyses were performed similarly with the STAAR R package41. Replication analyses were adjusted comparably to the discovery stage analyses, except that relatedness in replication was accounted for with relatedness matrices instead of genomic principal components (Balding-Nichols approximation kinship matrix in single variant analysis and GEMMA relatedness matrix in aggregate analyses)45,57.
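Imputing missing genotype dosages to the mean can be illustrated with a minimal sketch for a single variant. It is not the GMMAT code. The function name and the use of None to mark a missing dosage are assumptions made for illustration.

```python
def impute_to_mean(dosages):
    """Replace missing dosages (None) with the mean of observed dosages.

    dosages: per-individual dosages for one variant, each a float in
    [0, 2] or None when missing (a hypothetical input convention).
    """
    observed = [d for d in dosages if d is not None]
    if not observed:
        raise ValueError("no observed dosages to compute a mean from")
    mean = sum(observed) / len(observed)
    return [mean if d is None else d for d in dosages]


if __name__ == "__main__":
    # Made-up dosages for four individuals; the second is missing.
    print(impute_to_mean([0.0, None, 2.0, 1.0]))
```

Mean imputation keeps every individual in the aggregate test while leaving the variant's estimated allele frequency unchanged.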
We attempted replication in the general population for genetic variants from the large-scale population-wide FinnGen project release 6 GWAS data with phenotypes best matching our definitions (https://www.finngen.fi/en) (Table S23), and for the gene aggregate discoveries from UK Biobank summary statistics23,27. Of note, no proxies in LD were found for the lead single variant findings (rs4435704, rs4401420), and thus, we did not consider linkage disequilibrium in replication beyond the traditional imputation approach.
Functional characterization of the genetic variants and regions
We inspected genetic variant characteristics from GTEx Portal, eQTLGen Consortium (p-value < 0.05)28, RegulomeDB58, YUE Lab29, and the Ensembl Variant Effect Predictor33,34,59. Functional characterization of the TRPM2-AS promoter is described in Detailed Methods of the Supplementary Information. In short, we assessed TRPM2-AS expression in three cell lines (HELA, HEK-293, HUVEC) and noted expression in HELA cells. We then assessed the influence of the chromosomal location and the genotype of the most strongly stroke-associated variant (rs753589764) on promoter activity in HELA cells under normal cell culture conditions with a dual-luciferase reporter assay (22 technical repeats).
Detailed methods
Detailed Methods are available in the Supplementary Information.
Ethical approval
The study protocol has been approved by the ethics committee of the Helsinki and Uusimaa Hospital District (491/E5/2006, 238/13/03/00/2015, and HUS-3313-2018), and performed in accordance with the Declaration of Helsinki. All participants gave informed consent before participation.
Data availability
Individual-level data for the study participants are not publicly available because of restrictions in the study consent provided by the participants at the time of data collection. Readers may propose a collaboration to study the individual-level data by contacting the lead investigator. Gene aggregate test stroke summary statistics are provided in the Supplementary Data.
Code availability
We utilized publicly available software for the statistical analyses: plink2 (https://www.cog-genomics.org/plink/2.0/), metal (http://csg.sph.umich.edu/abecasis/metal/), rvtests (http://zhanxw.github.io/rvtests/), metaSKAT (https://cran.r-project.org/web/packages/MetaSKAT/), STAAR-O (https://github.com/xihaoli/STAAR), and GMMAT (https://github.com/hanchenphd/GMMAT).
References
Harjutsalo, V., Barlovic, D. P. & Groop, P.-H. Long-term population-based trends in the incidence of cardiovascular disease in individuals with type 1 diabetes from Finland: A retrospective, nationwide, cohort study. Lancet Diabetes Endocrinol. 9, 575–585 (2021).
Sun, H. et al. IDF Diabetes Atlas: Global, regional and country-level diabetes prevalence estimates for 2021 and projections for 2045. Diabetes Res. Clin. Pract. 183 (2022).
Forlenza, G. P. & Rewers, M. The epidemic of type 1 diabetes: what is it telling us?. Curr. Opin. Endocrinol. Diabetes Obes. 18, 248–251 (2011).
Ståhl, C. H. et al. Glycaemic control and excess risk of ischaemic and haemorrhagic stroke in patients with type 1 diabetes: a cohort study of 33 453 patients. J. Intern. Med. 281, 261–272 (2017).
Janghorbani, M. et al. Prospective study of type 1 and type 2 diabetes and risk of stroke subtypes: the Nurses’ Health Study. Diabetes Care 30, 1730–1735 (2007).
Thorn, L. M. et al. Clinical and MRI features of cerebral small-vessel disease in type 1 diabetes. Diabetes Care 42, 327–330 (2019).
Putaala, J. et al. Diabetes mellitus and ischemic stroke in the young. Neurology 76, 1831–1837 (2011).
Hägg, S. et al. Incidence of stroke according to presence of diabetic nephropathy and severe diabetic retinopathy in patients with type 1 diabetes. Diabetes Care 36, 4140–4146 (2013).
Dichgans, M., Pulit, S. L. & Rosand, J. Stroke genetics: discovery, biology, and clinical applications. Lancet Neurol. 18, 587–599 (2019).
Debette, S. & Markus, H. S. Stroke genetics: discovery, insight into mechanisms, and clinical perspectives. Circ. Res. 130, 1095–1111 (2022).
Mishra, A. et al. Stroke genetics informs drug discovery and risk prediction across ancestries. Nature 611, 115–123 (2022).
Ylinen, A. et al. The impact of parental risk factors on the risk of stroke in type 1 diabetes. Acta Diabetol. 58, 911–917 (2021).
Syreeni, A. et al. Haptoglobin genotype does not confer a risk of stroke in type 1 diabetes. Diabetes 71, 2728–2738 (2022).
Dahlström, E. H. et al. The low-expression variant of FABP4 is associated with cardiovascular disease in type 1 diabetes. Diabetes 70, 2391–2401 (2021).
Vujkovic, M. et al. Discovery of 318 new risk loci for type 2 diabetes and related vascular outcomes among 1.4 million participants in a multi-ancestry meta-analysis. Nat. Genet. 52, 680–691 (2020).
Antikainen, A. A. V. et al. Genome-wide association study on coronary artery disease in type 1 diabetes suggests beta-defensin 127 as a risk locus. Cardiovasc. Res. 117, 600–612 (2021).
Qi, L. et al. Association between a genetic variant related to glutamic acid metabolism and coronary heart disease in individuals with type 2 diabetes. JAMA 310, 821–828 (2013).
Fall, T., Gustafsson, S., Orho-Melander, M. & Ingelsson, E. Genome-wide association study of coronary artery disease among individuals with diabetes: the UK Biobank. Diabetologia 61, 2174–2179 (2018).
Grami, N. et al. Global assessment of Mendelian stroke genetic prevalence in 101 635 individuals from 7 ethnic groups. Stroke 51, 1290–1293 (2020).
Si, Y., Vanderwerff, B. & Zöllner, S. Why are rare variants hard to impute? Coalescent models reveal theoretical limits in existing algorithms. Genetics 217, 0011 (2021).
Sandholm, N. et al. Whole-exome sequencing identifies novel protein-altering variants associated with serum apolipoprotein and lipid concentrations. Genome Med. 14, 132 (2022).
Hu, Y. et al. Whole-genome sequencing association analyses of stroke and its subtypes in ancestrally diverse populations from trans-omics for precision medicine project. Stroke 53, 875–885 (2022).
Jurgens, S. J. et al. Analysis of rare genetic variation underlying cardiometabolic diseases and traits among 200,000 individuals in the UK Biobank. Nat. Genet. 54, 240–250 (2022).
Locke, A. E. et al. Exome sequencing of Finnish isolates enhances rare-variant association power. Nature 572, 323–328 (2019).
Thorn, L. M. et al. Metabolic syndrome in type 1 diabetes: Association with diabetic nephropathy and glycemic control (the FinnDiane study). Diabetes Care 28, 2019–2024 (2005).
Zinnanti, W. J. et al. Mechanism of metabolic stroke and spontaneous cerebral hemorrhage in glutaric aciduria type I. Acta Neuropathol. Commun. 2, 13 (2014).
Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK biobank participants. Nature 599, 628–634 (2021).
Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genet. 53, 1300–1310 (2021).
Wang, Y. et al. The 3D genome browser: A web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol. 19, 151 (2018).
Zong, P., Lin, Q., Feng, J. & Yue, L. A systemic review of the integral role of TRPM2 in ischemic stroke: From upstream risk factors to ultimate neuronal death. Cells 11, 491 (2022).
Belrose, J. C. & Jackson, M. F. TRPM2: A candidate therapeutic target for treating neurological diseases. Acta Pharmacol. Sin. 39, 722–732 (2018).
DeBose-Boyd, R. A. & Ye, J. SREBPs in lipid metabolism, insulin signaling, and beyond. Trends Biochem. Sci. 43, 358–368 (2018).
Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).
Kumar, P., Henikoff, S. & Ng, P. C. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat. Protoc. 4, 1073–1081 (2009).
Park, J. et al. Mutational characteristics of ANK1 and SPTB genes in hereditary spherocytosis. Clin. Genet. 90, 69–78 (2016).
Yan, R. et al. A novel type 2 diabetes risk allele increases the promoter activity of the muscle-specific small ankyrin 1 gene. Sci. Rep. 6, 25105 (2016).
Malik, R. et al. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes. Nat. Genet. 50, 524–537 (2018).
Siiskonen, H., Oikari, S., Pasonen-Seppänen, S. & Rilla, K. Hyaluronan synthase 1: A mysterious enzyme with unexpected functions. Front. Immunol. 6, 1–11 (2015).
Katarzyna Greda, A. & Nowicka, D. Hyaluronidase inhibition accelerates functional recovery from stroke in the mouse brain. J. Neurochem. 157, 781–801 (2021).
Wang, G., Tiemeier, G. L., van den Berg, B. M. & Rabelink, T. J. Endothelial glycocalyx hyaluronan: Regulation and role in prevention of diabetic complications. Am. J. Pathol. 190, 781–790 (2020).
Li, X. et al. Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nat. Genet. 52, 969–983 (2020).
Abugessaisa, I. et al. FANTOM5 CAGE profiles of human and mouse reprocessed for GRCh38 and GRCm38 genome assemblies. Sci. Data 4, 1–10 (2017).
Sandholm, N. et al. The genetic landscape of renal complications in type 1 diabetes. J. Am. Soc. Nephrol. 28, 557–574 (2017).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6, 80–92 (2012).
Zhan, X., Hu, Y., Li, B., Abecasis, G. R. & Liu, D. J. RVTESTS: An efficient and comprehensive tool for rare variant association analysis using sequence data. Bioinformatics 32, 1423–1426 (2016).
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: Fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Ma, C., Blackwell, T., Boehnke, M. & Scott, L. J. Recommended joint and meta-analysis strategies for case-control association testing of single low-count variants. Genet. Epidemiol. 37, 539–550 (2013).
Hägg, S. et al. Different risk factor profiles for ischemic and hemorrhagic stroke in type 1 diabetes mellitus. Stroke 45, 2558–2562 (2014).
Lee, S., Teslovich, T. M., Boehnke, M. & Lin, X. General framework for meta-analysis of rare variants in sequencing association studies. Am. J. Hum. Genet. 93, 42–53 (2013).
Rivas, M. A. et al. Effect of predicted protein-truncating genetic variants on the human transcriptome. Science 348, 666–669 (2015).
Lee, S., Wu, M. C. & Lin, X. Optimal tests for rare variant effects in sequencing association studies. Biostatistics 13, 762–775 (2012).
Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: Predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).
Fantom Consortium. A promoter-level mammalian expression atlas. Nature 507, 462 (2014).
Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
Moore, C. M., Jacobson, S. A. & Fingerlin, T. E. Power and sample size calculations for genetic association studies in the presence of genetic model misspecification. Hum. Hered. 84, 256–271 (2019).
Chen, H. et al. Efficient variant set mixed model association tests for continuous and binary traits in large-scale whole-genome sequencing studies. Am. J. Hum. Genet. 104, 260–274 (2019).
Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).
Boyle, A. P. et al. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 22, 1790–1797 (2012).
McLaren, W. et al. The ensembl variant effect predictor. Genome Biol. 17, 122 (2016).
Pruim, R. J. et al. LocusZoom: Regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337 (2010).
Hahne, F. & Ivanek, R. Visualizing genomic data using gviz and bioconductor. In Statistical Genomics: Methods and Protocols Vol. 1418 (eds Mathé, E. & Davis, S.) 335–351 (Humana Press, New York, 2016).
Acknowledgements
We are indebted to the late Carol Forsblom (1964–2022), the international coordinator of the FinnDiane Study Group, for his considerable contribution. The skilled technical assistance of Heli Krigsman, Hanna Olanne, Maikki Parkkonen, Mira Rahkonen, Anna Sandelin, and Jaana Tuomikangas is gratefully acknowledged. We also want to acknowledge all the physicians and nurses at each FinnDiane center participating in the recruitment and characterization of the individuals with T1D (Table S24) and the FinnDiane participants. In addition, we acknowledge the participants and investigators of the FinnGen study. We acknowledge the ELIXIR Finland node, hosted at the CSC – IT Center for Science for ICT resources, enabling the WES and WGS data processing. Finally, we want to acknowledge Bert Vogelstein and Jukka Kallijärvi for material provided for the in vitro promoter experiments: pBV-Luc plasmids (Bert Vogelstein) and renilla control plasmid (Kallijärvi lab, Folkhälsan Research Center). We utilized data provided by GTEx. The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The data reported here were obtained from the GTEx Portal on 01/04/2022: https://gtexportal.org/home/. The full list of doctors and nurses examining the participants is supplied in Supplementary Table S24.
Funding
This work was supported by grants from Folkhälsan Research Foundation; Wilhelm and Else Stockmann Foundation; “Liv och Hälsa” Society; Sigrid Juselius Foundation [220027]; Helsinki University Central Hospital Research Funds [TYH2018207]; Novo Nordisk Foundation [NNF OC0013659 and NNF23OC0082732]; Academy of Finland [299200 and 316664]; European Foundation for the Study of Diabetes (EFSD) Young Investigator Research Award funds; an EFSD award supported by EFSD/Sanofi European Diabetes Research Programme in Macrovascular Complications; Finnish Foundation for Cardiovascular Research; Finnish Diabetes Research Foundation; Aarne Koskelo Foundation; and the Ida Montini Foundation. The funders had no role in the study design, collection, analysis, interpretation and writing of the manuscript.
Author information
Contributions
A.A.A. analyzed the data, wrote the manuscript, and contributed to interpretation of the data, conception and study design, and pre-processing of whole-exome sequencing data. J.H. pre-processed the whole-genome sequencing data and contributed to computational analyses and conception and study design. N.S. contributed to acquisition of phenotypic and genotypic data, conception and study design, manuscript writing and interpretation of data. A.Ku. performed laboratory experiments. A.S., E.K., A.Ky., and A.P. contributed to acquisition and data processing of genetic data. S.H.-H. and A.Y. contributed to acquisition of phenotypic data. J.P., L.M.T., V.H. and P.-H.G. contributed to interpretation of data, acquisition of phenotypic data, and to conception and study design. J.H., A.Ku., A.S., S.H.-H., A.Y., E.K., A.Ky., A.P., J.P., L.M.T., V.H., P.-H.G., and N.S. revised the manuscript critically for important intellectual content. All authors gave final approval of the version to be submitted and any revised version.
Ethics declarations
Competing interests
P.-H.G. has received investigator-initiated research grants from Eli Lilly and Roche, is an advisory board member for AbbVie, Astellas, AstraZeneca, Bayer, Boehringer Ingelheim, Cebix, Eli Lilly, Janssen, Medscape, Merck Sharp & Dohme, Mundipharma, Nestlé, Novartis, Novo Nordisk and Sanofi; and has received lecture fees from AstraZeneca, Bayer, Boehringer Ingelheim, Eli Lilly, Elo Water, Genzyme, Merck Sharp & Dohme, Medscape, Novartis, Novo Nordisk, PeerVoice, Sanofi, and Sciarc. The other authors declare that they have no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Antikainen, A.A., Haukka, J.K., Kumar, A. et al. Whole-genome sequencing identifies variants in ANK1, LRRN1, HAS1, and other genes and regulatory regions for stroke in type 1 diabetes. Sci Rep 14, 13453 (2024). https://doi.org/10.1038/s41598-024-61840-7