Abstract
Background
Small RNAs regulate gene expression in species across the tree of life. miRNAs, which impact a variety of cellular and physiological processes ranging from development and stress adaptation to host defense, are one of the best characterized classes of small RNA. Many miRNAs are produced from longer non-coding transcripts generated from host genes via a series of RNA cleavage reactions. The location of a small RNA within a host gene can shape the processing of the mature small RNA. For example, miRtrons, a class of miRNAs derived from host gene intronic sequence, are Drosha-independent and reliant on splicing for biogenesis. Relatedly, processing of a small RNA from an exon of a protein-coding mRNA may, in principle, destabilize the mRNA and compromise translation of the host gene. Prior to extensive transcriptome analysis, informatics analyses identified six human miRNAs embedded in exons of protein-coding genes, and experimental studies have characterized additional anecdotal examples. Still, whether protein-coding mRNAs encoding small RNAs represent an appreciable class of host genes, given the now recognized complexity of the transcriptome, is unclear.
Results
Our analysis finds 201 small RNAs (118 human and 83 mouse) encoded by expressed exons of protein-coding genes (5’-UTR, CDS, 3’-UTR). Forty-six of these cases (29 human and 17 mouse) are also present in MirGeneDB, which includes the most up-to-date miRNA classifications. Many of these small RNAs are poorly characterized, with 96% of the protein-coding host gene relationships identified here not previously known. Furthermore, the identification of nearly fifty human and mouse small RNAs embedded within coding exons of canonical ORFs suggests that overlapping hybrid genes might be more common than previously appreciated in higher organisms. Expression analysis for a subset of these small RNAs indicates that many display differential expression across human tissues, with the pattern correlating significantly with the expression of the candidate protein-coding host gene.
Significance
Overall, our analysis suggests that the number of protein-coding transcripts serving as host genes is greater than previously recognized. Our small RNA host gene classifications may serve as a resource to shed new light on small RNA biology, specific host genes, and gene regulation.
Background
Small RNAs like microRNAs (miRNAs) have key roles in gene regulation [1]. miRNAs are ~ 22 nucleotides (nt) in length and generally act by a post-transcriptional mechanism involving base-pairing interactions with their target mRNA. miRNA target sites are typically located in the 3’-UTRs of protein-coding mRNAs [2, 3] and referred to as miRNA response elements (MREs). miRNAs are abundant across eukaryotic genomes ranging from plants to animals. miRNA-like species are also encoded by fungal [4] and viral genomes [5,6,7]. The activities of other processed small RNAs such as tRNA fragments have also been associated with essential cellular activities including the repression of endogenous retroviruses [8] and metabolic regulation [9].
Canonical miRNA biogenesis is regulated via serial processing by two distinct RNases. miRNAs are initially produced as larger transcripts, termed primary miRNAs (pri-miRNAs), transcribed by either RNA Pol II or RNA Pol III. Many pri-miRNA encoding loci are termed host genes. pri-miRNAs are recognized and processed in the nucleus by the microprocessor complex consisting of the RNase III endonuclease Drosha and its cofactor, DGCR8 [10,11,12,13]. Drosha cleavage of the pri-miRNA produces the precursor miRNA (pre-miRNA), an ~ 80 base pair long RNA hairpin species with a two nucleotide 3’-overhang structure. Pre-miRNAs are exported to the cytoplasm by exportin 5 [14], where they are cleaved by the RNase DICER to generate miRNA duplexes 18–22 nts in length [15, 16]. Subsequently, one strand of the miRNA duplex, referred to as the guide strand, is loaded into an Argonaute (Ago) protein. The primary Argonaute protein in humans is Ago2. Ago2 loaded with a miRNA is defined as the RNA-induced silencing complex (RISC) [17, 18]. RISC is the effector miRNA complex that targets genes for silencing via translation inhibition or mRNA destabilization.
miRNA genes reside in both intergenic and intragenic regions of the genome [19, 20]. Intergenic miRNAs are transcribed by RNA polymerase II or III as independent transcription units. Intragenic miRNAs on the sense strand are frequently transcribed as part of the host gene. This bicistronic gene structure implies co-regulation of the miRNA and the host gene. A subclass of miRNAs that reside in introns on the sense strand, known as miRtrons, are produced via splicing independent of Drosha processing [21, 22]. There are also instances of pre-miRNAs overlapping exon–intron junctions that are produced by either Drosha or pre-mRNA splicing [23].
To date, exonic miRNAs are known to be encoded almost exclusively by exons of long non-coding RNAs [19, 20, 24]. Early miRNA studies [19, 20], which preceded the explosion of next-gen RNA-seq studies and consortia like GTEx [25], identified a dearth of exonic miRNAs in protein-coding genes. A subsequent analysis in 2013 identified four human and eleven mouse miRNAs (fifteen in total) encoded by exons of protein-coding genes [24] (Table 1). Furthermore, experimental studies have identified and characterized some miRNAs encoded by exons of protein-coding genes (Fig. 1). Examples include miR-198 encoded by the 3’-UTR of FSTL1 [26], miR-3618 encoded by the 5’-UTR of DGCR8 [27], miR-1306 in the first coding exon of DGCR8, and miR-147b encoded by the 3’-UTR of C15orf48, also known as MISTRAV [28, 29]. In general, small RNAs processed from protein-coding transcripts are especially interesting because their processing is predicted to destabilize the host mRNA, thereby inactivating any activities encoded by the intact transcript [26,27,28]. Collectively, the gene structure of these potential bicistronic transcripts and the experimental studies noted (Table 1) suggest post-transcriptional co-regulation of exonic miRNAs and their host genes.
Examples of miRNAs that reside in exons of protein-coding genes. A FSTL1 is a known miRNA protein-coding host gene that encodes primate-specific mir-198 in its 3’-UTR. Both gene products are associated with wound healing [26]. B The miRNA microprocessor cofactor DGCR8 is a known miRNA host gene that encodes a miRNA in its 5’-UTR (mir-3618) and first coding exon (mir-1306) [27]. Processing of these miRNAs is linked to the regulation of DGCR8 activity. C The C15orf48 locus is a known miRNA host gene, which encodes MISTRAV, a protein that shapes antiviral responses, and encodes mir-147b in its 3’-UTR [28]
Given the exponential growth of transcriptome data, we sought to identify small RNAs that reside in exons of protein-coding genes in the greatly expanded and curated transcript datasets. In human and mouse genomes, we find 201 small RNAs embedded within exons of protein-coding genes, of which 23% (46 small RNAs) display characteristics common to miRNAs based on MirGeneDB. The majority (96%) of the host gene-small RNA relationships have not been documented. We identify nearly fifty human and mouse small RNAs that reside in coding exons of protein-coding host genes. Interestingly, several of these candidate protein-coding host genes have established roles in immunity, like the major histocompatibility cofactor B2M [30, 31] and the antiviral factor ZAP [32, 33]. Our analysis generates a resource of putative host genes, including protein-coding transcripts, for human and mouse small RNAs.
Materials and methods
Bioinformatic analysis of exonic small RNAs
Genomic coordinate data were obtained from the UCSC Genome Browser Table Browser (https://genome.ucsc.edu/cgi-bin/hgTables; last accessed Aug. 07, 2024). The NCBI RefSeq [34] track was selected. GRCh38/hg38 and GRCm38/mm10 were selected for human and mouse, respectively. Small RNA genomic coordinates were obtained from miRBase 22 (https://mirbase.org/download/; last accessed Aug 07, 2024). gff3 files were downloaded for each species (Human: hsa.gff3 and Mouse: mmu.gff3). RefSeq mRNA data were used to map miRBase annotated pre-miRNAs to protein-coding genes. All the data were processed using pandas (Python library) and bash commands. BEDTools [35] was used to classify miRBase annotated pre-miRNAs as exonic, intronic, or intergenic and to determine their strandedness relative to the host gene. Any miRBase annotated pre-miRNA sequence completely overlapping with an exon was assigned to exon-derived small RNAs, whereas intron-derived small RNA assignment required all coordinates residing within an annotated intron. Non-coding transcripts were defined as mRNAs having the same coordinates for CDS start and CDS end (Fig. 2).
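The containment rules described above can be illustrated with a minimal Python sketch. It is not the deposited pipeline (which relies on BEDTools and pandas); the `Transcript` class, the `classify` function, the example coordinates, and the use of 0-based half-open intervals are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical, simplified RefSeq-like record (0-based, half-open coordinates)."""
    name: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exons: Tuple[Tuple[int, int], ...]  # sorted by start coordinate

    @property
    def is_coding(self) -> bool:
        # Non-coding transcripts have identical CDS start and CDS end coordinates.
        return self.cds_start != self.cds_end


def classify(pre_start: int, pre_end: int, pre_strand: str, tx: Transcript) -> str:
    """Classify one pre-miRNA interval relative to one transcript."""
    if pre_end <= tx.tx_start or pre_start >= tx.tx_end:
        return "intergenic"
    if pre_strand != tx.strand:
        return "antisense"
    coding = "coding" if tx.is_coding else "non-coding"
    # Exonic: the pre-miRNA lies completely within a single exon.
    if any(s <= pre_start and pre_end <= e for s, e in tx.exons):
        return f"exonic/{coding}"
    # Intronic: all coordinates lie within a single intron.
    introns = [(tx.exons[i][1], tx.exons[i + 1][0]) for i in range(len(tx.exons) - 1)]
    if any(s <= pre_start and pre_end <= e for s, e in introns):
        return f"intronic/{coding}"
    return f"junction/{coding}"


# Made-up transcript used only to exercise the rules.
demo = Transcript("demo_tx", "+", 1000, 5000, 1200, 4500,
                  ((1000, 1500), (2000, 2600), (4000, 5000)))
print(classify(2100, 2180, "+", demo))  # inside the second exon
print(classify(1600, 1680, "+", demo))  # inside the first intron
```

In this sketch a pre-miRNA that spans an exon-intron boundary is reported separately as `junction`, whereas the manual curation described below resolves such cases inclusively.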
Pipeline for assignment of small RNA and host-gene relationships. A Schematic of analysis integrating genomic annotations and small RNAs present in miRBase and MirGeneDB. Final output files are colored in blue, and files that are further processed are colored in red. B-D Schematics illustrating inclusive classification logic present in the analysis pipeline to identify as many instances as possible of small RNAs residing in exons of protein-coding mRNAs. See Supplementary file 1 for specific examples. B If the location of the pre-miR overlapped with exonic and intronic transcriptional units in the same orientation, the small RNA was assigned as exonic. C If the small RNA location overlapped with a locus that generated protein-coding and non-coding transcripts, the small RNA was assigned protein-coding. D If the small RNA overlapped with an alternative upstream or downstream exon of a transcribed unit, the small RNA was assigned exonic instead of intergenic
Each small RNA was assigned to only one class; however, class assignment was inclusive rather than exclusive with respect to identifying instances of small RNAs residing in exons of protein-coding mRNAs. As needed, additional curation was carried out to obtain small RNA class estimates when the host gene is produced from a locus that generates protein-coding and non-coding transcripts, which may also include upstream or downstream exons, as well as additional unique cases. Some scenarios and specific examples are noted here (Fig. 2B, C, D, Supplementary file 1). Instance 1: For a sense-strand small RNA in an intron of a protein-coding locus that produces coding and non-coding transcripts, the small RNA was assigned intronic and coding (example: hsa-mir-26b/CTDSP1). Instance 2: For a sense-strand small RNA in a downstream alternative exon of a non-coding transcript that shares some exonic overlap with protein-coding transcripts that lack the alternative exon, the small RNA was assigned non-coding and exonic (hsa-mir-6129/ZNF652). Instance 3: For a protein-coding locus that generates non-coding transcripts with additional upstream or downstream exons where the sense-strand small RNA resides in the upstream/downstream intron, the small RNA was assigned intronic and non-coding (hsa-mir-4786/NDUFA10/NR_136158.2). Instance 4: For a non-coding mRNA derived from a fusion transcript of two protein-coding genes where the small RNA is in an intron, the assignment was protein-coding and intronic (hsa-mir-628/DNAAF4-CCPG1/NR_037923.1).
To gain additional insights, the small RNA classes assigned using the miRBase entries were further analyzed using entries in MirGeneDB (mirgenedb.org). MirGeneDB [36,37,38,39] contains a more extensively curated set of miRNAs than miRBase, which has limitations regarding its usage for miRNA classifications as it includes instances of tRNA, rRNA, RNA derived from other classes, and potentially other by-products originating from transcriptional noise or RNA quality control issues [36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51]. All code is deposited in GitHub (https://github.com/tyronchang1/Exonic-microRNA-analysis).
Sequence analysis
Sequences were retrieved from NCBI and aligned using MUSCLE in Geneious Prime (2025.0.3; www.geneious.com). Sequence accession numbers from NCBI (last accessed Jan. 1, 2025) are included in supplemental files and figures. Sequence logos were generated using WebLogo (https://weblogo.berkeley.edu/logo.cgi). Gene diagrams and conservation tracks were downloaded from the UCSC genome browser (https://genome.ucsc.edu/index.html) and edited in Adobe Illustrator.
Retrieval of known disease variants for candidate protein-coding host genes
Variants for candidate protein-coding host genes were downloaded from ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/; last accessed Dec. 16, 2024) using the following filters: review status (multiple submitters) and molecular consequence (frameshift, missense, nonsense, splice site, and UTR). Our analysis and reporting only involved variants annotated as either pathogenic or likely pathogenic (Table 2, Supplementary File 6).
GTEx analysis of exon-derived human small RNAs present in MirGeneDB and their candidate protein-coding host genes
Expression data for both exonic small RNAs and their host genes were retrieved from the GTEx v10 bulk RNA-seq dataset via the GTEx Portal (https://gtexportal.org/home/downloads/adult-gtex/bulk_tissue_expression). All datasets were last accessed on May 26, 2025. miRNA expression: miRNA_TPM_matrix_PORTAL_2025_03_17.txt.gz. Protein-coding host gene expression: GTEx_Analysis_v10_RNASeQCv2.4.2_gene_tpm.gct. For downstream analysis, average TPM values for exonic small RNAs identified by our analysis that are present in MirGeneDB [39], and for their candidate protein-coding host genes, were computed across tissues using NumPy [52], followed by log₂ transformation. Data wrangling was performed with pandas (https://pandas.pydata.org/), and visualizations were generated with Matplotlib [53] and Seaborn [54].
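The averaging and log₂ transformation step can be sketched as follows. This is a plain-Python illustration rather than the NumPy/pandas code used in the study; the function name `mean_log2_tpm`, the sample and tissue labels, and in particular the `pseudocount` parameter are assumptions, since the text does not state whether or how zero TPM values were handled.

```python
import math
from collections import defaultdict


def mean_log2_tpm(tpm_by_sample, tissue_of_sample, pseudocount=1.0):
    """Average TPM per tissue across samples, then apply log2.

    pseudocount is an assumption of this sketch: it keeps log2 defined
    when the mean TPM is zero. Set it to 0.0 for a plain log2(mean TPM).
    """
    grouped = defaultdict(list)
    for sample, tpm in tpm_by_sample.items():
        grouped[tissue_of_sample[sample]].append(tpm)
    return {
        tissue: math.log2(sum(values) / len(values) + pseudocount)
        for tissue, values in grouped.items()
    }


# Made-up values for one gene across four hypothetical samples.
tpm = {"S1": 3.0, "S2": 5.0, "S3": 16.0, "S4": 16.0}
tissue = {"S1": "Liver", "S2": "Liver", "S3": "Spleen", "S4": "Spleen"}
print(mean_log2_tpm(tpm, tissue, pseudocount=0.0))
```

Averaging before the log transform, as described in the text, gives the log of the arithmetic mean; transforming each sample first and then averaging would instead give the log of the geometric mean.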
Linear regression analysis
Mean log₂TPM values for human small RNAs and candidate protein-coding host genes were calculated using NumPy [52]. Small RNAs analyzed were selected based on their expression profile in the GTEx data (Fig. 8), the identity of the host gene, and their presence in MirGeneDB (Supplementary File 4). Linear regression analyses for each small RNA–host gene pair were performed with the SciPy package [55]. p-values were obtained from two-sided t-tests implemented in SciPy. All plots were generated using Matplotlib and Seaborn.
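For orientation, the quantities reported for each small RNA–host gene pair can be sketched with an ordinary least-squares fit. The study used SciPy; the standard-library version below is only an illustration, and the function name `simple_linregress` and the example values are hypothetical. It returns the slope, intercept, R², and the t statistic for the null hypothesis of zero slope; the two-sided p-value follows from a t distribution with n − 2 degrees of freedom, which is not computed here.

```python
import math


def simple_linregress(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r_squared, t_stat), where t_stat tests
    the null hypothesis that the slope is zero (n - 2 degrees of freedom).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    r_squared = r * r
    if r_squared < 1.0:
        t_stat = r * math.sqrt((n - 2) / (1.0 - r_squared))
    else:
        t_stat = math.copysign(math.inf, r)  # perfect linear fit
    return slope, intercept, r_squared, t_stat


# Hypothetical mean log2 TPM values for a host gene (x) and a small RNA (y).
host = [1.0, 2.0, 3.0, 4.0]
small_rna = [2.1, 3.9, 6.2, 7.8]
print(simple_linregress(host, small_rna))
```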
Results
Several small RNAs reside in exons of protein-coding transcripts
While there are some known instances of miRNAs encoded by exons of protein-coding genes (Fig. 1A – C) [26,27,28], the overall number identified to date is limited to a few cases (Table 1). As similar transcripts with a bicistronic structure potentially serve as a resource to provide insights into specific host genes, small RNAs, and gene regulation, we set out to identify additional potential instances of protein-coding host genes encoding exonic small RNAs (Fig. 2). We started our analysis with miRBase [56], which contains well-studied and poorly characterized small RNAs. While miRBase is also known to contain entries that are derived from sources other than miRNAs, such as other subclasses of RNA like tRNA and rRNA, and transcriptional noise [36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51], we reasoned that subsequent filtering using MirGeneDB [39] could aid in discriminating miRNAs. MirGeneDB contains the most up-to-date curation of miRNAs. In addition, small RNAs processed from a larger protein-coding transcript could still be of general interest independent of the small RNA fate or activity post-processing, as they may point to an ill-defined regulatory element. To this end, we mapped all annotated human (hsa.gff3) and mouse (mmu.gff3) pre-miRNAs from miRBase to the corresponding genomes (Fig. 3A). When identifying host genes, small RNAs present in the miRBase dataset are referred to herein by their miRBase annotation, which may include a miR prefix.
Informatics pipeline to identify small RNAs encoded by exons of protein-coding transcripts. A Human and mouse small RNAs were downloaded from miRBase 22.1 (mirbase.org) [56] and mapped to the corresponding genome using BEDTools [35]. For additional analysis, miRNAs were retrieved from MirGeneDB [36,37,38,39]. B Intragenic and intergenic miRNAs were identified based on their overlap with RefSeq mRNAs. C Intragenic miRNAs were screened for their orientation relative to the host mRNA and retained for further processing. D Same-sense miRNAs were classified into coding and non-coding host-genes followed by subclassification into exonic versus intronic. E Identified exonic miRNAs mapping to protein-coding mRNAs were further analyzed in this study. Classifications including candidate host genes and transcript accession numbers are in Supplementary File 2 for human small RNAs from miRBase, Supplementary File 3 for mouse small RNAs from miRBase, Supplementary File 4 for human miRNAs from MirGeneDB, and Supplementary File 5 for mouse miRNAs from MirGeneDB. Host loci with multiple transcripts were filtered to only one transcript to obtain estimated host gene-miRNA class counts. Due to alternative splicing and multiple transcript architectures, classifications should be considered inclusive and not exclusive (see methods). Hs – Homo sapiens. Mm – Mus musculus
To map pre-miRNAs to protein-coding transcripts, we used RefSeq mRNA data [34]. RefSeq consists of ~ 200,000 human and ~ 200,000 mouse protein-coding and non-coding transcripts; many of which are alternatively-spliced mRNAs. To identify small RNAs potentially residing in protein-coding exons, we first classified human and mouse small RNAs from miRBase and MirGeneDB into separate successive groups based on their genomic location and strandedness relative to host genes: Group 1) intragenic (Fig. 3B), Group 2) intragenic and sense orientation relative to host gene (Fig. 3C), Group 3) sense orientation in protein-coding mRNAs (Fig. 3D). Group 4 was sense orientation in exons of protein-coding mRNAs with no overlap with host gene introns (Fig. 3D, E, Fig. 4); hereafter referred to as exonic small RNAs.
Numerous small RNAs, with a subset being Drosha-dependent, have overlapping coordinates with coordinates of exons for protein-coding mRNAs. A The distribution of 118 human small RNAs from miRBase (29 miRNAs from MirGeneDB, light blue) mapping to UTR and CDS exons of protein-coding mRNAs. B The distribution of 83 mouse small RNAs from miRBase (17 miRNAs from MirGeneDB, light blue) mapping to UTR and CDS exons of protein-coding mRNAs. The percent represents the number of small RNAs in that type of exon relative to the total number of small RNAs in exons of protein-coding genes. The number of miRNAs in each exon class is in parentheses. C Human small RNAs from miRBase, with those also present in MirGeneDB highlighted in light blue, identified by our analysis that map to exons of protein-coding mRNAs and display published evidence for Drosha-dependent or -independent processing [58, 59]. Pre-miRNAs are listed in alphabetical order relative to their host gene listed in parentheses. CDS – coding DNA sequence; UTR – untranslated region; hsa – Homo sapiens
The small RNA-host gene assignments that follow are presented with the percent and number of small RNA entries from miRBase followed by the number of entries overlapping with MirGeneDB (Fig. 3). The percentages are relative to the total number of starting human pre-miRNAs in miRBase, 1913. We found that of the 1913 human pre-miRNAs in miRBase, 79.4% (1519 small RNAs; 414 in MirGeneDB) reside in intragenic regions. In agreement with previous observations and suggestive of co-transcriptional regulation, more than half of all miRBase human pre-miRNAs (64.6%, 1235; 355 in MirGeneDB) are in the same orientation as the resident gene [19, 20] (Supplementary File 2, Supplementary File 4). Of the 1235 miRBase human small RNAs in intragenic regions in the same orientation, 904 (47.2% of all miRBase human pre-miRNAs; 205 in MirGeneDB) are located in annotated protein-coding transcripts, exonic plus intronic. Interestingly, ~ 6% (118) of all miRBase human small RNAs analyzed (29 in MirGeneDB) reside in exons of protein-coding transcripts in the same orientation. Consistently, our analysis using miRBase identified all five previously reported miRNAs residing in exons of protein-coding transcripts (hsa-miR-21, hsa-miR-147b, hsa-miR-198, hsa-mir-1306, hsa-mir-3618), with 96% of the host gene-small RNA relationships not previously well appreciated. Note that hsa-miR-198 and hsa-mir-3618 are not present in MirGeneDB.
The other human pre-miRNA/host gene subclass numbers identified are as follows (Fig. 3): intronic and coding transcripts [41.1%, 786 miRBase pre-miRNAs (176 in MirGeneDB)], exonic and non-coding mRNAs [4.3%, 83 miRBase pre-miRNAs (54 in MirGeneDB)], and intronic and non-coding mRNAs [9.6%, 183 miRBase pre-miRNAs (90 in MirGeneDB)]. Our analysis of the mouse genome identified comparable distributions of these small RNA-host gene relationship classes (Fig. 3B-D). Specifically, of the 1226 mouse pre-miRNAs in miRBase, 79% [968 pre-miRNAs (274 in MirGeneDB)] are intragenic, with 67.0% [822 miRBase pre-miRNAs (230 in MirGeneDB)] of all mouse small RNAs in the same orientation as the host gene (Supplementary File 3, Supplementary File 5). Notably, ~ 7% [83 miRBase pre-miRNAs (17 in MirGeneDB)] reside in exons of protein-coding genes. Thus, we have identified several new cases where the genomic coordinates of small RNAs overlap with coordinates for exons of protein-coding mRNAs.
Small RNAs map to both untranslated and coding exons of protein-coding genes
We next assessed where in protein-coding transcripts exonic human and mouse small RNAs are located: 5’-UTR, CDS, or 3’-UTR. We found that 30 (25.4%) human miRBase pre-miRNAs reside in 5’-UTR sequences (5 in MirGeneDB), 25 (21.2%) in CDS exons (4 in MirGeneDB), and 58 (49.2%) in 3’-UTR sequences (20 in MirGeneDB) (Fig. 4A, Supplementary File 2, Supplementary File 4). During this analysis, we also identified four pre-miRNAs overlapping the coding sequence and 3’-UTR (FAM89A/hsa-mir-1182, TBC1D17/hsa-mir-4750, ATF5/hsa-mir-4751, RPL28/hsa-mir-6805), and one instance of a pre-miRNA overlapping the 5’-UTR and coding sequence (HSP90B1/hsa-mir-3652); none of these instances were present in MirGeneDB. Host genes with pre-miRNAs residing in human CDS exons include factors associated with development, like HOXD1 (hsa-mir-7704), and stress adaptation, such as the HSP90 co-chaperone CDC37 (hsa-mir-1181) [57] (Fig. 5, Supplementary File 2, Supplementary File 4). Of these human exonic small RNAs in protein-coding host genes, 27% (32; 15 in MirGeneDB) display evidence in published work of Drosha-dependent processing [58, 59], and eight exonic miRNAs display evidence for processing independent of Drosha (0 in MirGeneDB) (Fig. 4C).
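Assigning an exonic pre-miRNA to the 5’-UTR, CDS, or 3’-UTR depends on the strand of the host transcript, because the genomic region upstream of the CDS start coordinate is the 5’-UTR on the plus strand but the 3’-UTR on the minus strand. A minimal sketch of that logic follows; the function name `exon_region`, the half-open coordinates, and the example values are assumptions, and the pre-miRNA is assumed to have already been classified as exonic in a protein-coding transcript.

```python
def exon_region(start, end, strand, cds_start, cds_end):
    """Label an exonic interval as 5'-UTR, CDS, 3'-UTR, or a combination.

    Coordinates are genomic, 0-based, half-open. The interval is assumed
    to lie within an exon of a protein-coding transcript.
    """
    if cds_start == cds_end:
        raise ValueError("transcript has no CDS (non-coding)")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    parts = []
    if start < cds_start:
        parts.append("left")    # genomic side before the CDS
    if start < cds_end and end > cds_start:
        parts.append("CDS")
    if end > cds_end:
        parts.append("right")   # genomic side after the CDS
    if strand == "+":
        names = {"left": "5'-UTR", "CDS": "CDS", "right": "3'-UTR"}
    else:
        names = {"left": "3'-UTR", "CDS": "CDS", "right": "5'-UTR"}
    labels = [names[p] for p in parts]
    if strand == "-":
        labels.reverse()        # report in 5'-to-3' transcript order
    return "/".join(labels)


# Made-up coordinates: CDS spans 500-2000.
print(exon_region(100, 180, "+", 500, 2000))
print(exon_region(1950, 2030, "-", 500, 2000))
```

Intervals that straddle a CDS boundary are reported as combined labels, mirroring the separate UTR/CDS overlap cases noted in the text.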
Examples of human and mouse small RNAs mapping to different classes of exons of protein-coding transcripts. A Examples of miRBase pre-miRNAs that map to 5’-UTR exons of protein-coding mRNAs. B miRBase pre-miRNAs that map to UTR and CDS sequences. C Examples of miRBase pre-miRNAs that map to 3’-UTR exons. D All identified miRBase human pre-miRNAs that map entirely within a coding (CDS) exon. E All identified miRBase mouse pre-miRNAs that map entirely within a coding (CDS) exon. Pre-miRNAs are listed in alphabetical order relative to their host gene in parentheses with those also present in MirGeneDB highlighted in light blue. Classifications including candidate protein-coding host genes and transcript accession numbers are in Supplementary File 2 and Supplementary File 4 for human small RNAs and Supplementary File 3 and Supplementary File 5 for mouse small RNAs. hsa – Homo sapiens; mmu – Mus musculus
For the 83 mouse miRBase pre-miRNAs in exons of protein-coding genes in the same orientation, we identified 26 (31.3%) pre-miRNAs in annotated 5’-UTR sequence (4 in MirGeneDB), 21 (25.3%) in CDS exons (4 in MirGeneDB), and 32 (38.6%) in 3’-UTR sequence (9 in MirGeneDB) (Figs. 4B and 5, Supplementary File 3, Supplementary File 5). Similar to human, we identified three mouse pre-miRNAs overlapping 3’-UTR and coding sequences (Clcn7/mmu-mir-12188, Scd2/mmu-mir-5114, Rps6ka4/mmu-mir-5046) and one miRNA overlapping the 5’-UTR and coding sequence (Tlcd1/mmu-mir-7653); none of these instances were present in MirGeneDB. Eight miRBase pre-miRNAs in exons across seven host genes are common to human and mouse [mir-147b (C15orf48/AA467197), mir-24-1 (AOPEP), mir-21 (VMP1), mir-3618 (DGCR8) with four in CDS: mir-1306 (DGCR8), mir-935 (CACNG8), mir-671 (CHPF2), and mir-1199 (MISP3)]. Five out of seven of the aforementioned small RNAs are present in MirGeneDB with mir-1199 (MISP3) and mir-3618 (DGCR8) not present.
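As a quick arithmetic check, the exon-class counts reported above for miRBase pre-miRNAs can be tallied to confirm that they sum to the 118 human and 83 mouse exonic small RNAs and reproduce the stated percentages. The dictionary and function names in this sketch are illustrative; the counts are taken from the text.

```python
# Counts of miRBase pre-miRNAs per exon class, as reported in the text.
EXON_CLASS_COUNTS = {
    "human": {"5'-UTR": 30, "CDS": 25, "3'-UTR": 58,
              "CDS/3'-UTR": 4, "5'-UTR/CDS": 1},
    "mouse": {"5'-UTR": 26, "CDS": 21, "3'-UTR": 32,
              "CDS/3'-UTR": 3, "5'-UTR/CDS": 1},
}


def class_percentages(counts):
    """Percentage of each exon class relative to all exonic small RNAs."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


for species, counts in EXON_CLASS_COUNTS.items():
    print(species, sum(counts.values()), class_percentages(counts))
```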
Next, we examined evolutionary conservation for select exonic small RNAs based on their candidate host gene using sequences available in the database and identified by BLAST. Here, we were interested in sequence conservation not only across the predicted 22-mer but also the predicted seed sequence (nucleotides: 2–8), which is known to mediate base pair interactions with target mRNAs at MREs [1, 2, 60, 61]. This analysis revealed varying levels of sequence conservation including turnover reflecting appreciated evolutionary histories for small RNAs like miRNAs [24, 62, 63]. First, we examined hsa-mir-10393 which is located in the 3’-UTR of B2M, a protein that is a key component of the MHC complex [30, 31, 64] (Fig. 6A, B). Our analysis showed that hsa-miR-10393-3p displays sequence conservation, particularly in the seed sequence, in mammals but has seemingly degenerated in mice (Fig. 6A). miR-21-5p, which has been well-studied [65, 66], resides in the 3’-UTR of the autophagy protein VMP1. In this instance, the entire miR-21-5p sequence seems well-conserved in vertebrates (Fig. 6C, D). Another interesting example is miR-10399 (Fig. 6E and F), which is encoded by the antiviral factor ZAP [32, 33]. hsa-mir-10399 is conserved in mammals but the predicted seed sequence has also diverged in mice but not in cat or armadillo.
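The seed comparison described above can be illustrated with a short sketch that extracts nucleotides 2–8 of a mature small RNA and compares them between two aligned sequences. The function names are hypothetical, the reference sequence is the published hsa-miR-21-5p sequence, and the "variant" is a made-up sequence with a single substitution outside the seed; it does not represent any real ortholog.

```python
def seed(mature_seq, first=2, last=8):
    """Return the seed (nucleotides first..last, 1-based inclusive)."""
    return mature_seq[first - 1:last]


def seed_conserved(reference, other):
    """True when both sequences share an identical seed."""
    return seed(reference) == seed(other)


def percent_identity(a, b):
    """Percent identity between two equal-length, gap-free sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return 100 * sum(x == y for x, y in zip(a, b)) / len(a)


HSA_MIR_21_5P = "UAGCUUAUCAGACUGAUGUUGA"
# Hypothetical variant: one substitution at position 15, outside the seed.
MADE_UP_VARIANT = "UAGCUUAUCAGACUAAUGUUGA"

print(seed(HSA_MIR_21_5P))
print(seed_conserved(HSA_MIR_21_5P, MADE_UP_VARIANT))
print(round(percent_identity(HSA_MIR_21_5P, MADE_UP_VARIANT), 1))
```

Separating seed identity from full-length identity makes it possible to distinguish cases where the mature sequence has drifted but the seed is intact from cases where the seed itself has diverged.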
Small RNAs identified that map to exons of protein-coding transcripts display evolutionary conservation. A mir-10393 maps to the 3’-UTR of B2M, a component of MHC. B Predicted mir-10393 hairpin. C mir-21 maps to the 3’-UTR of VMP1, an autophagy factor [65, 66]. D Predicted mir-21 hairpin structure. E mir-10399 maps to the 3’-UTR of ZAP, an antiviral factor [32, 33]. F Predicted mir-10399 hairpin. The most abundant small RNA strand in miRBase is colored red in the predicted hairpin. The 100 vertebrate conservation track (cons.) is derived from the UCSC genome browser. The sequence logo was generated using the aligned sequences shown. In the alignment, the small RNA sequence is in black and the predicted seed sequence (nucleotides 2–8) in red
Other small RNA-host gene relationships potentially of interest included host protein-coding genes that have known disease variants, as these genes are often well-studied and biomedically relevant (Table 2, Supplementary File 6). One candidate host gene we identified is PTEN-induced kinase 1 (PINK1), which is a master regulator of mitochondrial quality control and is associated with Parkinson’s disease [67, 68]. The PINK1 gene encodes the ill-defined hsa-mir-6084 in its first coding exon (Fig. 7A, B). miR-6084-3p, which is the most highly expressed strand according to miRBase, is highly conserved, including 100% sequence identity in the predicted seed sequence to marsupials. Another interesting candidate host gene is TMEM94, which encodes hsa-mir-6785 (Fig. 7C, D). TMEM94, also known as ERMA, is an ER-resident protein that is a P-type ATPase transporter important in Mg2+ uptake [69]. hsa-miR-6785-5p is highly conserved to Old World Monkeys. Mutations in TMEM94 are associated with neurodevelopmental delay in multiple unrelated individuals [70, 71]. Another noteworthy example is hsa-mir-4709, which is encoded in the 3’-UTR of the Niemann-Pick disease, type C2 (NPC2) gene (Fig. 7E, F). NPC disease is a lysosomal storage disorder [72]. miR-4709-3p is highly conserved to New World Monkeys. While gene ontology analysis of candidate protein-coding host genes did not reveal any enrichment for biological process, molecular function, or cellular component, the noted examples do indicate that several of the host genes have been implicated in immune functions and stress responses in published work. Altogether, we have identified numerous predicted pre-miRNAs, many of which are also in MirGeneDB, that map to exons of protein-coding transcripts, with a subset displaying evidence of Drosha-dependent processing and evolutionary conservation patterns consistent with a functional seed sequence.
Examples of small RNAs that map to exons of protein-coding genes implicated in human disease. A miR-6084 maps to the first coding exon of PINK1, a mitochondrial quality control factor mutated in Parkinson’s disease [67, 68, 74]. B Predicted mir-6084 hairpin. C mir-6785 maps to a coding exon of TMEM94, an ER protein that acts as a Mg2+ transporter [69], which is mutated in a type of rare syndromic intellectual disability [70, 71]. D Predicted mir-6785 hairpin. E mir-4709 maps to the 3’-UTR of NPC2, which encodes a protein that regulates cholesterol transport. Mutations in NPC2 are associated with a lysosomal storage disorder [72]. F Predicted mir-4709 hairpin. The most abundant mature small RNA strand in miRBase is colored red in the predicted hairpin panels. The 100 vertebrate conservation track (cons.) is derived from the UCSC genome browser. The sequence logo was generated using the sequences shown. In the alignment, the small RNA sequence is in black and the predicted seed sequence (nucleotides 2–8) in red. snubM – snub-nosed monkey
Expression of several exonic small RNAs across human tissues correlates with expression of their corresponding candidate protein-coding host genes
To further characterize the small RNAs of interest here, we analyzed the expression of exonic small RNAs for which data were available (25 human pre-miRNAs; 45 total miRNAs), restricted to instances present in both the miRBase and MirGeneDB subsets (Fig. 8, Supplementary File 4, Supplementary File 7). Specifically, we leveraged data from the Genotype-Tissue Expression (GTEx) resource, consisting of expression data across fifty-seven human tissues including bladder, pancreas, liver, and spleen, among others [25]. The analysis included both 5p and 3p strands if the data were present. This analysis showed that several of these exonic small RNAs are constitutively expressed in many tissues, such as hsa-mir-197-3p (GNAI3), hsa-mir-652-3p (TMEM164), and hsa-mir-423-5p (NSRP1), whereas others like hsa-mir-935 (CACNG8) displayed more variable expression across tissues.
RNA expression levels of mature exonic small RNAs across human tissues. Mean expression levels (log₂ TPM) of 45 mature exonic small RNAs present in both miRBase and MirGeneDB across 57 human tissues. Each tissue expression value represents the mean expression across multiple individuals. 5p/3p labels denote the small RNA arm of origin. Small RNAs with similar expression patterns are grouped by clustering. TPM: transcripts per million. Data downloaded for analysis from GTEx [25]
To examine whether exonic small RNA expression correlated with expression of their candidate host gene, we performed regression analysis using expression data from GTEx (Fig. 9, Supplementary File 8). A correlation would be consistent with the protein-coding gene serving as a precursor transcript for the small RNA. Specifically, we analyzed expression patterns for ten exonic small RNA and putative host gene pairs selected based on their tissue expression profile (Fig. 8, Supplementary File 7). For small RNAs where data were available for both 5p and 3p strands (8 out of 10), one strand was often expressed at noticeably higher levels than the other strand, a known hallmark of miRNAs. Six out of the ten small RNA-host gene pairs examined displayed statistically significant correlation with host gene expression [p < 0.01: hsa-mir-197-3p (GNAI3); p < 0.0001: hsa-mir-24-1-5p (AOPEP), hsa-mir-935 (CACNG8), hsa-mir-21-3p (VMP1), hsa-mir-147b-3p (C15orf48), and hsa-mir-149-5p (GPC1)]. This analysis suggests that a subset of the protein-coding genes identified here display an expression pattern consistent with the gene serving as a small RNA host gene.
Correlation between exonic small RNA and candidate protein-coding host gene expression across human tissues. Linear regression models depict the relationship between selected exonic small RNAs, which are present in both miRBase and MirGeneDB, and their host genes across 54 human tissues. Each point represents the mean log₂ TPM of both the small RNA and putative host gene in a given tissue. Small RNA strands are color-coded by arm of origin: blue for 5p, red for 3p, and black when unspecified. p-values and R² values for each regression are shown within each panel. TPM: transcripts per million.
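The per-pair regression described above can be sketched in Python as follows. This is a minimal example under stated assumptions, not the code used for Fig. 9: the function name `host_gene_regression`, the choice of the host gene as the predictor, the handling of missing values, and the toy numbers are all hypothetical. It relies on `scipy.stats.linregress`, which reports the slope, intercept, correlation coefficient, and a two-sided p-value for a simple linear fit.

```python
import numpy as np
from scipy.stats import linregress


def host_gene_regression(small_rna_log2_tpm, host_gene_log2_tpm):
    """Regress small RNA expression on host gene expression across tissues.

    Both inputs are 1-D sequences with one mean log2 TPM value per tissue,
    in the same tissue order. Tissues with a missing or non-finite value in
    either series are dropped (an assumption of this sketch).
    """
    x = np.asarray(host_gene_log2_tpm, dtype=float)
    y = np.asarray(small_rna_log2_tpm, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("inputs must be 1-D and of equal length")
    keep = np.isfinite(x) & np.isfinite(y)
    fit = linregress(x[keep], y[keep])
    return {
        "slope": fit.slope,
        "intercept": fit.intercept,
        "r_squared": fit.rvalue ** 2,
        "p_value": fit.pvalue,
        "n_tissues": int(keep.sum()),
    }


if __name__ == "__main__":
    # Invented values for five tissues; not GTEx data.
    host = [1.0, 2.0, 3.0, 4.0, 5.0]
    small_rna = [3.1, 4.9, 7.2, 8.8, 11.0]
    print(host_gene_regression(small_rna, host))
```

Because each pair is tested separately, a study examining many pairs would normally also consider a multiple-testing correction; whether one was applied here is not stated in the text.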
Discussion
Small RNAs like miRNAs are key players in gene regulation and their dysregulation is linked to a range of human diseases [73]. The biogenesis of miRNAs is associated with the nature of their transcriptional unit [20] and their location in it, whether it be exonic or intronic [21, 22]. The coding potential of the host gene has implications for exonic small RNA processing as exemplified by FSTL1/miR-198 [26], DGCR8/miR-1306 and miR-3618 [27], and MISTRAV(C15orf48)/miR-147 [28]. Previously, only a limited number of small RNAs that reside in exons of protein-coding genes [20, 24] have been reported (Table 1). Here, we have uncovered a total of 201 human and mouse small RNAs, including 46 instances in the stringently curated miRNA database MirGeneDB [36,37,38,39], whose genomic coordinates overlap those of exons of protein-coding genes (Fig. 3D, Supplementary File 2–5). Relatedly, 32% of these human exonic small RNAs show evidence for processing by Drosha (Fig. 4C). Eight of these exonic small RNAs are common to both human and mouse genomes. Our findings markedly increase the number of known exonic small RNA-protein-coding host gene relationships, with 96% of the cases not previously appreciated.
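The coordinate-overlap criterion mentioned above can be illustrated with a short Python sketch. It is a simplified stand-in and not the workflow used in this study. The `Interval` type, the function names, the BED-style half-open coordinates, the default requirement that both features lie on the same strand, and all coordinates and names in the example are assumptions introduced for illustration.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED-style; an assumption of this sketch)
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"
    name: str


def overlaps(a: Interval, b: Interval, same_strand: bool = True) -> bool:
    """True if the two intervals share at least one base."""
    if a.chrom != b.chrom:
        return False
    if same_strand and a.strand != b.strand:
        return False
    # Half-open intervals overlap when each starts before the other ends.
    return a.start < b.end and b.start < a.end


def exonic_small_rnas(
    small_rnas: List[Interval], exons: List[Interval], same_strand: bool = True
) -> List[Tuple[str, str]]:
    """Return (small RNA name, exon name) pairs whose coordinates overlap."""
    return [
        (rna.name, exon.name)
        for rna in small_rnas
        for exon in exons
        if overlaps(rna, exon, same_strand)
    ]


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    exons = [Interval("chr1", 150, 400, "+", "geneA_exon3")]
    rnas = [
        Interval("chr1", 100, 180, "+", "rna_overlapping"),
        Interval("chr1", 400, 480, "+", "rna_abutting"),
        Interval("chr1", 200, 280, "-", "rna_antisense"),
    ]
    print(exonic_small_rnas(rnas, exons))
```

The nested loop is quadratic and is only meant to make the overlap test explicit; genome-scale comparisons are normally done with sorted or indexed intervals.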
While many of the exonic small RNAs here display evidence supporting a function as miRNAs, many do not. For those exonic small RNAs that do not display all of the hallmarks of canonical miRNAs, it is possible that the small RNA has atypical miRNA features such that it is not classified as a miRNA. For instance, miR-198 (FSTL1), which has been shown experimentally to behave as a miRNA and was one of the initial examples of exonic miRNAs in protein-coding host genes [26], is not in MirGeneDB. Another possibility is that the small RNA, if active, functions by a non-miRNA mechanism. Alternatively, a non-active small RNA may be a by-product of processing of the host gene, perhaps for regulatory purposes to destabilize the encoding mRNA, similar to DGCR8 [27]. Reasonably, more detailed studies may reveal that some of the small RNAs represent false positives. Nevertheless, the expression patterns of many of the small RNAs (Fig. 8, Supplementary File 7) and candidate host genes (Fig. 9, Supplementary File 8) do suggest a relationship between a subset of the small RNAs and the identified genes.
Potential implications
Experimental studies of candidate bicistronic mRNAs could aid in uncovering novel post-transcriptional switches. In some cases, poor expression of certain exonic small RNAs under homeostasis may suggest that the production of the small RNA is regulated by additional signals. In those potential instances, inducible expression is congruent with a scenario where processing of the small RNA compromises the stability of the host protein-coding mRNA (Fig. 10). For example, although hsa-miR-147b-3p is generally expressed at low levels across human tissues (Figs. 8 and 9), this small RNA is well-appreciated to be a functional miRNA that is induced by stress cues [28] such as lipopolysaccharide (LPS) [29]. Data suggest that hsa-miR-147b-3p processing can compromise the expression of the encoding protein-coding host gene (MISTRAV/C15orf48). In particular, C15orf48/hsa-miR-147b highlights that both the miRNA and host mRNA can be co-expressed when induced but also that treatment with volatile triggers shifts the RNA levels largely to the miRNA [28]. In cases where these small RNAs act via base-pairing interactions, future work investigating the exonic small RNA targets may provide new insight into the function of the protein encoded by the host gene [28]. Finally, the identity of the candidate protein-coding host genes here may have implications for the interpretation of any phenotypes observed in loss-of-function experiments [28]. Altogether, our studies serve as a resource that may shed new light on the regulation of small RNAs and their relationships with host genes, and that may be relevant to small RNA researchers and investigators studying specific host genes.
Model for processing of exonic small RNAs from protein-coding host genes. Based on the gene structure and published studies of small RNAs encoded by exons of protein coding transcripts [26,27,28], processing of the small RNA may compromise the host gene mRNA and affect expression of the encoded protein
Data availability
Data analyzed were downloaded from miRBase 22 (https://mirbase.org/download/; last accessed Aug. 07, 2024), RefSeq via the UCSC Genome Browser Table Browser (https://genome.ucsc.edu/cgi-bin/hgTables; last accessed Aug. 07, 2024), and ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/; last accessed Dec. 16, 2024). Other data, including sequence accession numbers from NCBI (last accessed Jan. 1, 2025) and GTEx expression data (last accessed May 26, 2025), are provided within the manuscript, Figures (Figs. 6 and 7), and Supplementary files (Supplementary File 2–8).
References
Bartel DP. Metazoan MicroRNAs. Cell. 2018;173:20–51.
Lai EC. Micro RNAs are complementary to 3′ UTR sequence motifs that mediate negative post-transcriptional regulation. Nat Genet. 2002;30:363–4.
Brennecke J, Stark A, Russell RB, Cohen SM. Principles of MicroRNA–Target Recognition. PLoS Biol. 2005;3: e85.
Mathur M, Nair A, Kadoo N. Plant-pathogen interactions: MicroRNA-mediated trans-kingdom gene regulation in fungi and their host plants. Genomics. 2020;112:3021–35.
Pfeffer S, Zavolan M, Grässer FA, Chien M, Russo JJ, Ju J, et al. Identification of Virus-Encoded MicroRNAs. Science. 2004;304:734–6.
Grundhoff A, Sullivan CS. Virus-encoded microRNAs. Virology. 2011;411:325–43.
Skalsky RL, Cullen BR. Viruses, microRNAs, and Host Interactions. Annu Rev Microbiol. 2010;64:123–41.
Sharma U, Conine CC, Shea JM, Boskovic A, Derr AG, Bing XY, et al. Biogenesis and function of tRNA fragments during sperm maturation and fertilization in mammals. Science. 2016;351:391–6.
Chen Q, Yan M, Cao Z, Li X, Zhang Y, Shi J, et al. Sperm tsRNAs contribute to intergenerational inheritance of an acquired metabolic disorder. Science. 2016;351:397–400.
Denli AM, Tops BBJ, Plasterk RHA, Ketting RF, Hannon GJ. Processing of primary microRNAs by the Microprocessor complex. Nature. 2004;432:231–5.
Han J, Lee Y, Yeom K-H, Kim Y-K, Jin H, Kim VN. The Drosha-DGCR8 complex in primary microRNA processing. Genes Dev. 2004;18:3016–27.
Landthaler M, Yalcin A, Tuschl T. The Human DiGeorge Syndrome Critical Region Gene 8 and Its D. melanogaster Homolog Are Required for miRNA Biogenesis. Curr Biol. 2004;14:2162–7.
Gregory RI, Yan K, Amuthan G, Chendrimada T, Doratotaj B, Cooch N, et al. The Microprocessor complex mediates the genesis of microRNAs. Nature. 2004;432:235–40.
Yi R, Qin Y, Macara IG, Cullen BR. Exportin-5 mediates the nuclear export of pre-microRNAs and short hairpin RNAs. Genes Dev. 2003;17:3011–6.
Grishok A, Pasquinelli AE, Conte D, Li N, Parrish S, Ha I, et al. Genes and Mechanisms Related to RNA Interference Regulate Expression of the Small Temporal RNAs that Control C. elegans Developmental Timing. Cell. 2001;106:23–34.
Hutvágner G, McLachlan J, Pasquinelli AE, Bálint E, Tuschl T, Zamore PD. A Cellular Function for the RNA-Interference Enzyme Dicer in the Maturation of the let-7 Small Temporal RNA. Science. 2001;293:834–8.
Liu J, Carmell MA, Rivas FV, Marsden CG, Thomson JM, Song J-J, et al. Argonaute2 Is the Catalytic Engine of Mammalian RNAi. Science. 2004;305:1437–41.
Meister G, Landthaler M, Patkaniowska A, Dorsett Y, Teng G, Tuschl T. Human Argonaute2 Mediates RNA Cleavage Targeted by miRNAs and siRNAs. Mol Cell. 2004;15:185–97.
Saini HK, Griffiths-Jones S, Enright AJ. Genomic analysis of human microRNA transcripts. Proc Natl Acad Sci. 2007;104:17719–24.
Rodriguez A, Griffiths-Jones S, Ashurst JL, Bradley A. Identification of Mammalian microRNA Host Genes and Transcription Units. Genome Res. 2004;14:1902–10.
Okamura K, Hagen JW, Duan H, Tyler DM, Lai EC. The Mirtron Pathway Generates microRNA-Class Regulatory RNAs in Drosophila. Cell. 2007;130:89–100.
Ruby JG, Jan CH, Bartel DP. Intronic microRNA precursors that bypass Drosha processing. Nature. 2007;448:83–6.
Melamed Z, Levy A, Ashwal-Fluss R, Lev-Maor G, Mekahel K, Atias N, et al. Alternative Splicing Regulates Biogenesis of miRNAs Located across Exon-Intron Junctions. Mol Cell. 2013;50:869–81.
Meunier J, Lemoine F, Soumillon M, Liechti A, Weier M, Guschanski K, et al. Birth and expression evolution of mammalian microRNA genes. Genome Res. 2013;23:34–45.
Melé M, Ferreira PG, Reverter F, DeLuca DS, Monlong J, Sammeth M, et al. The human transcriptome across tissues and individuals. Science. 2015;348:660–5.
Sundaram GM, Common JEA, Gopal FE, Srikanta S, Lakshman K, Lunny DP, et al. ‘See-saw’ expression of microRNA-198 and FSTL1 from a single transcript in wound healing. Nature. 2013;495:103.
Han J, Pedersen JS, Kwon SC, Belair CD, Kim Y-K, Yeom K-H, et al. Posttranscriptional Crossregulation between Drosha and DGCR8. Cell. 2009;136:75–84.
Sorouri M, Chang T, Jesudhasan P, Pinkham C, Elde NC, Hancks DC. Signatures of host–pathogen evolutionary conflict reveal MISTR—A conserved MItochondrial STress Response network. PLoS Biol. 2020;18: e3001045.
Liu G, Friggeri A, Yang Y, Park Y-J, Tsuruta Y, Abraham E. miR-147, a microRNA that is induced upon Toll-like receptor stimulation, regulates murine macrophage inflammatory responses. Proc Natl Acad Sci. 2009;106:15819–24.
Cresswell P, Springer T, Strominger JL, Turner MJ, Grey HM, Kubo RT. Immunological Identity of the Small Subunit of HL-A Antigens and β2-Microglobulin and Its Turnover on the Cell Membrane. Proc Natl Acad Sci. 1974;71:2123–7.
Ortmann B, Androlewicz MJ, Cresswell P. MHC class I/β2-microglobulin complexes associate with TAP transporters before peptide binding. Nature. 1994;368:864–7.
Takata MA, Gonçalves-Carneiro D, Zang TM, Soll SJ, York A, Blanco-Melo D, et al. CG dinucleotide suppression enables antiviral defence targeting non-self RNA. Nature. 2017;550:124–7.
Gao G, Guo X, Goff SP. Inhibition of Retroviral RNA Production by ZAP, a CCCH-Type Zinc Finger Protein. Science. 2002;297:1703–6.
Pruitt KD, Tatusova T, Maglott DR. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005;33 suppl_1:D501–4.
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–2.
Fromm B, Billipp T, Peck LE, Johansen M, Tarver JE, King BL, et al. A Uniform System for the Annotation of Vertebrate microRNA Genes and the Evolution of the Human microRNAome. Annu Rev Genet. 2014;49:1–30.
Fromm B, Domanska D, Høye E, Ovchinnikov V, Kang W, Aparicio-Puerta E, et al. MirGeneDB 2.0: the metazoan microRNA complement. Nucleic Acids Res. 2019;48:D132–41.
Fromm B, Høye E, Domanska D, Zhong X, Aparicio-Puerta E, Ovchinnikov V, et al. MirGeneDB 2.1: toward a complete sampling of all major animal phyla. Nucleic Acids Res. 2021;50:D204–10.
Clarke AW, Høye E, Hembrom AA, Paynter VM, Vinther J, Wyrożemski Ł, et al. MirGeneDB 3.0: improved taxonomic sampling, uniform nomenclature of novel conserved microRNA families and updated covariance models. Nucleic Acids Res. 2024;53:D116–28.
Castellano L, Stebbing J. Deep sequencing of small RNAs identifies canonical and non-canonical miRNA and endogenous siRNAs in mammalian somatic tissues. Nucleic Acids Res. 2013;41:3339–51.
Chiang HR, Schoenfeld LW, Ruby JG, Auyeung VC, Spies N, Baek D, et al. Mammalian microRNAs: experimental evaluation of novel and previously annotated genes. Genes Dev. 2010;24:992–1009.
Jones-Rhoades MW. Conservation and divergence in plant microRNAs. Plant Mol Biol. 2012;80:3–16.
Ludwig N, Becker M, Schumann T, Speer T, Fehlmann T, Keller A, et al. Bias in recent miRBase annotations potentially associated with RNA quality issues. Sci Rep. 2017;7:5162.
Langenberger D, Bartschat S, Hertel J, Hoffmann S, Tafer H, Stadler PF. MicroRNA or Not MicroRNA? Berlin Heidelberg: Springer; 2011. p. 1–9.
Meng Y, Shao C, Wang H, Chen M. Are all the miRBase-registered microRNAs true? RNA Biol. 2012;9:249–53.
Tarver JE, Donoghue PCJ, Peterson KJ. Do miRNAs have a deep evolutionary history? BioEssays. 2012;34:857–66.
Taylor RS, Tarver JE, Hiscock SJ, Donoghue PCJ. Evolutionary history of plant microRNAs. Trends Plant Sci. 2014;19:175–82.
Wang X, Liu XS. Systematic Curation of miRBase Annotation Using Integrated Small RNA High-Throughput Sequencing Data for C. elegans and Drosophila. Front Genet. 2011;2:25.
Axtell MJ, Meyers BC. Revisiting Criteria for Plant MicroRNA Annotation in the Era of Big Data. Plant Cell. 2018;30:272–84.
Guo Z, Kuang Z, Wang Y, Zhao Y, Tao Y, Cheng C, et al. PmiREN: a comprehensive encyclopedia of plant miRNAs. Nucleic Acids Res. 2019;48:D1114–21.
Fromm B, Keller A, Yang X, Friedlander MR, Peterson KJ, Griffiths-Jones S. Quo vadis microRNAs? Trends Genet. 2020;36:461–3.
Harris CR, Millman KJ, van der Walt SJ, Gommers R, Virtanen P, Cournapeau D, et al. Array programming with NumPy. Nature. 2020;585:357–62.
Hunter JD. Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering. 2007;9:90–5.
Waskom M. seaborn: statistical data visualization. J Open Source Softw. 2021;6:3021.
Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 2020;17:261–72.
Kozomara A, Birgaoanu M, Griffiths-Jones S. miRBase: from microRNA sequences to function. Nucleic Acids Res. 2019;47:D155–62.
Stepanova L, Leng X, Parker SB, Harper JW. Mammalian p50Cdc37 is a protein kinase-targeting subunit of Hsp90 that binds and stabilizes Cdk4. Genes Dev. 1996;10:1491–502.
Kim B, Jeong K, Kim VN. Genome-wide Mapping of DROSHA Cleavage Sites on Primary MicroRNAs and Noncanonical Substrates. Mol Cell. 2017;66:258-269.e5.
Kim K, Baek SC, Lee Y-Y, Bastiaanssen C, Kim J, Kim H, et al. A quantitative map of human primary microRNA processing sites. Mol Cell. 2021;81:3422-3439.e11.
Chen L-L, Kim VN. Small and long non-coding RNAs: Past, present, and future. Cell. 2024;187:6451–85.
Kim H, Lee Y-Y, Kim VN. The biogenesis and regulation of animal microRNAs. Nat Rev Mol Cell Biol. 2025;26:276–96.
Lu J, Shen Y, Wu Q, Kumar S, He B, Shi S, et al. The birth and death of microRNA genes in Drosophila. Nat Genet. 2008;40:351–5.
Nozawa M, Miura S, Nei M. Origins and Evolution of MicroRNA Genes in Drosophila Species. Genome Biol Evol. 2010;2:180–9.
Vitiello A, Potter TA, Sherman LA. The Role of β2-Microglobulin in Peptide Binding by Class I Molecules. Science. 1990;250:1423–6.
Wang C, Peng R, Zeng M, Zhang Z, Liu S, Jiang D, et al. An autoregulatory feedback loop of miR-21/VMP1 is responsible for the abnormal expression of miR-21 in colorectal cancer cells. Cell Death Dis. 2020;11:1067.
Ribas J, Ni X, Castanares M, Liu MM, Esopi D, Yegnasubramanian S, et al. A novel source for miR-21 expression through the alternative polyadenylation of VMP1 gene transcripts. Nucleic Acids Res. 2012;40:6821–33.
Narendra DP, Jin SM, Tanaka A, Suen D-F, Gautier CA, Shen J, et al. PINK1 Is Selectively Stabilized on Impaired Mitochondria to Activate Parkin. PLoS Biol. 2010;8: e1000298.
Valente EM, Abou-Sleiman PM, Caputo V, Muqit MMK, Harvey K, Gispert S, et al. Hereditary Early-Onset Parkinson’s Disease Caused by Mutations in PINK1. Science. 2004;304:1158–60.
Vishnu N, Venkatesan M, Madaris TR, Venkateswaran MK, Stanley K, Ramachandran K, et al. ERMA (TMEM94) is a P-type ATPase transporter for Mg2+ uptake in the endoplasmic reticulum. Mol Cell. 2024;84:1321-1337.e11.
Stephen J, Maddirevula S, Nampoothiri S, Burke JD, Herzog M, Shukla A, et al. Bi-allelic TMEM94 Truncating Variants Are Associated with Neurodevelopmental Delay, Congenital Heart Defects, and Distinct Facial Dysmorphism. Am J Hum Genet. 2018;103:948–67.
Al-Hamed MH, Alsahan N, Tulbah M, Kurdi W, Ali WI, Sayer JA, et al. Fetal Anomalies Associated with Novel Pathogenic Variants in TMEM94. Genes. 2020;11:967.
Heras ML, Szenfeld B, Ballout RA, Buratti E, Zanlungo S, Dardis A, et al. Understanding the phenotypic variability in Niemann-Pick disease type C (NPC): a need for precision medicine. npj Genom Med. 2023;8:21.
Mendell JT, Olson EN. MicroRNAs in Stress Signaling and Human Disease. Cell. 2012;148:1172–87.
Willemsen J, Neuhoff M-T, Hoyler T, Noir E, Tessier C, Sarret S, et al. TNF leads to mtDNA release and cGAS/STING-dependent interferon responses that support inflammatory arthritis. Cell Rep. 2021;37: 109977.
Acknowledgements
We thank Dr. Steve Baker, Jessica Alvarez, and John F. McCormick for comments on the manuscript.
Funding
This study in the Hancks lab was supported by 1R35GM142689-01 to D.C.H. T.C. was supported, in part, by a National Institutes of Health Immunology Training Grant No. 2T32AI005284-41A1.
Author information
Contributions
Conceptualization, T.C. and D.C.H.; Data curation, T.C. and D.C.H.; Validation, T.C. and D.C.H.; Investigation, T.C. and D.C.H.; Methodology, T.C. and D.C.H.; Formal analysis, T.C. and D.C.H.; Project administration, D.C.H.; Writing – Original Draft, T.C. and D.C.H.; Writing – Review & Editing, T.C. and D.C.H.; Visualization, T.C. and D.C.H.; Supervision, D.C.H.; Funding Acquisition, D.C.H.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chang, T., Hancks, D.C. A subclass of small RNAs is encoded by exons of protein-coding genes. BMC Genomics 26, 827 (2025). https://doi.org/10.1186/s12864-025-11982-3