Introduction

Long non-coding RNAs (lncRNAs) are transcripts that are usually ≥200 bp long, lack coding potential, and are primarily species-specific1,2. LncRNAs play a crucial role in regulating gene expression via diverse mechanisms in many complex developmental processes and stress responses3,4,5,6. Based on their genomic position and orientation relative to proximal protein-coding genes (PCGs), lncRNAs are classified into long intergenic non-coding RNAs (lincRNAs), long non-coding natural antisense transcripts (lncNATs), sense-overlapping lncRNAs, and intronic lncRNAs4,7,8. Among them, lincRNAs represent over 50% of the lncRNA population9. LincRNAs share features with mRNAs, such as capping at the 5′ end, polyadenylation at the 3′ end, and splicing9,10. These characteristics allow lincRNAs to be identified using transcriptome-wide approaches (e.g., RNA-seq) designed to monitor mRNAs1,9,10,11. For instance, the FANTOM project identified 13,105 lincRNAs among 19,175 functional lncRNAs in humans. However, the human lincRNA repertoire could be close to 100,000 based on lincRNAs found in developmental-stage, tissue-type, and disease-specific RNA-seq datasets in humans and rodents12. The GENCODE consortium (ENCODE project) also released 14,880 evidence-based lncRNA transcripts in humans13.

The different roles of lncRNAs in plants have only recently been investigated. With the availability of high-throughput technologies, lncRNAs were detected in various tissues and growth conditions in Arabidopsis8,14,15. However, only a few lncRNAs have been functionally characterized. For instance, the MAS and SVALKA lncRNAs were implicated in the response to cold by governing the expression of their associated PCGs encoding MADS AFFECTING FLOWERING4 (MAF4) and C-repeat/dehydration-responsive element binding factor (CBF), respectively8,16. In addition to directly regulating PCG expression, some lincRNAs can act as microRNA (miRNA) decoys (target mimics) in post-transcriptional gene regulation. For instance, lincRNAs can compete with mRNAs for binding to specific miRNAs, thus interfering with the cleavage or translational repression of endogenous mRNA targets. However, the repertoire of plant lincRNAs and their involvement in gene regulation and plant–microbe responses still need to be better understood11,17.

Besides Arabidopsis, numerous lncRNAs were identified in maize (1724), rice (98), Populus trichocarpa (2542), foxtail millet (19), and switchgrass (1597) in response to drought stress5,6,18,19,20. In potatoes, ~2897 lincRNAs were identified during tuber sprouting21. Only a few studies have examined lncRNAs in response to biotic stress in plants. A strand-specific RNA-seq approach was employed to identify long non-coding natural antisense transcripts (lncNATs) responsive to Fusarium oxysporum in Arabidopsis. Intriguingly, lncNATs were found in ~20% of the Arabidopsis transcripts, suggesting a crucial role of lncRNAs in governing the expression level of the PCGs22. Likewise, ~3181 lncRNAs were identified in Sclerotinia sclerotiorum-infected Brassica napus, and a role in governing the response to biotic stress was suggested23. LncRNAs from tomato plants infected with yellow leaf curl virus (YLCV) were identified, and their plausible role as competing endogenous target mimics (eTMs) against miRNAs has been determined24. In potatoes, ~17 lncRNAs were identified in response to infection with Pectobacterium carotovorum (P. carotovorum) (Pcb1692)17. In another study, ~1565 lncRNAs were identified in response to YLCV infection in tomato24. These results suggest that lncRNAs may play an essential role in diverse biological processes, including the host immune response against pathogen invasion, by governing the expression of stress-responsive PCGs through orchestrated regulatory networks.

Potato is a commercially important vegetable crop providing the bulk of calorie intake25,26. However, potato productivity is threatened by emerging diseases such as zebra chip (ZC) disease, which is associated with the phloem-limited bacterium Candidatus Liberibacter solanacearum (CLso) and is transmitted by the potato psyllid, Bactericera cockerelli (B. cockerelli). In this study, we characterized the long intergenic non-coding RNA (lincRNA) landscape of potato and studied the role of lincRNAs in mediating PCG expression and defense responses to CLso.

Results

Genome-wide identification of potato lincRNAs

To identify lincRNAs and understand their function during the progression of ZC disease, we performed RNA-seq of healthy control (CT) potato plants and of plants infested with psyllids carrying CLso [CLso(+)] or psyllids without CLso [CLso(−)] at 7, 14, and 21 days after infection (DAI). In total, ~298 million paired-end reads were generated from the temporal RNA-seq datasets (nine conditions with three biological replicates each), corresponding to ~11.4 million paired-end reads per biological replicate. The raw sequence data were subjected to quality filtering and mapped to the reference potato genome (Solanum tuberosum (S. tuberosum) v4.04). About 99% of the reads were of high quality, and 54–61% mapped uniquely to the reference genome27 (Supplementary Table 1). Transcripts were assembled using the Read Mapping and Transcript Assembly workflow28, and lincRNAs were identified using Evolinc29. Putative lincRNAs expressed below 0.1 transcripts per million (TPM) were considered artifacts and removed from further analysis.
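
The expression filter described above can be illustrated with a minimal sketch. It assumes that TPM is computed from raw counts and transcript lengths in the standard way and that a transcript is retained if it reaches the threshold in at least one sample; the latter rule, the function names, and the toy data layout are assumptions for illustration and are not taken from the actual workflow.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM using transcript lengths in bp."""
    # Length-normalised rate per transcript, then scaled to sum to one million.
    rates = {t: counts[t] / lengths_bp[t] for t in counts}
    total = sum(rates.values())
    return {t: rate / total * 1e6 for t, rate in rates.items()}


def filter_low_expression(tpm_by_sample, min_tpm=0.1):
    """Return transcripts reaching min_tpm in at least one sample (assumed rule)."""
    kept = set()
    for sample_values in tpm_by_sample.values():
        kept.update(t for t, value in sample_values.items() if value >= min_tpm)
    return kept
```

For example, a transcript with 0.05 and 0.08 TPM in two samples would be discarded, whereas one with 2.0 TPM in either sample would be kept.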

We identified a total of ~4397 lincRNAs from the nine potato samples representing healthy (CT), CLso(+), and CLso(−) conditions at 7, 14, and 21 DAI (Supplementary Table 1). Previous studies showed that lincRNAs are mostly species-specific9,30,31. For instance, only ~12% of human lincRNAs show sequence conservation with non-human species32. Consistent with this, when we compared potato lincRNAs with those of Arabidopsis and rice (downloaded from the CANTATAdb v2.0 database), we observed poor inter-species conservation (<1%). In contrast, ~35% of potato lincRNAs were conserved with the lncRNAs of other Solanaceous plants (Fig. 1a; Supplementary Data 1a). Moreover, the 4397 lincRNAs identified in this study substantially outnumber the lincRNAs previously reported under temporal infection with P. carotovorum subsp. brasiliense in tolerant and susceptible potato cultivars17. By comparison, about 3175 lncRNAs were detected in the apical buds of potato tubers21. Therefore, this study expands the potato lincRNA repertoire to include novel ones perturbed during the biotic stress response.

Fig. 1: Identification and characteristics of potato lincRNAs.
figure 1

a A Venn diagram compares potato lncRNAs available in the CANTATAdb v2.0 database with those detected in the current study. b The distribution of lincRNAs in the 12 potato chromosomes is shown in a pie chart. The fraction (%) of the highest and least number of lincRNAs is depicted. c Lengthwise distribution of the lincRNAs and comparison with protein-coding genes (PCGs) is shown via kernel density plot.

Genomic attributes of potato lincRNAs

The identified lincRNAs were distributed across all 12 potato chromosomes (Chr), with the most on Chr 1 (12.66%) and the fewest on Chr 5 (5.93%) (Fig. 1b; Supplementary Data 1b). The transcript length of the lincRNAs ranged from 201 to 5095 bp, and ~95% of them were shorter than 1 kb. The length distributions showed that lincRNAs (average length 491 bp) were significantly shorter than PCGs (average length 1415 bp) (p value < 0.05, Welch’s two-sided t-test) (Fig. 1c; Supplementary Data 1c). We also analyzed the structural organization of the lincRNAs. In general, lincRNAs were predominantly mono- or di-exonic, whereas >50% of PCGs contained three or more exons (Fig. 2a; Supplementary Data 2). Consistent with other plant species, including Arabidopsis, maize, and soybean (Glycine max)3,7,8,11, the proportion of potato lincRNAs containing a single exon (~58%) was higher than that of lincRNAs containing multiple exons.
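
For readers unfamiliar with Welch’s test, the statistic used for the length comparison can be sketched as follows. This is an illustrative implementation using only the Python standard library; it returns the t statistic and the Welch–Satterthwaite degrees of freedom but not the p value, and the function name is hypothetical rather than part of the analysis code.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's unequal-variance t statistic and approximate degrees of freedom."""
    # Squared standard errors of the two sample means.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

Applied to lists of lincRNA and PCG transcript lengths, a strongly negative t would indicate that lincRNAs are shorter on average; the two-sided p value would then be read from a t distribution with df degrees of freedom.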

Fig. 2: Distribution of exons and orientation of potato lincRNAs.
figure 2

a The distribution of exons in lincRNAs and comparison with PCGs is shown in bar plot. b Types of lincRNAs based on orientation concerning their proximal PCGs are graphically depicted.

Previous studies suggested that lincRNAs can influence the expression of their proximal PCGs10,17. We therefore analyzed PCGs located within 50 kb up- or downstream of the lincRNAs. The average distance between a lincRNA and its nearest PCG was ~25 kb. About 89% of the lincRNAs were located within 50 kb of their nearest PCG, and we refer to these genes as proximal PCGs hereafter. Further, we analyzed the orientation of the lincRNAs relative to their proximal PCGs. About half of the lincRNAs were oriented in the same direction as their proximal PCGs (~49%) and about half in the opposite direction (~51%). Among the ~51% of lincRNAs oriented in the opposite direction, the convergent type (~36%) represented a larger fraction than the divergent type (~15%) (Fig. 2b). A previous study also showed a higher representation of convergently oriented lincRNAs with respect to their associated proximal PCGs33.
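
The proximity and orientation categories can be made concrete with a short sketch. It assumes non-overlapping features described by chromosome, 1-based coordinates, and strand; the class, function names, and returned labels are hypothetical. Convergent pairs are taken to be opposite-strand neighbours transcribed toward each other, and divergent pairs those transcribed away from each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    chrom: str
    start: int   # 1-based, inclusive
    end: int
    strand: str  # "+" or "-"


def gap(a, b):
    """Approximate distance in bp between two features (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def orientation(linc, pcg, max_gap=50_000):
    """Classify a lincRNA relative to a PCG within max_gap bp."""
    if linc.chrom != pcg.chrom or gap(linc, pcg) > max_gap:
        return "not proximal"
    if linc.strand == pcg.strand:
        return "same"
    # Opposite strands: the strand of the left-most feature decides the type.
    left = linc if linc.start <= pcg.start else pcg
    return "convergent" if left.strand == "+" else "divergent"
```

With this definition, a plus-strand lincRNA lying upstream (to the left) of a minus-strand PCG is convergent, whereas swapping the two strands makes the pair divergent.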

Functional relevance of potato lincRNAs as plausible miRNA decoys

Some lincRNAs can act as miRNA decoys by mimicking target mRNA sequences, thereby protecting the target mRNAs from degradation34,35,36. For instance, in Arabidopsis, the lincRNA Induced by Phosphate Starvation1 (IPS1) serves as a decoy for ath-miR399. It interferes with the binding of ath-miR399 to its primary target, the PHO2 transcript, and rescues it from post-transcriptional silencing, thereby governing phosphate homeostasis37. However, lincRNAs that may act as miRNA decoys in potatoes have not been explored. The 4397 potato lincRNAs identified in this study were compared for sequence homology against the known potato mature miRNA sequences in the CANTATAdb v2.0 database. Six lincRNAs (lincRNA.ID.00006654, lincRNA.ID.00002569, lincRNA.ID.00018937, lincRNA.ID.00004487, lincRNA.ID.00010417, and lincRNA.ID.00012834) were predicted to harbor sequences complementary to known miRNAs and could serve as potential miRNA decoys (target mimics). Among them, lincRNA.ID.00006654 and five PCGs harbored binding sites of stu-miR482a-5p. Likewise, lincRNA.ID.00012834 along with three PCGs, and lincRNA.ID.00002569, lincRNA.ID.00018937, lincRNA.ID.00004487, and lincRNA.ID.00010417 each with two PCGs, harbored binding sites of the same miRNAs (Supplementary Fig. 1a; Supplementary Data 3, 8). The PCGs encoded an ATP binding protein, expansin, DNA repair protein, multidrug resistance protein, carboxylic ester hydrolase, ubiquitin ligase, methyladenine DNA glycosylase, pentatricopeptide repeat protein, and potassium transporter.

Of the six, two lincRNAs (lincRNA.ID.00002569 and lincRNA.ID.00004487) were downregulated in CLso(+) infected tissues at 21 DAI based on the transcriptome analysis (Supplementary Fig. 1b; Supplementary Data 3). However, only one, lincRNA.ID.00002569, showed significant downregulation by RT-qPCR (Supplementary Fig. 1c). This lincRNA had complementarity to the potato miRNA stu-miR8005a and sequence similarity to its target PCGs: PGSC0003DMT400008676 (encoding Rad50 ATPase) and PGSC0003DMT400011415 (encoding a multidrug resistance pump protein). The RT-qPCR results also showed that the downregulation of lincRNA.ID.00002569 in CLso(+) infection paralleled the downregulation of PGSC0003DMT400011415 but not PGSC0003DMT400008676 (Supplementary Fig. 1; Supplementary Data 3), suggesting that lincRNA.ID.00002569 could act as a potential miRNA decoy protecting PGSC0003DMT400011415.

Differentially expressed potato lincRNAs in response to CLso and psyllid

LincRNAs are important in regulating diverse biological processes via complex regulatory networks with PCGs. To examine the role of lincRNAs under pathogen infection, we identified sets of differentially expressed lincRNAs in response to psyllids without CLso [CLso(−)/CT], psyllids carrying CLso [CLso(+)/CT], and the pathogen specifically [CLso(+)/CT]/[CLso(−)/CT] at 7, 14, and 21 DAI. A total of 775 unique lincRNAs were differentially expressed in at least one of the nine comparisons (Supplementary Data 4). A substantially higher number of lincRNAs were differentially expressed at 21 DAI, most under the [CLso(+)/CT] condition followed by the [CLso(+)/CT]/[CLso(−)/CT] condition (Fig. 3a; Supplementary Data 4). Furthermore, we performed principal component analysis (PCA) of the nine sets of differentially expressed lincRNAs. The first component (PC1) explained up to 88.4% of the variation. This variation is unlikely to reflect outliers or poor reproducibility among the biological replicates, because the PCA was performed on differential expression values in the log2 scale. Moreover, variation due to time of sampling is unlikely, as we observed a distinct and diverse differential expression profile of lincRNAs specifically at 21 DAI (Fig. 3b). The results suggest a dynamic and plausible role of lincRNAs in response to CLso and psyllid infestation at the terminal stage of ZC disease in potato.

Fig. 3: Differential expression profiles of lincRNAs modulated during Candidatus Liberibacter solanacearum (CLso) infection.
figure 3

a Differential expression profile of lincRNAs under CLso(−)/CT, CLso(+)/CT, and CLso(+)/CLso(−) conditions at 7 days after infection (DAI), 14 DAI, and 21 DAI is shown via heatmap. The scale represents differential expression in log2 fold change. b The similarity/difference among the nine sets of differentially expressed lincRNAs described in (a) is shown via a PCA plot. Each data point represents the differential expression status of individual lincRNA.

Further, we determined the lincRNAs that were specifically up- or downregulated among the three sets of comparisons [CLso(−)/CT, CLso(+)/CT, and CLso(+)/CLso(−)] within each stage of infection. For instance, at 7 DAI, 8 and 11 lincRNAs were specifically up- and downregulated, respectively, under the CLso(−)/CT condition, suggesting a psyllid-specific response. Likewise, four and six lincRNAs were up- and downregulated, respectively, under the CLso(+)/CT condition at 7 DAI, suggesting either a psyllid- or CLso-specific response. However, only one lincRNA showed upregulation under the CLso(+)/CLso(−) condition at 7 DAI (Fig. 4a; Supplementary Data 4), suggesting a CLso-specific response. Similarly, a small number of differentially expressed lincRNAs specific to each of the three comparisons were detected at 14 DAI (Fig. 4b; Supplementary Data 4). At 21 DAI, the number of specifically up- and downregulated lincRNAs was substantially higher under CLso(+)/CT [up: 133 and down: 161], followed by the CLso(+)/CLso(−) condition [up: 20 and down: 34] (Fig. 4c; Supplementary Data 4). These results suggest a dynamic regulation of lincRNAs in response to CLso infection as ZC disease progresses, most prominently at 21 DAI. Furthermore, we examined the PCGs located in the proximal regions of the lincRNAs specifically up- and downregulated under the CLso(+)/CT condition at 21 DAI, as they constitute a major fraction (294) of the differentially expressed lincRNAs. Intriguingly, PCGs encoding TFs (HBP and HAT), signal transduction components (auxin/ethylene signaling genes and serine/threonine protein kinase), and proteins involved in stress responses, including cytochrome P450 monooxygenase CYP736B, disease resistance protein BS2, glutathione S-transferase (GST), lipoxygenase, resistance protein PSH-RGH6, and SAUR family protein, were detected in the proximal regions of the differentially expressed lincRNAs. The dynamic expression of the lincRNAs may influence the regulation of these PCGs and could impact disease tolerance.
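
The notion of a lincRNA being specific to one comparison corresponds to a simple set difference, as in a Venn diagram. The sketch below is illustrative only; the function name and the toy identifiers in the usage note are hypothetical and do not correspond to lincRNAs from this study.

```python
def specific_sets(sets_by_comparison):
    """For each comparison, return the members found in no other comparison."""
    result = {}
    for name, members in sets_by_comparison.items():
        # Union of every other comparison's members.
        others = set().union(
            *(s for other, s in sets_by_comparison.items() if other != name)
        )
        result[name] = members - others
    return result
```

For instance, given upregulated sets {"A", "B"} for CLso(−)/CT, {"B", "C", "D"} for CLso(+)/CT, and {"D"} for CLso(+)/CLso(−), the comparison-specific members would be {"A"}, {"C"}, and the empty set, respectively.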

Fig. 4: Comparison of differentially expressed lincRNAs within each landmark stage.
figure 4

Several common and unique lncRNAs exhibiting differential expression under CLso(−)/CT, CLso(+)/CT, and CLso(+)/CLso(−) conditions at 7 DAI (a), 14 DAI (b), and 21 DAI (c) are depicted via Venn diagrams. Labels on top (black), bottom left (red), and bottom right (blue) indicate the number of differentially expressed, upregulated, and downregulated lncRNAs, respectively.

Transcriptional regulatory networks of potato lincRNAs and protein-coding genes

The interactions of lincRNAs and PCGs are complex and poorly understood. However, a few studies revealed that lincRNAs can influence the expression of their proximal PCGs4,10. We sought to identify sets of co-up- and co-downregulated lincRNA–PCG pairs located proximal to each other under CLso(−)/CT, CLso(+)/CT, and CLso(+)/CLso(−) conditions at 7, 14, and 21 DAI. We did not observe co-differentially expressed lincRNA and proximal PCG pairs at 7 and 14 DAI. However, at least 25 co-up- and 21 co-downregulated lincRNA and proximal PCG pairs were detected under the CLso(+)/CT condition at 21 DAI (Supplementary Data 9). Pearson’s correlation for the set of co-upregulated lincRNA–PCG pairs (r = 0.77, p < 0.01) was higher than that for the set of co-downregulated pairs (r = 0.31). Among the 46 co-regulated lincRNA–PCG pairs, the majority (29) were oriented in opposite directions and were further categorized into convergent (16) and divergent (13) types. The remaining 17 lincRNA–PCG pairs were oriented in the same direction. Many of the co-differentially expressed lincRNAs had proximal PCGs involved in biotic stress responses38,39,40,41,42,43, such as those encoding wound-inducible carboxypeptidase, SAUR, GST, F-box, and chitinases (Supplementary Data 9).
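
Pearson’s correlation between the fold changes of paired lincRNAs and PCGs can be computed as in the following sketch. It is a plain standard-library implementation for illustration; the function name is hypothetical, and the inputs are assumed to be two equal-length lists of log2 fold changes ordered by pair.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

A value near 1 indicates that lincRNAs with larger fold changes tend to have proximal PCGs with proportionally larger fold changes; a value near 0 indicates little linear relationship.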

Further, to examine the co-expression networks among transcripts, we performed weighted gene co-expression network analysis (WGCNA)44 to determine sets of co-expressed transcripts under CLso(−)/CT, CLso(+)/CT, and CLso(+)/CLso(−) conditions at 7, 14, and 21 DAI. At least four distinct modules (turquoise, blue, brown, and gray) representing distinct sets of differentially co-expressed transcripts were detected (Fig. 5a, b). The turquoise module was the largest and comprised 331 transcripts (lincRNAs and PCGs) (Supplementary Data 5). It was significantly associated with the CLso(+) treatment at 21 DAI (r = 0.93, p = 1e−12) (Fig. 5b). Further, we examined the functional relevance of the co-expressed transcripts belonging to the turquoise module by determining the enrichment of gene family and gene ontology (GO) terms using the corresponding PCGs as input. Functional terms related to the acyl lipid metabolism superfamily, APETALA2 (AP2)/ethylene-responsive element binding proteins (EREBP), nucleoside transporter, GST, protein kinase, and small auxin upregulated RNA (SAUR) were significantly enriched (Supplementary Data 10). Previous studies have shown that genes encoding AP2/EREBP, GST, and SAUR play crucial roles in plant defense against pathogen invasion41,42,43. In addition, GO terms involved in cell wall modification (GO:0009827), defense response to bacteria (GO:1900424), ion transport (GO:0006811), DNA binding (GO:0003677), glutathione peroxidase activity (GO:0004602), cell wall (GO:0005618), and plasma membrane (GO:0005886) were significantly enriched (Supplementary Data 10). The results suggest a possible contribution of lincRNAs to different biological processes involved in biotic stress responses.

Fig. 5: Identification and interaction of co-expressed transcripts.
figure 5

a A hierarchical clustering of the transcripts exhibiting a co-expression network based on topological overlap dissimilarity is shown via dendrogram. b Different sets (modules) of co-expressed transcripts in CT, CLso(−), and CLso(+) conditions at 7 DAI, 14 DAI, and 21 DAI are shown via heatmap. Labels in each module indicate Pearson’s correlation (R) and significance level (p value). The scale on the right side indicates Pearson’s correlation (R). c A co-expression network of the topmost 24 hub transcripts belonging to the turquoise module is shown. Light-green triangles indicate lncRNAs specific to CLso. Likewise, pink and blue ovals indicate lncRNAs and PCGs, respectively.

Further, we examined regulatory networks among the transcripts of either lincRNAs or PCGs within the turquoise module to identify hub transcripts. A total of 24 transcripts exhibited the most interactions among themselves, and we considered them hub transcripts. Among them, 11 transcripts belonged to lincRNAs, including seven lincRNAs specific to CLso (Fig. 5c). The remaining 13 transcripts were PCGs, including genes encoding expansin, GST, and SAUR. The role of expansin and SAUR genes was implicated in defense responses against different types of biotic and abiotic stresses43,45. Overexpression of expansin genes led to increased tolerance in response to varying abiotic stresses, such as heat, drought, and salinity45. In plants, the accumulation of reactive oxygen species and programmed cell death processes has been implicated in response to pathogen invasion. Glutathione peroxidase activity encoded by GSTs inhibits the spread of cell death in infected plants46. Likewise, the function of SAUR genes in cell wall biosynthesis has been implicated in biotic stress responses43,47. The results suggest an intricate network between coding and non-coding transcripts preferentially at the later stage (21 DAI) of ZC disease progression.
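
One simple way to rank hub transcripts is by connectivity, i.e., the summed weight of a transcript’s edges within the module. The sketch below illustrates this idea under the assumption that the network is available as a list of weighted edges; the function name and the edge format are hypothetical, and the ranking criterion used in this study may differ.

```python
from collections import defaultdict


def top_hubs(edges, k):
    """Return the k nodes with the highest summed edge weight."""
    connectivity = defaultdict(float)
    for u, v, weight in edges:
        # Each undirected edge contributes to both of its endpoints.
        connectivity[u] += weight
        connectivity[v] += weight
    # Sort by decreasing connectivity, breaking ties by node name.
    return sorted(connectivity, key=lambda node: (-connectivity[node], node))[:k]
```

Calling top_hubs(edges, 24) on the edges of a module would return its 24 most connected transcripts under this criterion.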

Overexpression of lincRNAs confers potato tolerance to CLso

Next, we sought to determine whether lincRNAs can mediate resistance to CLso infection in potatoes. Four lincRNAs were selected based on the following criteria: (1) co-upregulated along with their proximal PCGs, (2) belonging to the hub transcripts, and (3) progressively induced (7 and 21 DAI) during the progression of ZC disease. The respective lincRNAs were cloned in an overexpression vector under the control of a double-enhanced CaMV35S promoter (DE35S-P). The vector also harbored a GFP reporter (Fig. 6a). The constructs were used for Rhizobium rhizogenes (R. rhizogenes)-mediated potato hairy root transformation48. At 28 days post-transformation, transgenic hairy roots were analyzed to confirm the overexpression of the four lincRNAs via RT-qPCR and GFP marker fluorescence (Fig. 6b, c; Supplementary Data 6). The transgenic hairy roots showed significantly higher expression of the lincRNAs than control plants (Fig. 6c; Supplementary Data 6).

Fig. 6: Overexpression of lincRNAs conferred tolerance to Candidatus Liberibacter spp.
figure 6

a Schematic diagram showing the plasmid vector employed for overexpression (OE) of the selected lincRNAs. b Photographs showing the phenotype of the transformed potato plants at 28 d post-transformation and GFP fluorescence. Scale bars indicate 1 cm (left) and 10 µm (right). c The relative expression level of the lincRNAs in CT and OE plants estimated via RT-qPCR is shown in the bar plot. d Likewise, the relative expression level of the proximal PCGs located either up- or downstream of the lincRNAs in CT and OE plants estimated via RT-qPCR is shown in bar plots. e Relative Candidatus Liberibacter solanacearum (CLso) titer in CT and OE plants estimated via qPCR is shown via bar plot. Asterisks represent statistically significant (p value ≤ 0.05) differences compared to control (CT) as estimated by Student’s t-test. The error bars represent the standard error among the three biological replicates.

To examine the influence of lincRNA overexpression on proximal PCGs, we selected three PCGs each from the upstream and downstream regions of each lincRNA. The expression level of these proximal PCGs was compared between the control and overexpression lines via RT-qPCR. We observed a predominantly negative relationship between lincRNA overexpression and the expression of the proximal PCGs. For instance, overexpressing lincRNA.ID.00017498 downregulated all six proximal PCGs compared to their controls. Interestingly, RNAi-mediated knockdown of lincRNA.ID.00017498 had the converse effect on the expression of the proximal PCGs, implicating the lincRNAs in regulating PCG expression (Supplementary Fig. 2; Supplementary Data 7). Likewise, overexpression of lincRNA.ID.00013924 reduced the expression of most proximal PCGs except one located in the upstream region. In contrast, overexpressing lincRNA.ID.00017902 and lincRNA.ID.00013620 resulted in both higher and lower expression of their proximal PCGs (Fig. 6d; Supplementary Data 6). The results suggest that lincRNAs could govern the expression level of their proximal PCGs both negatively and positively. Previous studies have also shown that lincRNAs can influence the expression of their proximal PCGs4,10. The proximal PCGs with altered expression in the lincRNA-overexpressing hairy roots included genes encoding proteinases, ubiquitin-protein ligase components, B2 proteins, and ribosomal proteins. Many of these genes could impact plant biotic stress responses49,50,51. Next, we determined whether overexpression of candidate lincRNAs impacted potato tolerance to CLso. Overexpression of three of the four lincRNAs resulted in significantly lower CLso titers in transgenic hairy roots compared to control roots transformed with an empty vector. The CLso titer was reduced about sixfold in hairy roots overexpressing lincRNA.ID.00017902 and lincRNA.ID.00017498, and by about half in hairy roots overexpressing lincRNA.ID.00013924 (Fig. 6e; Supplementary Data 6). The results suggest that these three lincRNAs could also impact CLso accumulation.

Discussion

Long intergenic RNAs are emerging as crucial players in the transcriptome programming of PCGs during various plant and animal developmental processes and stress responses. However, lncRNAs remain understudied. Although they can be identified from transcript sequences obtained from RNA-seq data, only a few studies have employed strand-specific RNA sequencing to identify natural antisense transcripts (lncNATs)7,8,22. Intriguingly, lncNATs were found for up to 20% of the Arabidopsis PCGs, implicating a crucial role of lncRNAs22. The repertoire of lncRNAs is further expanded by their variable regions of origin, such as lincRNAs and intronic lncRNAs4,7,8, and by polymorphic alternatively spliced variants. Although lncRNAs have received little attention so far, the large number of available RNA-seq datasets can be leveraged to identify lncRNAs and unravel their roles in different biological processes. A few plant lncRNAs have been characterized, and their roles are crucial in plant growth and development. For instance, vernalization-induced expression of the COLDAIR lncRNA triggered flower initiation by repressing the FLC locus through recruitment of polycomb repressive complex 2 (PRC2)52. Another classic example is the lncRNA IPS1, which competes with PHO2 mRNA because it harbors a competing binding site for a miRNA (ath-miR399). As a result, the PHO2 mRNA is rescued from post-transcriptional degradation under low phosphate stimulus in Arabidopsis37. LncRNAs can also govern the spread of heterochromatinization to neighboring genomic/chromatin regions52. Prioritization of candidate lncRNAs is crucial to studying their roles. Therefore, genome-wide identification followed by assessment of dynamic regulation is a prerequisite for understanding lncRNAs. Moreover, the role of lncRNAs in response to pathogens is largely unknown.

Here, we identified potato lincRNAs and studied their role in response to CLso infection. Our analysis identified ~4397 lincRNAs in healthy and infected potatoes at different stages of ZC disease progression. The fidelity of the identified lincRNAs was corroborated by shorter transcript lengths and structural features of the exons and introns33,53. The number of detected lincRNAs in this study was substantially higher than previously reported potato lncRNAs17,21. This could be due to the activation of several lincRNAs that respond primarily to the pathogen (CLso) and the insect vector (psyllid). Therefore, our study expanded the potato lincRNAs repertoire by uncovering novel plant–pathogen/insect-responsive lncRNAs.

Furthermore, by sequence comparisons, we identified six potato lincRNAs whose sequences had complementarity to known potato miRNAs (Supplementary Data 8). Previous reports showed that a few lncRNAs may act as potential decoys of miRNAs, thus influencing the miRNA primary target PCG(s) expression or protein accumulation. Interestingly, a lincRNA (lincRNA.ID.00002569), with a sequence similarity to a potato miRNA (stu-miR8005a), was downregulated in CLso infection and paralleled the expression of a PCG target encoding a Multidrug resistance pump protein (PGSC0003DMT400011415) (Supplementary Fig. 1). Although further biochemical and functional studies are needed to understand the cellular interactions of the lincRNAs, miRNAs, and the target PCGs, our results implicate the role of potato lincRNAs as potential miRNA decoys.

It is well established that host immune responses against pathogen infection can be dynamic and oscillate during disease progression54,55. Recent studies have implicated lincRNAs in regulating plant defense hormone pathways56,57. Here, we analyzed differentially expressed lincRNAs in the nine sets of comparisons described earlier. The majority, approximately 86.5% (671), of the differentially expressed lincRNAs were detected at 21 DAI under the CLso(+) condition [CLso(+)/CT]. Conversely, the fraction of differentially expressed lincRNAs under the psyllid-alone condition at 21 DAI [CLso(−)/CT] was substantially lower (97; 12.5%). This suggests that most of the differentially regulated lincRNAs respond primarily to pathogen infection, most prominently under the CLso stimulus at 21 DAI.

The role of lincRNAs in influencing the expression of PCGs is complex and poorly understood. Growing evidence shows that lincRNAs can govern nearby PCGs by propagating similar chromatin structures locally10,58. Here, we identified several lincRNA-proximal PCGs that appear co-regulated with the lincRNAs (Supplementary Data 9). These PCGs had putative functions in biotic stress response and encoded proteins such as chitinase, GST, F-box, and wound-inducible carboxypeptidase. Likewise, using weighted gene co-expression analysis, we identified another set of PCGs co-expressed with lincRNAs (Fig. 5). The most significant and most highly correlated module was detected under the CLso(+) condition at 21 DAI. These PCGs include EREBP, GST, and SAURs (Supplementary Data 10)41,42,59. These results suggest that the potato lincRNAs may influence the expression of PCGs involved in biotic stress and plant signaling.

The role of selected lincRNAs in potato defenses was evaluated by overexpressing four candidates in transgenic potato hairy roots (Fig. 6a–c). The overexpression of the lincRNAs modulated the expression of their respective neighboring PCGs (Fig. 6d). Interestingly, our analysis indicates that these disease-associated lincRNAs could predominantly act as transcriptional repressors of PCGs. Strikingly, we observed a significant reduction of CLso titer when three of the four lincRNA candidates were overexpressed in plant tissues (Fig. 6e), implicating these lincRNAs in resistance to CLso. These lincRNA–PCG modules are prime candidates for further mechanistic studies to determine their mode of action in disease resistance.

Methods

Plant materials, inoculation with pathogens, and RNA sequencing

Potato (S. tuberosum L. var. Atlantic) plants were grown inside growth chambers at 20 °C and 50% relative humidity with a photoperiod of 14 h light and 10 h dark. Plants about 4–6 weeks old were challenged with psyllids alone [B. cockerelli (Šulc)] [CLso(−)] or with CLso vectored by the psyllid [CLso(+)]. The control (CT) plants were left unchallenged. Young leaf tissues from CT, CLso(−), and CLso(+) potato plants were collected at 7, 14, and 21 DAI in three independent biological replicates. The leaf tissues were immediately frozen in liquid nitrogen and stored at −80 °C until further use. Total RNA was extracted from the leaf samples using the Direct-zol RNA Miniprep kit (Zymo Research, Irvine, CA). The quality of the RNA was estimated by analyzing the absorbance ratio at 260/230, and the quantity was determined using a NanoDrop-1000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA). RNA-seq library preparation and sequencing were performed at the Texas A&M AgriLife Genomics and Bioinformatics Services (College Station, TX). Libraries were prepared using TruSeq Stranded Library Synthesis with Ribo depletion (Illumina, San Diego, CA). Libraries were multiplexed and sequenced using a HiSeq 4000 instrument (Illumina, San Diego, CA) to obtain 150 bp long paired-end reads.

Identification and differential expression analysis of lincRNAs

The raw paired-end reads were processed to remove low-quality reads, adapter sequences, and reads containing excessive uncalled bases60. The filtered high-quality reads were mapped to the S. tuberosum reference genome (DM v4.04)27 using HISAT261,62. The SAM files were converted into sorted BAM files using SAMtools (v1.3.1)63. LincRNAs were identified from the mapped BAM files via modules implemented in Read Mapping-Transcript Assembly-Evolinc29. The high-quality reads that mapped uniquely to the reference genome were used to estimate read counts with featureCounts64. Subsequently, the read count matrices of two samples (conditions) were used as input to identify differentially expressed lincRNAs using DESeq2 (v1.28.1)65. In total, nine sets of differentially expressed lincRNAs were determined for the CLso(−)/CT, CLso(+)/CT, and CLso(+)/CLso(−) comparisons, each at 7, 14, and 21 DAI. Transcripts with a log2 fold change ≥1 or ≤ −1 and a p value <0.05 after Benjamini–Hochberg adjustment were considered differentially expressed. Similarities and differences among the nine sets of differentially expressed lincRNAs were analyzed by PCA using the factoextra (v1.0.7) and FactoMineR (v2.4) packages in R. The log2-transformed differential expression values of the nine comparisons [CLso(−)/CT, CLso(+)/CT, and CLso(+)/CLso(−) at 7, 14, and 21 DAI] served as input for the PCA. To examine common and unique sets of up- and downregulated lincRNAs within each disease stage, we employed the VennDiagram (v1.6.20) package in R. The Nextflow pipeline was run on the Texas A&M High-Performance Research Computing (Grace) cluster with 50 computing nodes (48 cores and 384 GB RAM for each node).
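The differential expression criteria can be illustrated with a minimal Python sketch. It is not the DESeq2 implementation: it only applies a textbook Benjamini–Hochberg adjustment to raw p values and then the fold-change and adjusted p value cutoffs stated above. The transcript names and numbers are hypothetical toy values, and DESeq2's own steps (normalization, dispersion estimation, independent filtering) are deliberately left out.

```python
from typing import NamedTuple


class Result(NamedTuple):
    transcript: str   # hypothetical identifier
    log2fc: float     # log2 fold change for one comparison, e.g. CLso(+)/CT
    pvalue: float     # raw (unadjusted) p value


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def call_differential(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Label transcripts 'up' or 'down' when both cutoffs are met."""
    padj = benjamini_hochberg([r.pvalue for r in results])
    calls = {}
    for r, q in zip(results, padj):
        if q < padj_cutoff and abs(r.log2fc) >= lfc_cutoff:
            calls[r.transcript] = "up" if r.log2fc > 0 else "down"
    return calls


toy = [
    Result("lincRNA_A", 2.3, 0.0004),
    Result("lincRNA_B", -1.6, 0.004),
    Result("lincRNA_C", 0.4, 0.001),   # significant but below the fold-change cutoff
    Result("lincRNA_D", 1.2, 0.30),    # large change but not significant
]
calls = call_differential(toy)
print(calls)  # {'lincRNA_A': 'up', 'lincRNA_B': 'down'}
```

Note that a transcript must satisfy both conditions: an absolute log2 fold change of at least 1 and an adjusted p value below 0.05.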

Analysis of lincRNAs harboring binding sites of microRNAs (miRNAs)

The lincRNAs that act as potential miRNA decoys (miRNA target mimics) were predicted using the psMimic standalone tool36. The parameters used for miRNA decoy prediction allowed no more than four mismatches, with two mismatches allowed between the second and eighth positions of the miRNA. The putative targets of the identified miRNAs were predicted using psRNATarget36 with potato transcriptome sequences (Phytozome 13, 448_v4.03). We selected the top transcript targets based on the expectation (E) value (<3). Subsequently, we examined the expression levels of those lincRNAs and PCGs in CT, CLso(−), and CLso(+) conditions at 21 DAI via RT-qPCR; the primer sequences are provided in Supplementary Table 2.
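As a rough illustration of the mismatch criteria, the following Python sketch counts mismatches in an ungapped miRNA:site pairing and applies the two limits. It is not psMimic: it assumes the limits mean at most four mismatches overall and at most two within miRNA positions 2–8, treats only Watson–Crick pairs as matches (no G:U wobble), and ignores the central bulge typical of target mimics. The function names are hypothetical, and the sequence is used only as a toy input.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mismatch_positions(mirna, site_3to5):
    """1-based miRNA positions that do not form a Watson-Crick pair.

    `site_3to5` is the candidate site written 3'->5' so that it pairs
    position by position with the miRNA written 5'->3'.
    """
    if len(mirna) != len(site_3to5):
        raise ValueError("sequences must have equal length (ungapped pairing)")
    return [i for i, (m, s) in enumerate(zip(mirna, site_3to5), start=1)
            if COMPLEMENT.get(m) != s]


def passes_decoy_filter(mirna, site_3to5, max_total=4, max_pos_2_to_8=2):
    """Apply the assumed limits: total mismatches and mismatches at positions 2-8."""
    mm = mismatch_positions(mirna, site_3to5)
    in_2_to_8 = [p for p in mm if 2 <= p <= 8]
    return len(mm) <= max_total and len(in_2_to_8) <= max_pos_2_to_8


def with_mismatches(mirna, positions):
    """Build a toy site that pairs perfectly except at the given positions."""
    site = [COMPLEMENT[b] for b in mirna]
    for p in positions:
        site[p - 1] = mirna[p - 1]  # a base never pairs with itself
    return "".join(site)


mirna = "UGACAGAAGAGAGUGAGCAC"  # toy 20-nt miRNA sequence
print(passes_decoy_filter(mirna, with_mismatches(mirna, [3, 5, 15])))  # True
print(passes_decoy_filter(mirna, with_mismatches(mirna, [2, 4, 6])))   # False
```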

Differential co-expression network analysis

To identify co-up and co-downregulated lincRNA and PCG pairs under the CLso(−)/CT, CLso(+)/CT, and CLso(+)/CLso(−) comparisons at 7, 14, and 21 DAI, WGCNA was performed44. The log2 fold change values of the nine comparisons were used as input, with a soft-thresholding power of 5 and a minimum module size of 30 under the dynamic tree cut method. Distinct sets/modules of co-up and co-downregulated lincRNA and PCG pairs were visualized using plotDendroAndColors. Moreover, the top 24 hub transcripts (either lincRNAs or PCGs), ranked by the number of interactions within the turquoise module, were identified and displayed in Cytoscape (v3.8.2)66. Further, we examined the functional relevance of the co-up and co-downregulated PCGs via enrichment analysis of GO terms and gene families using the GenFam tool (Supplementary Data 10)67,68.
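The effect of the soft-thresholding power can be sketched in a few lines of Python. The sketch assumes an unsigned network, in which the adjacency between two transcripts is the absolute Pearson correlation of their profiles raised to the power 5; the text does not state the network type, so this is an assumption. Connectivity (the sum of a transcript's adjacencies) is used here as a simple stand-in for hub ranking. The profiles are hypothetical toy values across nine comparisons, and module detection by dynamic tree cut is not reproduced.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def adjacency(profiles, power=5):
    """Unsigned soft-threshold adjacency: |cor| ** power for every pair."""
    names = list(profiles)
    return {a: {b: abs(pearson(profiles[a], profiles[b])) ** power
                for b in names if b != a}
            for a in names}


def connectivity(adj):
    """Sum of adjacencies per transcript; high values suggest hubs."""
    return {name: sum(row.values()) for name, row in adj.items()}


# Toy log2 fold-change profiles across nine comparisons (hypothetical values).
profiles = {
    "lincRNA_1": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "PCG_1": [2, 4, 6, 8, 10, 12, 14, 16, 18],   # co-upregulated with lincRNA_1
    "PCG_2": [9, 8, 7, 6, 5, 4, 3, 2, 1],        # inversely regulated
    "PCG_3": [1, -1, 1, -1, 0, 1, -1, 1, -1],    # essentially unrelated
}
adj = adjacency(profiles)
k = connectivity(adj)
for name in sorted(k, key=k.get, reverse=True):
    print(name, round(k[name], 3))
```

Raising correlations to a power above 1 suppresses weak correlations far more than strong ones, which is why the weakly correlated PCG_3 ends up with near-zero connectivity.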

Overexpression and knockdown of lincRNAs in potato hairy roots

For overexpression, the four selected lincRNAs were commercially synthesized and cloned at the StuI and KpnI restriction sites of a binary vector containing the GFP reporter gene (pBIN-mGFP) under the control of a double-enhanced CaMV35S promoter (DE35S-P) and 35S terminator (35S-T). The constructs were subsequently used for R. rhizogenes-mediated potato hairy root transformation48. For RNAi-mediated knockdown of lincRNAs, a hairpin RNA containing a PDK (pyruvate orthophosphate dikinase) intron flanked by the sense and antisense lincRNA sequences was commercially synthesized and cloned at the StuI and KpnI restriction sites of the pBIN-mGFP binary vector. Briefly, CLso-infected potato shoots were transformed with the lincRNA overexpression construct, the RNAi construct, or the empty vector (GFP alone) using R. rhizogenes (ATCC 43056). Newly formed transgenic hairy roots on the composite plants were verified by GFP fluorescence microscopy at ~28 days post-transformation. RT-qPCR confirmed the overexpression or knockdown of the lincRNAs in the transformed hairy roots. Total RNA was extracted from the transgenic hairy roots (three biological replicates) using the Direct-zol RNA Miniprep Plus kit (Zymo Research, Irvine, CA). First-strand cDNAs were synthesized from 1.0 µg of total RNA using the Superscript IV First-Strand Synthesis System (Thermo Fisher Scientific, Waltham, MA), followed by qPCR analysis. Further, the expression of the six nearest PCGs (three upstream and three downstream of each lincRNA) was analyzed by RT-qPCR to determine their expression changes.
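Selecting the six nearest PCGs can be sketched as a simple coordinate lookup. The Python example below is an illustration under stated assumptions rather than the procedure actually used: "upstream" and "downstream" are taken in genomic coordinates (strand is ignored), PCGs overlapping the lincRNA are skipped, and all names and coordinates are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


def flanking_pcgs(linc, pcgs, k=3):
    """Return the k nearest non-overlapping PCGs on each side of a lincRNA."""
    same_chrom = [g for g in pcgs if g.chrom == linc.chrom]
    upstream = sorted((g for g in same_chrom if g.end < linc.start),
                      key=lambda g: linc.start - g.end)[:k]
    downstream = sorted((g for g in same_chrom if g.start > linc.end),
                        key=lambda g: g.start - linc.end)[:k]
    return upstream, downstream


# Hypothetical coordinates for illustration only.
linc = Feature("lincRNA_X", "chr01", 50_000, 52_000)
pcgs = [
    Feature("PCG_a", "chr01", 10_000, 12_000),
    Feature("PCG_b", "chr01", 20_000, 23_000),
    Feature("PCG_c", "chr01", 30_000, 34_000),
    Feature("PCG_d", "chr01", 45_000, 48_000),
    Feature("PCG_e", "chr01", 55_000, 57_000),
    Feature("PCG_f", "chr01", 60_000, 66_000),
    Feature("PCG_g", "chr02", 51_000, 53_000),  # other chromosome, ignored
]
up, down = flanking_pcgs(linc, pcgs)
print([g.name for g in up])    # ['PCG_d', 'PCG_c', 'PCG_b']
print([g.name for g in down])  # ['PCG_e', 'PCG_f']
```

When fewer than three PCGs exist on one side, as on the downstream side of this toy example, the function simply returns those that are available.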

RT-qPCR and qPCR analysis

To examine the expression level of lincRNAs and PCGs, we employed RT-qPCR. Primers were designed using NCBI and Primer3web (v4.1.0)69 (Supplementary Tables 3, 4). Relative CLso titer was estimated at ~28 days post-transformation using qPCR. All qPCR and RT-qPCR reactions were performed using iTaq SYBR Green Supermix (Bio-Rad Laboratories, Hercules, CA) with three biological and two technical replicates. The potato housekeeping gene RPL2 was used as an internal reference. Relative expression and bacterial titers were analyzed using the ∆∆Ct method.
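For readers unfamiliar with the ∆∆Ct calculation, a minimal Python sketch follows. It assumes the standard form of the method, in which relative quantity equals 2 raised to −∆∆Ct, which presumes roughly 100% amplification efficiency for both target and reference. The Ct values are hypothetical, and the function name is illustrative.

```python
from statistics import mean


def relative_quantity(target_treated, ref_treated, target_control, ref_control):
    """Relative quantity by the delta-delta-Ct method (2 ** -ddCt).

    Each argument is a list of Ct values; the reference would be RPL2 here.
    """
    dct_treated = mean(target_treated) - mean(ref_treated)
    dct_control = mean(target_control) - mean(ref_control)
    ddct = dct_treated - dct_control
    return 2 ** (-ddct)


# Hypothetical Ct values from three biological replicates.
fold = relative_quantity(
    target_treated=[22.0, 22.5, 21.5],   # mean 22.0
    ref_treated=[18.0, 18.0, 18.0],      # mean 18.0 -> dCt = 4.0
    target_control=[24.0, 24.5, 23.5],   # mean 24.0
    ref_control=[18.0, 18.0, 18.0],      # mean 18.0 -> dCt = 6.0
)
print(fold)  # 4.0
```

A lower Ct means more template, so a ∆∆Ct of −2 corresponds to a fourfold higher relative level in the treated sample. The same arithmetic applies whether the target is a transcript (RT-qPCR) or bacterial DNA used to estimate titer (qPCR).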

Statistics and reproducibility

The statistical tests used to assess the significance of differences are described in the “Results”, “Methods”, and figure legends. The RNA-seq data from nine different samples under CT, CLso(−), and CLso(+) conditions at 7, 14, and 21 DAI, each in three independent biological replicates, were processed to identify lincRNAs. The differential expression of the lincRNAs between two samples/conditions was detected using default algorithms implemented in DESeq2 (v1.28.1). Furthermore, distinct modules of co-expressed transcripts were identified with default algorithms implemented in WGCNA. The significance of differences in the expression levels of the lincRNAs and their proximal PCGs between the wild type and mutant plants (overexpression/RNAi) was assessed using Student’s t-test.

Reporting summary

Information on the research design and ethics is available in the Nature Research Reporting Summary linked to this article.