Background

Rising temperatures are a worldwide concern. Unlike humans, plants are sessile and cannot escape unfavorable environmental conditions that disrupt growth and development. High temperature impairs photosynthetic activity, inhibits cell division and growth, reduces crop yield, threatens food security, and ultimately endangers human life [1]. When exposed to extreme heat, plants undergo a series of morpho-anatomical and physicochemical changes that raise cellular levels of reactive oxygen species (ROS) such as H2O2 and of oxidation products such as malondialdehyde (MDA), exposing them to oxidative damage [2].

To withstand the continual effects of stress, plants engage natural cellular and molecular defense processes. The molecular mechanisms of heat stress responses in plants are well understood. Many genes linked to heat stress tolerance have been identified and classified, including those encoding RNA helicases, heat shock proteins (Hsps), and ring finger proteins [3,4,5]. Non-coding RNAs, including microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), are also thought to influence heat stress responses [2, 6]. Furthermore, regulatory proteins, including CDPKs [7], wall-associated receptor-like protein kinases (RLKs) [8], and mitogen-activated protein kinases, are crucial for controlling signal transduction under heat stress.

Plant adaptation to environmental stress depends on stimulus perception and subsequent signal transduction, which trigger diverse molecular, biochemical, and physiological changes. During stress, plant cells employ the calcium-dependent protein kinase (CDPK) sensor family to launch rapid signal transduction through phosphorylation cascades [9]. Several major families of calcium-binding proteins have been identified in plants, including calmodulin (CaM), CaM-like proteins (CML), calcineurin B-like proteins (CBL), and CDPKs [9, 10]. Among these, the CDPK family is the largest calcium-sensing family and is found only in plants, protozoa, oomycetes, and green algae [11]. CDPKs are encoded by a large gene family that is important for plant growth and development and for resistance to biotic and abiotic stressors [9, 12]. For instance, CDPKs are highly responsive to cold stress in rice seedling stem tissues, are involved in light signaling, and positively regulate drought and salt tolerance [13, 14]. With regard to high-temperature tolerance, Arabidopsis AtCPK3, which is Ca2+/CaM-dependent, activates HSF to trigger the expression of downstream HSPs that enhance heat tolerance [15]. ZmCDPK7 is a heat-responsive protein kinase involved in ABA signaling and heat stress tolerance in maize [7], VaCPK29 is involved in grapevine responses to heat stress [16], and GhCDPK60 positively regulates drought tolerance in both transgenic Arabidopsis and cotton by controlling proline and ROS levels [17].

As a cash crop of global importance, cotton (Gossypium hirsutum L.) is crucial for national economic development [18]. Cotton is a warm-season crop that grows best at 28 to 35 °C. However, temperatures above 35 °C harm cotton growth, impairing anther dehiscence, reducing pollen activity, and hampering reproduction [4]. China, India, Pakistan, and Uzbekistan are the top cotton-producing nations, accounting for 80% of the world’s cotton output [19, 20]. Nevertheless, regions with intensive cotton farming often experience temperatures exceeding 40 °C, leading to significant yield losses per unit area [19]. Screening and breeding cotton for heat tolerance is therefore crucial.

Previous research has identified CDPK genes in the cotton genome and described their roles in fiber development and abiotic stress tolerance [17, 21]. Although the role of the CDPK family in abiotic stress tolerance is well documented, its characterization and functional significance under heat stress remain unclear. To address this gap, we characterized the CDPK family in the cotton genome using bioinformatics approaches, including phylogenetic analysis, protein-protein interaction analysis, chromosomal distribution, exon-intron structure, collinearity analysis, and identification of cis-regulatory elements and transcription factor (TF) binding sites. Of the 48 GhCDPK genes identified, GhCDPK16 was found to be associated with heat stress tolerance and was selected for further investigation. Overexpression of GhCDPK16 in Arabidopsis was used to examine its function further. The study provides new information about cotton’s ability to withstand heat stress and indicates that GhCDPK16 could be a promising candidate for genetic improvement.

Methods

Databases and identification of CDPK gene families

The upland cotton genome database was obtained from http://mascotton.njau.edu.cn [22], and the Arabidopsis genome database from http://www.arabidopsis.org/. Using DNATOOLS software, a local database of the complete genome sequences of Arabidopsis thaliana, Gossypium hirsutum, Oryza sativa, and Zea mays was established. CDPK family genes were screened in the local database using TblastN (E-value cutoff = 0.001) with the A. thaliana CDPK kinase domain (PF00069.16) as the query. All retrieved sequences were then subjected to a domain search in the Pfam (http://pfam.xfam.org/) and NCBI (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) databases to confirm the CDPK signature domains (Pkinase, EF-hand). The confirmed CDPK sequences were subjected to multiple sequence alignment using the ClustalW tool in MEGA 6.0 [23].
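The two-step screen described above (a TblastN hit passing the E-value cutoff, followed by confirmation of both signature domains) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the record layout, the gene identifiers, and the function name `screen_cdpk_candidates` are hypothetical, and only the cutoff (0.001) and the domain names (Pkinase, EF-hand) come from the text.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 0.001                      # cutoff stated in the Methods
REQUIRED_DOMAINS = {"Pkinase", "EF-hand"}   # CDPK signature domains


@dataclass
class Candidate:
    gene_id: str         # hypothetical identifier
    e_value: float       # best TblastN E-value against the kinase domain
    domains: frozenset   # domain names reported by the Pfam / CDD search


def screen_cdpk_candidates(candidates):
    """Keep hits that pass the E-value cutoff and carry both signature domains."""
    kept = []
    for c in candidates:
        passes_blast = c.e_value <= E_VALUE_CUTOFF
        has_domains = REQUIRED_DOMAINS <= c.domains  # subset test
        if passes_blast and has_domains:
            kept.append(c.gene_id)
    return kept


# Made-up records, for illustration only.
example = [
    Candidate("gene_A", 1e-50, frozenset({"Pkinase", "EF-hand"})),
    Candidate("gene_B", 1e-40, frozenset({"Pkinase"})),           # lacks EF-hand
    Candidate("gene_C", 0.5, frozenset({"Pkinase", "EF-hand"})),  # weak hit
]
print(screen_cdpk_candidates(example))
```

Requiring both domains is what separates CDPKs from the many other protein kinases that also match the Pkinase profile.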

Physicochemical properties and molecular structure

The ExPASy (http://www.expasy.org/) online tool was used to determine the amino acid sequence length, isoelectric point (pI), and molecular weight (MW) of each protein, together with the number of exons. N-terminal myristoylation was predicted using the Myristoylator tool (https://web.expasy.org/myristoylator/) [24]. Subcellular localization was predicted using Cell-PLoc 2.0 (http://www.csbio.sjtu.edu.cn/bioinf/Cell-PLoc-2/).

Cis-acting elements of CDPK promoter

To predict cis-acting elements in the promoter regions of G. hirsutum CDPK genes, the 2000 bp sequence upstream of each coding sequence start site was analyzed using the PlantCARE online tool (http://bioinformatics.psb.ugent.be/webtools/plantcare/html).
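Extracting the 2000 bp upstream region depends on gene orientation, which the text does not spell out. The sketch below shows one way to do it, assuming 1-based inclusive coordinates given on the forward strand; the function names and the toy sequence are hypothetical and are not part of the original workflow.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T/N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_region(chrom_seq, cds_start, cds_end, strand, length=2000):
    """Return up to `length` bp upstream of the CDS start codon.

    Coordinates are assumed to be 1-based and inclusive, with
    cds_start < cds_end on the forward strand whatever the gene orientation.
    """
    if strand == "+":
        begin = max(0, cds_start - 1 - length)
        return chrom_seq[begin:cds_start - 1]
    if strand == "-":
        # For a minus-strand gene, "upstream" lies after cds_end on the
        # forward strand and must be reverse-complemented.
        region = chrom_seq[cds_end:cds_end + length]
        return reverse_complement(region)
    raise ValueError("strand must be '+' or '-'")


# Toy chromosome with a 4 bp "CDS" at positions 5-8.
toy = "AAAACCCCGGGGTTTT"
print(upstream_region(toy, 5, 8, "+", length=3))  # AAA
print(upstream_region(toy, 5, 8, "-", length=3))  # CCC
```

Genes close to a chromosome end yield fewer than 2000 bp, so the returned length should be checked before submitting sequences to a promoter analysis tool.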

Phylogenetic, gene structure, motif pattern and gene duplication of CDPKs

To construct the phylogenetic tree, 152 CDPK protein sequences from four plant species were used: G. hirsutum (48), A. thaliana (34), O. sativa (30), and Z. mays (40). Multiple sequence alignment was performed with the ClustalW 1.8.1 tool in MEGA 6.0. The phylogenetic tree was constructed using the neighbor-joining method with the p-distance, Jones-Taylor-Thornton (JTT), and Poisson models. Bootstrapping with 1000 replicates was used to assess the reliability of the tree [9]. The exon-intron structure of the G. hirsutum CDPK family was obtained using the GSDS online tool (http://gss.cbi.pku.edu.cn/). Based on the protein sequences, the MEME online tool (http://meme.sdsc.edu/) was used to analyze the motif patterns of the G. hirsutum CDPK proteins [25]. Synonymous (Ks) and nonsynonymous (Ka) substitution rates were extracted from the Plant Genome Duplication Database (PGDD: http://chibba.agtec.uga.edu/duplication/), and Ka/Ks ratios were computed using KaKs_Calculator 2.0 in TBtools as previously described [24].
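Of the distance measures named above, the p-distance is the simplest: the proportion of aligned sites at which two sequences differ. The sketch below illustrates the idea and is not MEGA's implementation; skipping any site with a gap in either sequence (pairwise deletion) is an assumption, and the example peptides are made up.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap in either sequence are ignored (pairwise deletion,
    an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# One mismatch over five aligned residues.
print(p_distance("MKVLA", "MKILA"))  # 0.2
```

A neighbor-joining tree is then built from the matrix of such pairwise distances across all 152 sequences.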

Collinearity, synteny and chromosome localization

The duplication and intraspecific collinearity of GhCDPK family members were analyzed using the One Step MCScanX-Super Fast function in TBtools [26]. The evolutionary relationships between pairs of CDPK genes were visualized using the Advanced Circos function in TBtools. Furthermore, chromosomal locations were mapped using MapInspect software (https://mapinspect.software.informer.com), and collinearity between G. hirsutum and three other species (A. thaliana, O. sativa, and Z. mays) was established.

Transcription factor binding site analysis of the GhCDPK family

Putative transcription factor binding sites (TFBDS) of the GhCDPK family were identified using the G. hirsutum genome database. The predicted TF binding sites in the target genes were analyzed using the Plant Transcriptional Regulatory Map website (http://plantregmap.gao-lab.org/) [27].

Functional network interaction

The online STRING 10 tool (https://string-db.org/) was used to predict the interaction network between GhCDPK16 and its interacting proteins in A. thaliana and G. hirsutum based on text mining, databases, co-expression, neighborhood, gene fusion, co-occurrence, and experimental evidence [28].

Plant materials and heat treatment

The heat-sensitive upland cotton cultivar “FuZi mian” (18YZ589) was used in this study. Seedlings were cultivated in a greenhouse under optimal growth conditions of 24 °C with 16 h of light and 8 h of darkness. At the two-true-leaf stage, seedlings of similar growth were selected and subjected to high-temperature stress at 42 °C. Samples were collected at 0 h (no stress) and after 1, 2, 4, 8, and 12 h of heat stress. A. thaliana (Columbia) was grown in a conditioned greenhouse with a 16-h/8-h photoperiod at 22 °C and a relative humidity of 60%. Seeds of both wild-type and transgenic GhCDPK16-ox Arabidopsis lines were sown on MS plates to observe their response to heat stress. The seeds were treated at 45 °C for 1 h and then allowed to recover for 5 days before observation. Next, 7-day-old seedlings of the GhCDPK16-ox lines and the wild type grown on MS plates were treated at 45 °C for 2 h and allowed to recover for 7 days, after which they were observed and the survival rate was analyzed. Finally, 4-week-old GhCDPK16-ox lines and wild-type plants grown in soil were subjected to 45 °C for 24 h, and phenological changes were observed to study the function of the gene in response to heat stress.

SgRNA design, pRGEB32-GhCDPK16-Cas9 construction and Agrobacterium-mediated transient expression in cotton

The nucleic acid sequence of GhCDPK16 was searched for target sites adjacent to an NGG motif within the CDS. The sgRNA primers for GhCDPK16 were designed using (http://cbi.hzau.edu.cn/crispr), with the forward primer 5’-TTCTATGGAGAAACTGAGCA-3’ and the reverse primer 5’-CTAAAGGTCATTTTTCAGGAG-3’. Using a one-step cloning method, the CRISPR-Cas9 knockout vector pRGEB32 [29] was digested with the restriction endonuclease BsaI, and the long fragment containing tRNA + target + gDNA was ligated into the linearized vector. The construct was introduced into DH5α competent cells; positive clones were selected and sequenced, and the gene-editing expression vector pRGEB32-GhCDPK16-Cas9 was obtained.

The vectors (pCAMBIA1305-GhXTH16 and pRGEB32-GhCDPK16-Cas9) were transformed into Agrobacterium GV3101 and cultured at 28 °C in 200 µl Luria-Bertani (LB) liquid medium with an appropriate amount of rifampicin and kanamycin until OD600 = 1.5, and the bacteria were collected by centrifugation at 10,000 r/min. The pellets were then resuspended in a transformation mixture containing 100 µmol/L acetosyringone (AS) and 10 mmol/L MgSO4 to an OD600 of 0.6. Cotton seedlings were selected, sterilized with 75% ethanol, and washed with distilled water to remove residual alcohol. Subsequently, needleless syringes were used to infiltrate the leaf blades, and the injection sites were then cleaned with a cotton swab dipped in distilled water. Finally, the injected cotton plants were cultured at 25 °C with a 16 h/8 h light/dark period for 2–3 days. Each treatment consisted of at least 5 seedlings. Successfully transformed plants were validated by qRT-PCR and RT-PCR analysis.

Vector construction and Arabidopsis genetic transformation

Vector construction and genetic transformation were carried out as previously described [30] with minor modifications. Briefly, the coding sequence of GhCDPK16 was amplified using KOD One TM PCR Master Mix (Toyobo, Tokyo, Japan) with gene-specific forward and reverse primers (Table S3) and ligated into the pCAMBIA1305 vector. Next, the expression vector pCAMBIA1305-GhCDPK16 was introduced into Agrobacterium tumefaciens strain EHA105. Arabidopsis ecotype Col-0 was transformed by the floral dip method. The harvested T0 seeds were selected on 1/2 Murashige and Skoog (MS) medium with 50 mg/L kanamycin, and resistant plants were further validated by PCR. Single-copy lines with a 3:1 segregation ratio were selected and propagated to the T3 generation. The transgenic GhCDPK16-ox Arabidopsis lines were grown in a growth chamber under a 16-h light/8-h dark cycle at approximately 22 °C.
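The text does not say how the 3:1 segregation ratio was assessed. A chi-square goodness-of-fit test is one common choice, sketched here under that assumption; the function names and the seedling counts are hypothetical, and 3.841 is the standard critical value for one degree of freedom at P = 0.05.

```python
CRITICAL_CHI2_1DF_P05 = 3.841  # chi-square critical value, 1 d.f., P = 0.05


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for a 3:1 resistant:sensitive expectation."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no seedlings counted")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def consistent_with_single_locus(resistant, sensitive):
    """True if the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_CHI2_1DF_P05


# Made-up counts of kanamycin-resistant and kanamycin-sensitive seedlings.
print(consistent_with_single_locus(75, 25))   # True: exactly 3:1
print(consistent_with_single_locus(150, 10))  # False: far from 3:1
```

A line that fits 3:1 is consistent with a single insertion locus, although it does not by itself prove a single T-DNA copy.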

RNA extraction and real-time quantitative PCR

Total RNA was isolated from 0.1 g of fresh sample using the RNAprep Pure Plant Plus Kit (polysaccharides and polyphenolics-rich) according to the manufacturer’s instructions. First-strand cDNA was synthesized from 1 µg of total RNA using the StarScript III All-in-one RT Mix with gDNA Remover. qRT-PCR analysis was then performed with three biological replicates as previously described [20]. The GhUBQ gene was used as the reference, and relative expression levels were calculated using the 2^(−ΔΔCt) method. The primer sequences used are shown in Table S2.
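For readers unfamiliar with the 2^(−ΔΔCt) calculation, a minimal sketch follows. The Ct values are invented for illustration, and the function name is hypothetical; in this study the reference gene would be GhUBQ and the control would be the 0 h sample.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference), computed per condition
    ddCt = dCt(sample) - dCt(control)
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Invented Ct values: the target crosses threshold 3 cycles earlier
# in the treated sample, with an unchanged reference gene.
print(relative_expression(22.0, 18.0, 25.0, 18.0))  # 8.0
```

The method assumes roughly equal, near-100% amplification efficiency for the target and reference primer pairs.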

H2O2 and MDA determination and histochemical staining

H2O2 content was measured according to the kit instructions, as described previously [31]. For MDA determination, 0.5 g of leaf tissue was weighed and ground on ice in 2 mL of phosphate buffer solution (PBS). The homogenate was centrifuged at 8000 g for 10 min at 4 °C. The supernatant was collected, mixed with 3 mL of 5% thiobarbituric acid, boiled in a water bath for 10 min, cooled at 4 °C for 2 to 5 min, and centrifuged again to collect the supernatant. With water as the blank, absorbance was measured at 532 nm and 600 nm. For histochemical staining, randomly chosen leaves were submerged in diaminobenzidine (DAB) or nitro-blue tetrazolium (NBT) solution. Chlorophyll was removed with 95% ethanol, and the leaves were then soaked in 70% glycerin for microscopic observation and photography. The entire procedure was repeated three times, and staining intensity was quantified using ImageJ software [32].
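The text does not state how the two absorbance readings were converted to MDA content. One widely used conversion, assumed here purely for illustration, subtracts the 600 nm reading as a non-specific turbidity correction and applies the extinction coefficient of the MDA-thiobarbituric acid adduct at 532 nm (155 mM^-1 cm^-1). The function name and the absorbance values are hypothetical.

```python
# Extinction coefficient of the MDA-TBA adduct at 532 nm, in mM^-1 cm^-1.
# Using it here is an assumption; the source does not give its formula.
MDA_TBA_EXTINCTION_MM = 155.0


def mda_concentration_um(a532, a600, path_length_cm=1.0):
    """MDA concentration (micromol/L) in the measured solution.

    A600 is subtracted from A532 to correct for non-specific turbidity,
    then the Beer-Lambert law converts absorbance to concentration.
    """
    corrected = a532 - a600
    concentration_mm = corrected / (MDA_TBA_EXTINCTION_MM * path_length_cm)
    return concentration_mm * 1000.0  # mM -> micromol/L


# Invented absorbance readings.
print(round(mda_concentration_um(0.465, 0.155), 3))  # 2.0
```

Expressing the result per gram of fresh weight additionally requires the reaction volume, any dilution factor, and the 0.5 g sample mass.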

Determination of antioxidant enzyme parameters

To measure antioxidant enzyme activity, 0.1 g of leaf tissue was ground in 1 mL of ice-cold enzyme buffer. The homogenate was centrifuged at 12,000 g for 10 min at 4 °C, and the supernatant was used for the enzyme activity assays. APX activity was determined by measuring the decrease in absorbance at 290 nm. CAT was analyzed using the ultraviolet absorption method, while the nitroblue tetrazolium (NBT) and guaiacol methods were used to analyze SOD and POD, respectively [33, 34]. For the ascorbate and glutathione assays, about 0.1 g of leaf tissue was ground to a fine powder in liquid nitrogen and extracted with 1 mL of 0.2 M HCl. After centrifugation at 12,000 g for 10 min at 4 °C, 0.5 mL of supernatant was mixed with 100 µL of 0.2 M phosphate buffer (pH 5.6) and neutralized with 0.2 M NaOH to a pH of 4–5. Spectrophotometric assays were then used to measure ascorbic acid (AsA), dehydroascorbic acid (DHA), the AsA/DHA ratio, glutathione (GSH), glutathione disulfide (GSSG), and the GSH/GSSG ratio, following previously established methods [35].

Statistical analyses

Statistical analysis was performed using Excel 2010 (Microsoft Office 2010, Microsoft Corp., USA). Graphs were drawn using GraphPad Prism 5.0 (GraphPad Software, Inc., USA). Tukey’s post-hoc test was used to determine which means differed significantly (*P < 0.05 to ***P < 0.01). RT-qPCR data were normalized to a normal distribution as previously described [36].

Results

Identification, characteristics and structural analysis of GhCDPK gene family

A total of 48 GhCDPKs were identified, named GhCDPK1 to GhCDPK48, and grouped into three main clusters (Fig. S1A). All 48 putative CDPK genes were characterized using ExPASy. The analysis covered amino acid number, pI, molecular weight, number of exons, subcellular location, and number of EF-hands (Table 1). The predicted GhCDPK proteins ranged from 487 to 655 amino acids in length, with molecular weights (MW) from 54.63 kD to 73.42 kD and isoelectric points (pI) from 5.06 to 6.96 (Table 1). Most GhCDPK proteins contain 3–4 EF-hand motifs, whereas GhCDPK33, GhCDPK40, GhCDPK41, and GhCDPK42 contain only 2 (Table 1). Subsequent analysis revealed that 14 members possess potential N-myristoylation sites, including GhCDPK16, 17, 18, 20, 21, 22, 24, 35, 37, 39, 43, 44, and 48, while the remaining GhCDPK proteins were predicted to be non-myristoylated (Table 1). Based on subcellular localization prediction, all 48 GhCDPK proteins were predicted to localize to the nucleus (Table 1). MEME analysis revealed 15 conserved motifs, numbered in ascending order of E-value (Fig. 1B). Motifs 1–12 were present in all GhCDPK proteins except GhCDPK44. Motif 13 was distributed in subfamilies II and III and partially in subfamily I. Motif 14 was unique to subfamily I, whereas motif 15 was present in subfamilies I and II but not in subfamily III (Fig. 2B). These findings show that the GhCDPK genes were both specific and conserved throughout evolution: the gene structures and conserved motifs indicate relative conservation of the family together with the diversity needed to adapt to the environment. The GhCDPK exon-intron structures showed that intron numbers ranged from 6 to 7, while exon numbers ranged from 7 to 9 (Fig. 1C).

Table 1 Characteristics of calcium-dependent protein kinases (CDPKs) in cotton
Fig. 1

Phylogenetic tree of CDPK proteins in G. hirsutum, A. thaliana, Z. mays and O. sativa. The phylogenetic tree was constructed in MEGA 6.0 using the neighbor-joining method with 1000 bootstrap replicates. Species abbreviations are as follows: Gh: cotton; At: Arabidopsis; Os: rice; Zm: maize

Fig. 2

Chromosomal localization, gene duplication, and collinearity analysis of the GhCDPKs. A Locations of GhCDPK genes on the cotton genome. B Segmental duplication analysis of the CDPK gene family in G. hirsutum. The red lines indicate duplicated CDPK gene pairs. C Collinearity analysis of the CDPK gene family in cotton, Arabidopsis, and Zea mays

Phylogenetic relationships and divergence of cotton CDPK proteins

The conserved protein sequences of the GhCDPK, AtCDPK, ZmCDPK, and OsCDPK genes were used to build the phylogenetic tree (Fig. 1). Following the classification of Arabidopsis CDPKs, four distinct groups were identified. The GhCDPK gene family was distributed in groups I, II, and III, with 24, 18, and 6 genes, respectively, whereas group IV contained no GhCDPK gene (Fig. 1). Group I contained 10, 15, and 11 CDPKs from A. thaliana, Z. mays, and O. sativa, respectively; group II contained 13, 11, and 8; group III contained a similar number from each species; and group IV contained 3, 6, and 3 (Fig. 1).

Chromosomal localization and gene-duplication analysis of GhCDPKs

To understand the evolution of the GhCDPK family, the chromosomal localization of CDPK genes was analyzed using MapInspect software. GhCDPK genes were distributed on eighteen chromosomes, namely At2, At3, At4, At6, At8, At9, At11, At12, At13, Dt13, Dt1, Dt2, Dt5, Dt6, Dt8, Dt9, Dt11, and Dt12 (Fig. 2A). Most GhCDPK genes were located on the chromosome arms, with a few, including GhCDPK4, GhCDPK7, GhCDPK5, and GhCDPK43, located in the heterochromatic regions around the centromeric repeats. Chromosome Dt1 contained the largest number of CDPK genes (5), followed by At6, Dt6, and Dt9 (4 each) and At8, Dt12, and Dt13 (3 each). Two genes were located on each of At3, At9, At13, Dt8, and Dt5, while a single gene was found on each of At2, At4, At11, Dt2, and Dt11 (Fig. 2A). To examine the evolutionary relationships among GhCDPK members, the Ks, Ka, and Ka/Ks values for each duplication event were calculated (Table S1). The Ka values ranged from 0.002 to 3.366, while the Ks values ranged from 0.026 to 3.104 (Table S1). The Ka/Ks ratio of the segmentally duplicated paralogous pairs GhCDPK4/27, GhCDPK5/34, GhCDPK31/32, GhCDPK47/48, GhCDPK16/19, GhCDPK17/44, and GhCDPK41/42 was greater than one, suggesting that these gene pairs underwent positive selection, while the other segmental duplication pairs had Ka/Ks < 1, indicating negative (purifying) selection during evolution (Fig. 2B; Table S1). To assess collinearity among CDPK genes, we compared the genomes of maize, cotton, and Arabidopsis (Fig. 2C). There were 25 homologous gene pairs between the AtCDPK and GhCDPK genes, and 3 gene pairs between ZmCDPK and GhCDPK (Fig. 2C). Cotton CDPK genes thus shared more homology with the dicot Arabidopsis than with the monocot maize, implying that the cotton and Arabidopsis CDPK families descended from a common ancestor.
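The Ka/Ks interpretation used above (a ratio above one read as positive selection, below one as negative selection) can be written as a small helper. This is an illustrative sketch only: the function name, the tolerance for calling a ratio neutral, and the example values are hypothetical and are not taken from Table S1.

```python
def selection_mode(ka, ks, tolerance=1e-9):
    """Classify a duplicated gene pair by its Ka/Ks ratio.

    Returns (ratio, label), where label is 'positive', 'purifying',
    or 'neutral' (ratio equal to 1 within `tolerance`).
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        label = "neutral"
    elif ratio > 1.0:
        label = "positive"
    else:
        label = "purifying"
    return ratio, label


# Hypothetical substitution rates for two gene pairs.
for name, ka, ks in [("pair_1", 0.02, 0.10), ("pair_2", 0.30, 0.10)]:
    ratio, label = selection_mode(ka, ks)
    print(name, round(ratio, 2), label)
```

Ratios computed from very small or saturated Ks values are unreliable, so extreme values deserve a closer look before being interpreted.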

Analysis of cis-acting elements of CDPK promoter

The cis-regulatory elements in the upstream promoter sequences (~2000 bp) of GhCDPK genes that are linked to phytohormone signaling, plant growth and development, and responses to biotic and abiotic stress were examined using the PlantCARE online tool (Fig. 3; Additional file 1). Plant growth regulatory elements, including those for light responsiveness, endosperm expression, circadian control, and flavonoid biosynthesis, were predicted. The highest numbers of cis-regulatory elements involved in stress, phytohormone signaling, and plant growth and development were found in group I GhCDPK genes (GhCDPK11, 26, 30, 27, 6, 7, 32, 9, 8, and 28) and group II GhCDPK genes (GhCDPK47, 48, 16, 19, 17, and 44) (Fig. 3B). The recorded plant growth and development elements were G-Box (38%), TCT-motif (12%), AE-Box (9%), O2-Site (7%), and CAT-Box (7%) (Fig. 3C). In addition, ABREs (30%), CGTCA-motifs (18%), TGACG-motifs (18%), TGA-elements (9%), and other phytohormone signaling elements were widely distributed in the GhCDPK gene promoters (Fig. 3D). Furthermore, stress response elements for anaerobic responsiveness, defense and stress responsiveness, drought responsiveness, and low-temperature responsiveness were found in the GhCDPK genes. Elements such as ARE (58%), TC-rich repeats (14%), LTR (8%), and MBS (10%) were the most abundant (Fig. 3E).

Fig. 3

Cis-element prediction in the 2000 bp region upstream of GhCDPKs. A Different types of predicted cis regulatory elements in GhCDPK promoters. B The number of cis-regulatory elements involved in phytohormone signaling, plant growth and development, and abiotic and biotic stress. C-E The type and quantity of the GhCDPK cis elements responsive to plant growth and development, phytohormone signaling, and biotic and abiotic stress, respectively

Predicted TF binding site analysis

Transcription factor binding sites (TFBDS) in the GhCDPK genes were predicted. As displayed in Fig. 4, binding sites for a total of 26 different TFs were observed in the GhCDPK family, with lengths ranging from 7 to 29 bp (Fig. 4; Additional file 2). Among them, ethylene-responsive transcription factor (ERF) binding sites constituted 60.09% of target sites in the GhCDPK family. Moreover, TFBDS for LBD, C2H2, and MYB had high interaction scores with the GhCDPK gene family, indicating that strong interactions might exist between these TFs and the GhCDPK family (Fig. 4). Although stress-responsive sites including ERF, MYB, NAC, TCP, bZIP, and WRKY were abundant on target genes, hormone-related binding sites such as AP2, ARF, and bHLH, and reproductive growth sites including ARR, MIKC_MADS, G2-like, and TALE, were also recorded, suggesting that the GhCDPK family not only plays vital roles in the response to stress but may also perform important functions in hormone transduction and reproductive growth (Fig. 4).

Fig. 4

Predicted transcription factor binding sites of GhCDPK members in the cotton genome. The tree map chart showed the percentage of the various transcription factors in each GhCDPK family gene

Heat stress responsive expression pattern of the GhCDPK family

To verify the roles of the GhCDPK family in response to heat stress, the expression levels of all 48 genes were examined at 0, 1, 2, 4, 8, and 12 h of heat treatment (Fig. 5). Numerous GhCDPK genes showed increased expression as the treatment was prolonged (Fig. 5). For instance, after 1 and 2 h of heat stress, the expression of GhCDPK45, GhCDPK40, GhCDPK37, GhCDPK38, and GhCDPK10 was high; however, this response decreased as the treatment duration increased (Fig. 5). After 4 h of heat stress, genes including GhCDPK45, GhCDPK37, GhCDPK15, GhCDPK18, GhCDPK36, GhCDPK23, GhCDPK19, GhCDPK16, GhCDPK28, GhCDPK27, GhCDPK4, and GhCDPK2 were highly upregulated, whereas GhCDPK45, GhCDPK15, GhCDPK18, GhCDPK17, GhCDPK19, GhCDPK16, and GhCDPK27 were significantly expressed after 8 h (Fig. 5). After 12 h, GhCDPK15, GhCDPK16, and GhCDPK19 were highly expressed, with the most significant expression observed for GhCDPK16 and GhCDPK19 (Fig. 5). In contrast, the expression levels of GhCDPK46, GhCDPK14, GhCDPK1, GhCDPK25, GhCDPK26, GhCDPK30, GhCDPK48, and GhCDPK47 showed no significant response to heat stress.

Fig. 5

The expression profiles and functional network analysis of GhCDPK proteins. Circular heatmap of GhCDPK family genes at different heat stress time points

Predicted functional interaction network of GhCDPK16 proteins

The expression patterns of GhCDPK16 and GhCDPK19 suggest a possible role in heat stress responses. However, their interactions with other proteins remain unclear. Thus, an interaction analysis based on GhCDPK16 was carried out to investigate the interaction patterns among A. thaliana and G. hirsutum proteins (Fig. 6). The results showed that various ROS-generating proteins in the A. thaliana genome, such as RbohD, RbohB, and RbohF, interacted with GhCDPK16 (Fig. 6A). In addition, the guard cell S-type anion channel (SLAC1), leucine-rich repeat receptor-like protein kinases (LRR), Ricin B-like lectin (EULS3), and calcineurin B-like proteins (CBL1 and CBL9) were predicted to interact with the putative GhCDPK protein with reference to the A. thaliana genome (Fig. 6A). For G. hirsutum proteins, GhCDPK16 was similarly predicted to associate with several respiratory burst oxidase homolog (RBOH) proteins, which may promote ROS production during stress (Fig. 6B). The analysis showed that GhCDPK16 interacts with three groups of proteins. Cluster I comprises calmodulin-like (CML-like) proteins, calcium-binding proteins (CML45 and CML30), and protein time for coffee-like (Fig. 6B). Various RBOH proteins were also predicted to interact with the target gene, grouped as cluster IIAB. In this group, several RBOH proteins, including RbohA-like, RbohB-like, RbohC-like, RbohD-like, RbohE-like, and RbohF-like, were identified as predicted interactors (Fig. 6B). Other proteins, including WRKY33, LRR, SLAC1-like, aquaporin (NIP2-1-like), and spastin-like, also interacted with the target protein. Cluster III included heat stress transcription factor A-3-like (HSFA-like) and N-(5-phosphoribosyl) anthranilate isomerase-1 chloroplastic-like (TSB1) proteins (Fig. 6B). This indicates that GhCDPK16 may have a role in heat stress tolerance by regulating ROS-generating proteins.

Fig. 6

Predicted interaction networks based on the orthologs in A. thaliana and G. hirsutum. A GhCDPK16 protein interaction in A. thaliana; the colored nodes denote query proteins and the first shell of interactors; the white nodes are the second shell of interactors. Light blue, pink, green, red, blue, black, and purple edges represent known interactions; experimentally known interactions; predicted gene neighborhoods; predicted gene fusions; predicted gene co-occurrences; coexpression; and protein homology, respectively. B GhCDPK16 protein interaction in G. hirsutum: black, green, and blue lines indicate predicted co-expression, text mining, and database interaction evidence. Different colored nodes indicate K-means clustering based on their centroids. The edges indicate both functional and physical protein associations

GhCDPK16 expression improves thermotolerance in transgenic plants

Transgenic cotton and Arabidopsis lines were subjected to high-temperature stress to examine the function of GhCDPK16. As shown in Fig. 7A and B, transient overexpression and Cas9 knockout techniques were employed in cotton. The observed phenotype was leaf curling; the heat stress effect was significantly more severe in wild-type and Cas9 knockout plants than in GhCDPK16-OE lines (Fig. 7C). Furthermore, compared with the wild-type and the Cas9 lines, the overexpression lines had higher levels of GhCDPK16 expression (Fig. 7D). In addition, the expression levels of the heat-inducible genes GhHSP70, GhHSP17.3, and GhGR1 were markedly increased in the overexpression cotton lines (Fig. 7E), possibly contributing to cotton’s thermotolerance. In Arabidopsis, thermotolerance was validated at different growth stages. In the seed germination assay, after 1 h of heat treatment and 5 days of recovery, the germination rate was higher in GhCDPK16-ox transgenic lines than in the wild type (Fig. 8A and B). In addition, line 4 and line 5 showed higher germination rates than line 3 (Fig. 8B). At the seedling stage (7-day-old seedlings, 2 h of heat stress), the transgenic Arabidopsis lines exhibited greater tolerance to heat stress than the wild type. Moreover, transgenic GhCDPK16 seedlings of line 3 and line 5 showed a significantly higher survival rate than the wild type (Fig. 8C and D). After 4 weeks of growth and treatment at 45 °C for 24 h, the wild-type leaves showed severe wrinkling and drying, while the leaves of the overexpression lines showed less wrinkling (Fig. 8E).

Fig. 7

Schematic views of transient expression and CRISPR-Cas9 technology, and thermotolerance assay in cotton transgenic lines. A Agrobacterium-mediated transient transformation. B Agrobacterium-mediated CRISPR-Cas9 knockout transformation. C Phenotypes of cotton transgenic plants under heat stress. D Expression of GhCDPK16 in transient overexpression and CRISPR-Cas9 knockout cotton plants. E Expression analysis of heat-inducible genes in the transgenic plants. Asterisks * to *** represent significant differences (P < 0.05 to P < 0.01, two-way ANOVA with Tukey’s HSD post hoc test), while no asterisks denote no significant differences.

Fig. 8

Responses of GhCDPK16 transgenic lines and the wild type to heat stress treatments at different growth stages. A Seed germination response to heat stress and 5-day recovery after treatment. B Germination rate analysis. C Seven-day-old seedling responses to heat stress and 7 days of recovery after treatment. D Survival rate analysis. E Phenotypic changes of 4-week-old plants under heat stress treatments. SDL denotes seedlings. Each data point represents the mean (with SD bar) of three replicates. Asterisks * to *** represent significant differences (P < 0.05 to P < 0.01, two-way ANOVA with Tukey’s HSD post hoc test), while no asterisks denote no significant differences.

GhCDPK16 thermotolerance hinders ROS accumulation

To investigate ROS accumulation in transgenic and wild-type plants, the concentrations of H2O2 and MDA were evaluated in transgenic cotton following a 42 °C stress. Specifically, under heat stress, H2O2 content in overexpression cotton leaves decreased 0.16-fold compared to the control, whereas GhCDPK16 knockout lines showed a significant 0.28-fold increase in comparison with the control. In contrast, differences in H2O2 levels under non-stress conditions were not significant (P > 0.05) (Fig. 9A). Likewise, MDA content in GhCDPK16-OE cotton leaves decreased markedly (0.36-fold), whereas it increased 0.2-fold in knockout lines (Fig. 9B). Similarly, the accumulation of H2O2 and O2- was evaluated in Arabidopsis leaves using the DAB and NBT staining techniques. Under normal growth conditions, DAB and NBT staining showed no significant difference between WT and transgenic Arabidopsis, whereas brown and blue staining increased significantly in WT leaves compared to transgenic plants after 2 h of heat stress (Fig. 10A and B), signifying a high ROS concentration. Staining intensity measurements confirmed these concentration differences in plant leaves (Fig. 10C and D). Furthermore, the levels of H2O2 were quantified in leaf samples. As depicted in Fig. 10E, the H2O2 concentration under normal conditions showed no significant difference between the transgenic plants and the wild type. However, H2O2 levels were lower in transgenic plants than in the wild type after high-temperature treatment (Fig. 10E). This indicates that ROS may have played a significant role in the heat stress response.
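The fold changes quoted above can be read as relative changes against the control. The following is a minimal sketch of that arithmetic, assuming that an "x-fold" decrease or increase means (treated - control) / control; the function name and all numeric values are hypothetical placeholders chosen for illustration and are not measurements from this study.

```python
def relative_change(treated: float, control: float) -> float:
    """Signed relative change of `treated` versus `control`.

    Negative values indicate a decrease, positive values an increase.
    This definition is an assumption; the text does not state the formula.
    """
    if control == 0:
        raise ValueError("control must be non-zero")
    return (treated - control) / control


# Hypothetical H2O2 contents in arbitrary units (illustration only).
control_h2o2 = 100.0
overexpression_h2o2 = 84.0
knockout_h2o2 = 128.0

oe_change = relative_change(overexpression_h2o2, control_h2o2)
ko_change = relative_change(knockout_h2o2, control_h2o2)

print(f"overexpression vs control: {oe_change:+.2f}")
print(f"knockout vs control: {ko_change:+.2f}")
```

Under this reading, a negative value corresponds to lower ROS content than the control and a positive value to higher content.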

Fig. 9

Overexpression of GhCDPK16 in cotton reduces ROS accumulation. A H2O2 content. B MDA content. C Ascorbate peroxidase (APX) activity. D Catalase (CAT) activity. E Peroxidase (POD) activity. F Superoxide dismutase (SOD) activity. G Glutathione (GSH) content. H Glutathione disulfide (GSSG) content. I Glutathione/oxidized glutathione (GSH/GSSG) ratio. J Ascorbate (AsA) content. K Ascorbic acid/dehydroascorbate (AsA/DHA) ratio. L Dehydroascorbate (DHA) content. Asterisks * to *** represent significant differences (P < 0.05 to P < 0.01, two-way ANOVA with Tukey’s HSD post hoc test), while no asterisks denote no significant differences.

Fig. 10

Histochemical staining and H2O2 levels in GhCDPK16 transgenic lines. A H2O2 visualized by 3,3'-diaminobenzidine (DAB) staining. B Superoxide anion radicals (O2-) visualized by nitroblue tetrazolium (NBT) staining. C and D Staining intensity as determined with ImageJ software. E Quantification of H2O2 content. Asterisks * to *** represent significant differences (P < 0.05 to P < 0.01, two-way ANOVA with Tukey’s HSD post hoc test), while no asterisks denote no significant differences.

Overexpression of GhCDPK16 increases antioxidant protection under heat stress

To understand how the lowered ROS concentrations relate to GhCDPK16-mediated thermotolerance, we examined the endogenous levels of antioxidant compounds. The activities of major antioxidant enzymes (APX, POD, SOD, and CAT) and the contents of ascorbate and glutathione were determined in both cotton and Arabidopsis wild-type and transgenic plants. The results showed no significant differences in antioxidant enzyme activities between transgenic plants and WT plants under normal growth conditions. Under heat stress, however, POD, CAT, APX, and SOD activities were significantly higher in GhCDPK16-OE transgenic cotton and Arabidopsis lines than in the wild type (Figs. 9C-F and 11A-D). Similar increasing trends were observed for AsA, GSH, GSH/GSSG, and AsA/DHA in the transgenic leaves, although the values of these four measures were higher under normal temperatures than under heat stress (Figs. 9G, I, J and K and 11E, G, I and J). Whereas antioxidant levels in GhCDPK16-OE leaves increased dramatically, those in GhCDPK16-knockout cotton leaves decreased significantly when compared to the control. In contrast to the above-mentioned antioxidants, only DHA and GSSG showed decreased levels in overexpression lines when compared to the wild type (Figs. 9H and L and 11F and H). Furthermore, in the cotton genotypes, the levels of these two compounds rose in GhCDPK16-Cas9 lines and declined in GhCDPK16-ox lines.
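The GSH/GSSG and AsA/DHA measures discussed above are ratios of the reduced to the oxidized form of each antioxidant, so a rise in the reduced form together with a fall in the oxidized form raises the ratio. The sketch below illustrates this arithmetic only; the function name and all numbers are hypothetical placeholders and are not data from this study.

```python
def redox_ratio(reduced: float, oxidized: float) -> float:
    """Ratio of the reduced to the oxidized form (e.g. GSH/GSSG or AsA/DHA)."""
    if oxidized <= 0:
        raise ValueError("oxidized content must be positive")
    return reduced / oxidized


# Hypothetical contents in arbitrary units (illustration only).
wild_type = {"GSH": 200.0, "GSSG": 50.0, "AsA": 300.0, "DHA": 100.0}
overexpression = {"GSH": 300.0, "GSSG": 40.0, "AsA": 360.0, "DHA": 80.0}

ratios = {
    name: {
        "GSH/GSSG": redox_ratio(sample["GSH"], sample["GSSG"]),
        "AsA/DHA": redox_ratio(sample["AsA"], sample["DHA"]),
    }
    for name, sample in {"WT": wild_type, "OE": overexpression}.items()
}

for name, values in ratios.items():
    print(name, values)
```

With these placeholder values, the overexpression sample has higher ratios than the wild type, which is the direction of change described in the text.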

Fig. 11

Endogenous antioxidant enzyme activities and antioxidant contents in GhCDPK16 transgenic and wild-type plants under heat stress. A Peroxidase (POD) activity. B Catalase (CAT) activity. C Ascorbate peroxidase (APX) activity. D Superoxide dismutase (SOD) activity. E Ascorbate (AsA) content. F Dehydroascorbate (DHA) content. G Glutathione (GSH) content. H Glutathione disulfide (GSSG) content. I Ascorbic acid/dehydroascorbate (AsA/DHA) ratio. J Glutathione/oxidized glutathione (GSH/GSSG) ratio. Each data point represents the mean (with SD bar) of three replicates. Asterisks * to *** represent significant differences (P < 0.05 to P < 0.01, two-way ANOVA with Tukey’s HSD post hoc test), while no asterisks denote no significant differences.

Discussion

CDPK activity was first reported to be physiologically activated by calcium ions in pea shoot membranes [37]. After many years of research, the CDPK gene family was found to comprise a sizable multigene family of CPK proteins that are present in all plant species and have an additional functional role in stress tolerance [9, 11]. Over the past few years, the characterization and evolution of the CDPK genes in the cotton genome have been extensively explored. For example, 41 putative cotton CDPKs were identified in the diploid Gossypium raimondii [21, 38], 84 CDPK genes were identified in G. barbadense, and 96, 44, and 57 CDPKs were identified in Gossypium hirsutum, Gossypium raimondii, and Gossypium arboreum, respectively [39]. In addition, the cotton CDPK genes have been reported to be functionally involved in both biotic and abiotic stress responses, such as Verticillium wilt resistance, drought stress tolerance, and positive regulation of salt stress tolerance [17, 39, 40]. However, to date, the functional role of the plant CDPK gene family in heat stress tolerance has not been well understood. In a Gossypium barbadense cultivar, the encoded CDPK proteins ranged from 155 to 648 amino acids in length, with molecular weights ranging from 17.99 kDa to 71.854 kDa and predicted isoelectric points ranging from 4.313 to 9.48 [24]. In G. raimondii, G. arboreum, and G. hirsutum CDPK proteins, amino acid length varied from 64 to 907 amino acids, the molecular weight (MW) ranged from 6.726 kDa to 101.033 kDa, and the isoelectric point (pI) ranged from 4.128 to 10.72 [39]. In the present study, the amino acid lengths of GhCDPK proteins ranged from 487 to 655, the MW ranged from 54.63 kDa to 73.42 kDa, and the pI varied from 5.06 to 6.96 (Table 1).

Phylogenetic analysis clustered the GhCDPK members into three major groups, with reference to A. thaliana, Z. mays, and O. sativa (Fig. 1 and S1A). In contrast, previous reports on the evolutionary relationships among AtCDPKs, OsCDPKs, FaCDPKs, and GrCDPKs revealed four groups, indicating a possible divergence during evolution [9, 38]. According to existing reports, AtCDPK23 responds to drought and salt stresses, while GhCDPK60 positively regulates drought stress tolerance [17, 41]. In addition, AtCDPK7 and AtCDPK8 cluster in the same subgroup as GhCDPK60 [17]. In the present study, both AtCDPK7 and AtCDPK8 were present in the same subgroup as GhCDPK41, 42, 46, and 46, signifying a possible involvement in the drought stress response. The expression profile of GbCDPK24 revealed high responses after 24 h of heat stress [24]. Moreover, AtCDPK17 and AtCDPK34 were present in the same group as GbCDPK24, suggesting a similar role in abiotic stress responses. In the present work, AtCDPK17 and AtCDPK34 were likewise present in group II. Additionally, the ZmCDPK7 gene, known to function in heat stress tolerance [7], was present in the same subgroup as GhCDPK genes in group II. A genome-wide study in Brachypodium distachyon showed the expression patterns of heat-tolerant BdCDPK genes, including BdCDPK24 and BdCDPK10. Moreover, with reference to AtCDPKs, BdCDPK24 was present in the same cluster as AtCDPK34, 17, and 3 [42]. In the present study, AtCDPK34, 17, and 3 clustered together with GhCDPK genes in group II, and AtCDPK3 was present in the same subgroup as GhCDPK16/19 (Fig. 1). Thus, we speculated that group II GhCDPK genes may have a role in heat stress tolerance. Orthologous genes are thought to have similar biological functions. Therefore, the function of GhCDPK genes can be inferred from orthologous genes to provide a theoretical basis for subsequent functional studies [24, 43].

A typical CDPK structure is thought to comprise an N-terminal domain, a junction domain, a calmodulin-like Ca2+-binding domain, a protein kinase domain, and an EF-hand domain [9]. The EF hand allows for calcium binding, and N-terminal myristoylation domains function as part of a primary signaling process that directs proteins to the cell membrane, cytoplasm, or nucleus [9, 44]. We found that all CDPKs possess an EF-hand and a kinase domain; in addition, 14 putative CDPK genes were observed to have the N-terminal domain, indicating possible Ca2+-binding ability and signaling functions (Table 1). CDPKs are widely distributed throughout plant cells. The plasma membrane, cytoplasm, endoplasmic reticulum, nucleus, mitochondria, oil bodies, peroxisomes, chloroplasts, and the Golgi complex have all been reported as locations of CDPKs [45]. In the present study, all GhCDPK proteins were localized in the nucleus (Table 1). Cis-regulatory binding sites in gene promoter sequences play a major role in regulating gene expression and controlling growth, development, and stress responses in plants [27, 46]. Indeed, the structure of the promoter can suggest the possible functions and regulatory mechanisms of the genes [47]. The MYC promoter element was reported to be involved in drought, salt, and ABA responses during the plant’s life cycle [48]. In addition, ABRE responded to drought and ABA through ABRE-binding proteins (AREBs) [49]. MYB is known to respond to drought, low temperatures, salt, ABA, and GA in plants [50]. Moreover, in Morus atropurpurea Roxb, ABA response elements (MBS, ABRE, and GARE-motif) were reported to respond positively to drought and salt stresses by interaction with MaCDPK1 [48]. Consistent with these reports, the promoter elements mentioned above were among the predicted cis-regulatory elements associated with GhCDPKs in the present study (Fig. 3). Stress-responsive promoter elements such as MBS, ARE, TC-rich repeats, and LTR, as well as cis-regulatory elements involved in phytohormone pathways (ABA, GA, MeJA, and SA), were recorded, suggesting that GhCDPKs may interact with plant hormone pathways in a stress-responsive manner to defend against plant stress. Furthermore, the highest number of cis-elements was observed in A. thaliana, Z. mays, and P. sativum, implying multiple stress resistance pathways.

Gene expression is usually controlled by sequence-specific DNA-binding proteins known as transcription factors (TFs). These factors recognize specific DNA sequences, known as transcription factor binding sites (TFBSs), and are thus targeted to specific genomic regions where they can recruit transcriptional co-factors and chromatin regulators to fine-tune spatiotemporal gene regulation [51]. Thus, the identification of TFBSs in genomic sequences is essential for understanding and predicting gene expression. In the present study, we examined the TFBSs of the GhCDPK genomic sequences, which revealed binding sites for ERF, MYB, LBD, WRKY, Trihelix, TCP, NAC, etc. (Fig. 4). The ethylene responsive transcription factor (ERF) was reported to regulate primary and secondary metabolism, growth, and developmental programs, as well as the tolerance of plants to various stresses, including heat stress [52, 53]. The MYB TF responds to various environmental stresses and plays a role in tolerance to high temperatures [54]. The WRKY TF family is known to be involved in high-temperature responses and can bind to the W-box cis-acting elements of target gene promoters, thereby regulating the expression of multiple types of target genes [55]. Moreover, the high number of Trihelix, TCP, and NAC binding sites in the GhCDPK genomic sequences points to roles in plant growth, developmental regulation, and stress response signaling [56,57,58].

The expression patterns of all GhCDPKs in response to stress are presented in Fig. 5A. Among them, the GhCDPK16/19 genes were significantly responsive to heat stress, with expression increasing as stress time increased. These results suggest the involvement of GhCDPK16/19 in tolerance to abiotic stress, especially heat stress. However, the underlying biochemical mechanisms are still unknown. Therefore, protein-protein interaction analysis is essential for understanding the mechanisms by which GhCDPK16 contributes to heat stress tolerance. The interaction network based on the co-expression analysis revealed that GhCDPK16 is significantly associated with RBOHs, followed by WRKY, LRRs, CBLs, SLAC1-like, HSFs, and CML-related gene families (Fig. 5B). RBOHs are integral membrane proteins that generate superoxide anions, which are subsequently converted into H2O2. RBOHs contain two calcium-binding EF-hand motifs and multiple phosphorylation sites at their N-termini, enabling them to participate in the regulation of enzyme activity and regulate responses to abiotic stresses [9]. For instance, RBOHA and RBOHD are required for the accumulation of ROS during the plant defense response against cold stress in strawberries [59]. Moreover, the expression levels of VvRBOHA and VvRBOHB were significantly increased in grapes upon salt and drought stress treatments [9]. Further studies confirmed that H2O2 enhances thermotolerance; therefore, mutations in the ROS-generating RBOHs may cause defects in plant thermotolerance [7]. Similarly, in our investigation, transient overexpression of GhCDPK16 in cotton significantly decreased MDA and H2O2 contents under heat stress, whereas Cas9:GhCDPK16 plants showed enhanced concentrations of ROS compounds. In addition, the tested antioxidant activities were higher in overexpression lines than in Cas9:GhCDPK16 plants, signifying that GhCDPK16 may have played a role in ROS scavenging and in protecting leaf membranes from damage (Figs. 7 and 9). Thus, we hypothesized that H2O2 and its regulatory gene may have a role in cotton tolerance to heat stress. In addition to RBOH proteins, other heat stress-related proteins interacted with GhCDPK16, including LRR, HsfA3, and CML-related calcium ion-binding proteins, which are known to control heat sensors [60] and regulate the heat stress response in Arabidopsis [61].

Thermotolerance is significantly influenced by the hormone abscisic acid (ABA), and WRKY TFs play a triggering role in activating ABA signals [62]. In Arabidopsis, calcineurin B-like protein 9 (CBL9) was reported to be a calcium sensor involved in ABA signaling and stress-induced ABA biosynthesis pathways [63]. Similarly, a loss of WRKY33 function resulted in elevated ABA levels [62]. Moreover, Arabidopsis AtWRKY25, AtWRKY26, and AtWRKY33 promote the expression of ethylene insensitive protein 2 (EIN2) and the ethylene-mediated signal transduction pathway, thereby improving the high-temperature tolerance of plants [55]. Here, CBL9 and WRKY33 proteins interacted with GhCDPK16. We speculate that GhCDPK16 may have interacted with WRKY and CBL proteins to induce hormone-mediated tolerance by regulating ABA and ethylene levels during heat stress. Crizel et al. [9] further reported that FaCDPK4 and FaCDPK11 utilize ABA as the signaling molecule to increase the content of phenolics, ascorbic acid, and sugar compounds involved in drought stress tolerance. ABA-mediated AsA-GSH metabolism promotes regeneration, improves antioxidant capacity, alleviates ROS damage, reduces growth inhibition, and enhances tolerance to desiccation stress [64]. Moreover, ABA acts as a defensive hormone against heat stress: increased endogenous ABA content promotes water balance and strengthens heat tolerance [65]. Under extreme temperatures (42 °C for 48 h in Vicia faba L. and 38/28 °C day/night for 7 days in L. esculentum Mill.), the AsA, GSH, GSSG, AsA + DHA, and DHA contents significantly increased [66]. Consistent with these reports, antioxidant concentrations in the leaves of the transgenic lines in the current study significantly increased compared to the wild type, indicating the activation of antioxidant protection (Fig. 8). Furthermore, we speculated that ABA might have played a role in the control of endogenous antioxidant levels.

Conclusions

In this study, a total of 48 predicted CDPKs were identified from upland cotton (Gossypium hirsutum L.), and phylogenetic analyses classified them into three groups. All CDPK proteins were localized in the nucleus and contain EF-hand motifs and kinase domains. The predicted ERF, MYB, and WRKY transcription factor binding sites suggested a possible role in abiotic stress tolerance. Significant expression of GhCDPK16/19 during 24 h of heat stress indicated a connection with heat stress responses. Moreover, the protein-protein interaction network analysis highlighted proteins involved in ROS regulation, heat stress TFs, and ABA-dependent Ca2+ signaling. Transient overexpression of GhCDPK16 in cotton and stable overexpression in Arabidopsis substantially reduced H2O2 content and enhanced tolerance by increasing the activities of protective antioxidant enzymes that scavenge ROS. This study provides a basis for subsequent studies on the functions of GhCDPK gene family members in cotton and a theoretical basis for breeding new heat-tolerant cotton varieties. We propose future functional genomic studies on the interaction of GhCDPK16 with heat stress-responsive proteins, which will allow deeper elucidation of the role of GhCDPK16 in plant heat stress tolerance.