12.1 Introduction

Crop production continues to face challenges from a rapidly growing human population, climate change, and weather variability, all of which put pressure on land and water resources. Advances in plant breeding technologies have in the past produced crops that adapt to biological and physical stresses much faster than those stresses appear (Montesinos-López et al. 2018; Pipitpukdee et al. 2020). However, the time required to achieve these genetic gains and feed the world has remained constant for several years. To increase agricultural productivity growth, especially for sweetpotato, a substantial investment in innovation, adoption, use, and better identification of the most appropriate technologies and practices for improved performance is required. Increased adoption of improved varieties of sweetpotato is dependent upon increased productivity of the crop. Various biotic and abiotic constraints that affect sweetpotato productivity include sweetpotato virus disease (SPVD), Alternaria bataticola blight, sweetpotato weevils, inadequate and variable rainfall, and low soil fertility (Mwanga et al. 2021b). Other factors that influence adoption include consumer/market preferences and gender (Mwanga et al. 2021a), as well as a well-structured seed system for the crop.

Previously, successfully bred sweetpotato varieties were identified from thousands (50,000–100,000) of seedlings germinated from multi-parent crossing block nurseries (Fig. 12.1). These seedlings would then undergo a series of field trials and testing over several years and seasons, which typically lasted eight years. In the early stages of trialing, only a few key traits would be measured because of the large number of individuals. In the later stages, a greater number of traits (approximately 40) would be measured on the remaining few high-performance individuals in on-farm and national performance trials. Suitable parents for the next breeding cycle would be identified at this point. In the Eastern African regions, which are a hotspot for SPVD, plants would escape infection in the early stages of testing only for symptoms to show up a few years later, thus complicating efforts to release a truly resistant variety that has acceptable qualities for all the other traits of importance.

Fig. 12.1 Classical and accelerated breeding scheme of sweetpotato (© B. Yada 2014, Dissertation, NCSU)

Among major food crops globally, sweetpotato is regarded as the world’s seventh most important crop species, yet it has received only modest funding to develop its genomic breeding resources. Recent efforts through the genomic tools for sweetpotato improvement (GT4SP) and the genetic advances and innovative seed systems for sweetpotato (SweetGAINS) projects, both supported by the Bill and Melinda Gates Foundation (BMGF), have delivered a suite of genomic tools for improvement of sweetpotato breeding. These tools include two fully sequenced and annotated wild diploid I. trifida and I. triloba reference genomes (Wu et al. 2018); a genome browser, and a sequenced sweetpotato I. batatas hexaploid reference genome (http://sweetpotato.uga.edu/); a sequencing-based genotyping platform for highly heterozygous hexaploid sweetpotato, OmeSeq Qrrs, derived from GBSpoly (Wadl et al. 2018), with supporting bioinformatics tools; three high-density genetic maps of hexaploid sweetpotato for the BT (Mollinari et al. 2020), TB (Amankwaah 2019) and NKB (Oloka 2018) mapping populations; dosage-dependent SNP calling, phasing, and linkage mapping algorithms for autopolyploids in the MAPpoly R package (Mollinari et al. 2020); robust polyploid QTL analysis in the QTLpoly R package (Da Silva et al. 2020); and a breeding program database, Sweetpotatobase, for the management of breeding data (Morales et al. 2022). Altogether, these tools have been very useful to the sweetpotato breeding community in understanding the genetic architecture of key traits and their underlying molecular mechanisms (Gemenet et al. 2020a; Oloka et al. 2021).

With the available genomic resources on hand, the sweetpotato breeding community is now able to utilize breeding values and marker information to make selections and advancement decisions. We are also working on the proof-of-concept for genomic selection in sweetpotato and have developed three training populations from three breeding programs: (1) the National Crops Resources Research Institute (NaCRRI) population consists of 324 individuals with a primary focus on sweetpotato weevils; (2) the International Potato Center (CIP) population consists of 1200 individuals focusing on sweetpotato virus disease; and (3) the North Carolina State University (NCSU) population consists of 504 individuals focusing on guava and southern root-knot nematodes. These populations were phenotyped extensively and genotyped to be used for early parent identification and product development. The vast amounts of data generated can be compared between breeding programs because standard operating procedures were embraced early on. These include a standard crop ontology (www.cropontology.org) for trait nomenclature, a breeding program database (www.sweetpotatobase.org) for data management and curation, and electronic data capture using the Field Book app.

The most expensive and time-consuming component of sweetpotato crop improvement is trialing and phenotyping thousands of clones, especially in the early stages of breeding. We believe that genetic gain can be achieved faster with genomic-assisted breeding because useful clones are identified earlier in the breeding program without going through many cycles of trialing (Heffner et al. 2009). Genomic selection has been successfully implemented in a number of crop improvement programs including maize (Bernardo and Yu 2007), cassava (Ozimati et al. 2019), soybean (Duhnen et al. 2017), and eucalyptus (Resende et al. 2012), among several other crops. Exploiting the full potential of sweetpotato the world over requires addressing the root causes of yield gaps as well as merging proven genomics-assisted breeding approaches with traditional breeding methods in the crop.

This book highlights where we have come from in sweetpotato crop improvement and our efforts to merge classical breeding approaches with genomics-assisted breeding in applied breeding programs that actively release sweetpotato varieties. Strong and effective breeding programs require good leadership, breeding resources that include a stellar team, significant time, and consistent funding. With these in place, very good returns on investment will be realized as improved varieties that benefit farmers, societies and the environment are released with high variety turnover and adoption (Mwanga et al. 2021b). We further demonstrate the importance of identifying, training, and bringing together teams to address common problems in food systems. We highlight challenges, lessons learned, and how we have approached different problems encountered as we dissect the complex genome of sweetpotato to improve the traits dearest to sweetpotato stakeholders.

12.2 Genetic Improvement of Sweetpotato Traits

There are numerous traits in the sweetpotato ontology (www.cropontology.org), all of which need to come together perfectly to make the ideal variety. However, some are controlled by single genes and are relatively easier to breed due to their high heritability. In this section, we will concentrate on our efforts to improve the basic traits which are ‘must-have’ traits in breeding programs. These include resistance to sweetpotato weevils, sweetpotato virus disease, and root-knot nematodes. We will also look at ‘value-added’ traits like β-carotene and sugars, and dive deep into how genomic tools are being used to improve them.

12.2.1 Breeding for Resistance to Sweetpotato Weevils

The sweetpotato weevil, Cylas spp., is the most serious insect pest of sweetpotato worldwide. Cylas puncticollis (Boheman) (Fig. 12.2a) and Cylas brunneus (Fabricius) (Fig. 12.2b) are uniquely African species (Downham et al. 2001). The larvae feed on roots but are not always readily observed until they have caused significant damage. Adults are also difficult to detect given their nocturnal habits. ‘New Kawogo,’ a landrace cultivar released in Uganda in 1995 (Mwanga et al. 2001), is reported to be resistant to the notorious weevil species C. puncticollis and C. brunneus. The mechanism of resistance is active and is associated with high concentrations of six hydroxycinnamic acid (HCA) esters, mainly on the surface of the roots (Stevenson et al. 2009). The esters identified include hexadecylcaffeic acid, hexadecylcoumaric acid, heptadecylcaffeic acid, octadecylcoumaric acid, and 5-O-caffeoylquinic acid (Stevenson et al. 2009; Anyanga et al. 2013; Yada et al. 2017a). Host plant resistance provides an effective and long-lasting component of any integrated pest management program. However, the development of high-yielding, commercially acceptable weevil-resistant varieties has not been successful over the years due to the lack of heritable resistance in the existing sweetpotato germplasm pool (Anyanga et al. 2017).

Fig. 12.2 Sweetpotato weevils Cylas puncticollis and C. brunneus as seen with the naked eye

A number of recently published studies have contributed much to our understanding of the genetic and biochemical basis of the resistance to sweetpotato weevil (SPW) observed in ‘New Kawogo.’ Yada et al. (2015) identified 12 SSR markers that were associated with SPW resistance in a 287-clone segregating population derived from a biparental cross between ‘New Kawogo’ and ‘Beauregard.’ Thereafter, Anyanga et al. (2017) used the same population to improve our understanding of the biochemical basis of resistance observed in ‘New Kawogo’ and reported the segregation of resistance conferred by hydroxycinnamic acid esters that occur on the surface of storage roots. This work was followed up by Oloka (2018), who developed an integrated genetic linkage map of sweetpotato in the NKB population using single nucleotide polymorphism (SNP) markers. That study did not identify any significant QTL for SPW resistance. The author attributed this to a number of factors, including insufficient population size, highly heterogeneous weevil populations in the different environments where the trials were conducted, and significant genotype-by-environment effects. Consequently, previous efforts have had limited utility for marker-assisted selection but have provided the foundation for further studies utilizing whole genome markers and robust bioassay phenotyping for SPW. For such a complex trait that requires expensive and laborious phenotyping to address, the need to tackle it using genomic-assisted breeding approaches has never been greater.

For genomic selection (GS) to deliver enhanced genetic gains for the traits of interest, we noted that many aspects of breeding operations had to be redesigned, with the emphasis on accurate phenotyping. Previously, SPW was phenotyped in the field using incidence and severity scores on harvested storage roots 130–150 days after planting. The SPW incidence score is simply the percentage of infected roots in the plot, whereas the severity score is a 1–9 scale, where 1 = no weevil damage on any root and 9 = severe damage symptoms on all roots in the plot (Grüneberg et al. 2019). After harvesting and data collection, storage roots are sampled and brought to the lab for choice and no-choice bioassay experiments (Nottingham et al. 1989). In these bioassays, individual roots are artificially infested with a given number of 10-week-old gravid female weevils and then observed for weevil feeding and oviposition. The number of adults that emerge after the infestation period is also counted and recorded. Research has shown that there is a positive correlation between sweetpotato weevil bioassays and field root infestation (Anyanga et al. 2017).
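The two field scores described above reduce to simple arithmetic. The following is a minimal sketch in Python, assuming hypothetical per-plot root counts; the function names and the example plot are illustrative only and are not taken from any breeding database or protocol.

```python
def spw_incidence(infested_roots: int, total_roots: int) -> float:
    """Percentage of harvested roots in a plot that show weevil damage."""
    if total_roots <= 0:
        raise ValueError("total_roots must be positive")
    if not 0 <= infested_roots <= total_roots:
        raise ValueError("infested_roots must be between 0 and total_roots")
    return 100.0 * infested_roots / total_roots


def validate_severity(score: int) -> int:
    """Check that a plot severity score lies on the 1-9 scale.

    1 = no weevil damage on any root, 9 = severe damage on all roots.
    """
    if score not in range(1, 10):
        raise ValueError("severity must be an integer from 1 to 9")
    return score


# Hypothetical plot: 40 harvested roots, 6 of them damaged, severity scored as 3.
plot = {
    "clone": "clone_A",
    "incidence": spw_incidence(6, 40),
    "severity": validate_severity(3),
}
```

Validating scores at capture time is one way to keep records comparable across programs that share a trait ontology.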

12.2.2 Resistance to Sweetpotato Virus Disease

Sweetpotato virus disease (SPVD) is a complex synergistic interaction of whitefly-transmitted sweetpotato chlorotic stunt virus (Crinivirus) and aphid-transmitted sweetpotato feathery mottle virus (Potyvirus). Resistance to SPVD has been identified as one of the main biotic stress breeding objectives for sweetpotato breeding programs in sub-Saharan Africa (SSA), as the disease causes yield losses of 50–90% in many high-yielding susceptible genotypes (Clark et al. 2012). Plants infected by SPVD can be easily recognized visually by growers due to their clear field symptoms, which include stunting, chlorosis, mosaics, leaf narrowing, and distortion (Gibson et al. 1998; Clark et al. 2012) (Fig. 12.3).

Fig. 12.3 Sweetpotato plant (circled) showing severe symptoms of SPVD infection in the same plot with visually symptomless plants

The main management approach in regions where there is high virus pressure has been removal of visually infected plants from the field and use of clean planting materials (Aritua et al. 2007). This approach is effective in developed parts of the world but not in SSA where there is no formal seed system to ensure adequate and timely supply of clean seed (Tairo et al. 2005). Conventional breeding has not been successful over the years due to the limited sources of resistance in the available germplasm pool. There are few demonstrated sources of resistance present in released local landraces in SSA like ‘New Kawogo’ and ‘Tanzania’ as well as some wild relatives of sweetpotato (Karyeija et al. 2000). However, field screening for virus resistance is both slow and inefficient due to a number of factors. Vector populations fluctuate over seasons and years (Aritua et al. 2007) thus complicating trials to identify agronomically superior genotypes from large (50,000–100,000) populations of F1 clones (Mwanga et al. 2021b). This results in plants escaping infection, only to show severe symptoms of SPVD after three or more years of planting. It may, therefore, not be ideal to use symptom severity as a selection criterion for SPVD in a segregating population (Clark et al. 2012).

A limited number of molecular studies have been conducted to improve our understanding of the resistance of sweetpotato clones to SPVD. The first studies were conducted by Mwanga et al. (2002), who identified two recessive genes, spfmv1 and spcsv1, from a biparental cross between ‘Tanzania’ and ‘Bikilamaliya.’ Amplified fragment length polymorphism (AFLP) and random amplified polymorphic DNA (RAPD) markers were used in this population to identify the two genes. Thereafter, AFLP markers were associated with SPVD resistance using discriminant analysis and logistic regression (Mcharo et al. 2005; Miano et al. 2008). However, the fact that AFLP and RAPD markers are dominant in nature has limited their utility in applied sweetpotato breeding programs. Regression analysis was used by Yada et al. (2017b) to associate simple sequence repeat (SSR) markers with SPVD resistance in a ‘New Kawogo’ by ‘Beauregard’ (NKB) biparental population. They identified seven SSR markers that were associated with resistance to SPVD in the NKB population, but the utility of these markers in other sweetpotato breeding populations has not been realized. It is important to find markers that have no ascertainment bias and are useful across breeding populations outside the study population.

12.2.3 The Root-Knot Nematode

Plant parasitic nematodes are major pathogens of many agricultural commodities globally (Agrios 2004). Root-knot nematodes (RKNs; Meloidogyne spp.) parasitize almost every species of vascular plant (Jones et al. 2013), resulting in billions of dollars in annual crop losses (Sasser 1980). In SSA, particularly Eastern Africa, RKNs affect a range of staple root, tuber, and banana (RTB) crops including banana, cassava, and sweetpotato (Coyne et al. 2006; Karuri et al. 2017; Akinsanya et al. 2020). Much of the damage by RKNs in SSA goes undetected, mainly due to their associations in disease complexes with fungi and bacteria. Infection by RKNs causes cracking of roots and secondary infections, reducing the market value of root crops by directly affecting their quality (Oloka et al. 2021). Storage root cracking can also result from late-season rains after a long dry period and produce symptoms similar to those caused by RKN infection. For this reason, storage root cracking is not recommended for use as a standard in management schemes for RKNs in sweetpotato.

In infected fields, neurotoxic nematicides alongside cultural practices have often been used as control strategies for the management of RKNs. Besides their prohibitive cost to small-scale growers of sweetpotato, the health and environmental risks posed by nematicides are great (Chitwood 2003). The use of resistant plant genotypes has always been the safest, most sustainable, and most economical control strategy for plant parasitic nematodes. However, the effectiveness of crop improvement efforts depends on breeding program resources, the nematode environment, availability of efficient screening procedures, availability and identification of usable sources of durable resistance, and knowledge of the genetics and inheritance of resistance. To a great extent, these factors determine the breeding method and success of the research. Most widely grown and popular sweetpotato varieties globally are highly susceptible to plant parasitic nematodes (Oloka et al. 2021). In the USA, resistance has been identified in un-adapted clones, whereas in SSA, resistance is found in low-yielding landraces with low nutritional value. Integrating this resistance into high-yielding adapted elite clones is the challenge before sweetpotato crop improvement programs.

Sweetpotato is a highly heterozygous autohexaploid crop with a large genome and complex genetics. Its high heterozygosity means that when resistance is identified in an un-adapted clone, simple backcrossing to recover elite traits is not possible, as this will unmask numerous unwanted lethal genes that are detrimental to the survival of the crop. In order to fast track the breeding process for improving sweetpotato resistance to plant parasitic nematodes, traditional breeding needs to be merged with genomic-assisted breeding technologies as well as innovative cultural management practices. To this end, a number of sweetpotato breeding programs have developed training populations responding to their targeted product profiles. The end goal of these efforts is to make breeding more efficient through the routine use of genomic estimated breeding values to identify and select individuals in the breeding program to fast track for release, as well as for use as parents for the next generation.

12.3 Modern Breeding Integration

12.3.1 Marker-Assisted Selection

Genomic-assisted breeding has been used widely in a number of plants and animals of importance in agriculture. It includes both marker-assisted breeding (MAB) and genomic selection-assisted breeding (GS) in making decisions regarding crossing, progeny selection, and yield trials. In polyploid crops, including sweetpotato, the presence of multiple genome copies introduces a high degree of complexity which in turn imposes numerous challenges to genome analysis and subsequent implementation in applied breeding programs (Kyriakidou et al. 2018; Ahmad et al. 2023).

Over the past few years, the polyploid community has made significant strides in bringing the genetics of polyploid species into the genomics era. The technological barrier to assessing these complex genomes was recently overcome through the use of high-throughput DNA sequencing technology. The delivery of massive amounts of DNA sequence data through such technology and their subsequent conversion into quantitative, dosage-dependent SNP-binary markers has enabled the analysis of these complex polyploid genomes. The massive amounts of information generated call for powerful computational tools that are able to process raw DNA sequences to identify genetic markers. These markers are then used to construct linkage maps and infer haplotypes, identify the position of candidate genes through QTL mapping, and use the relationship between genotype and phenotype to make informed breeding decisions. This is what we refer to as genomic-assisted breeding in sweetpotato.
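To make the idea of a dosage-dependent marker concrete, the sketch below assigns an allele dosage (0 to 6 copies of the alternate allele in a hexaploid) from read counts at a single SNP. It is a deliberately simplified illustration, assuming a plain binomial read model with a fixed sequencing error rate and a hypothetical minimum read depth; it is not the probabilistic model used by the dedicated polyploid genotype callers, and the function name and default values are assumptions.

```python
import math

PLOIDY = 6  # sweetpotato is hexaploid


def call_dosage(ref_reads, alt_reads, ploidy=PLOIDY, error=0.01, min_depth=20):
    """Return the most likely alternate-allele dosage, or None if depth is too low.

    Each candidate dosage d implies an expected alternate-read fraction d / ploidy,
    adjusted for a symmetric sequencing error rate (an assumed value).
    """
    depth = ref_reads + alt_reads
    if depth < min_depth:
        return None  # too few reads to separate neighbouring dosage classes
    best_dosage, best_loglik = None, -math.inf
    for d in range(ploidy + 1):
        p = d / ploidy
        p = p * (1 - error) + (1 - p) * error  # keeps p strictly inside (0, 1)
        # Binomial log-likelihood; the binomial coefficient is constant across d
        loglik = alt_reads * math.log(p) + ref_reads * math.log(1 - p)
        if loglik > best_loglik:
            best_dosage, best_loglik = d, loglik
    return best_dosage


# Hypothetical read counts at one SNP for three clones
calls = [call_dosage(50, 50), call_dosage(80, 40), call_dosage(5, 5)]
```

The depth threshold matters more in hexaploids than in diploids because seven dosage classes must be separated rather than three, so shallow coverage leaves adjacent classes hard to distinguish.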

The polyploid community, which our group is a significant part of, has made significant advances in developing the tools needed for marker-assisted breeding in sweetpotato to be realized. Some of these tools include: SuperMASSA (Serang et al. 2012) and VCF2SM (Pereira et al. 2018), used for dosage-based variant calling in polyploids; MAPpoly (Mollinari et al. 2020) and polymapR (Bourke et al. 2018), used for integrated linkage map construction in a polyploid biparental population using dosage-based markers; QTLpoly (Da Silva et al. 2020), a software for performing QTL analysis in polyploid species, to mention but a few. QTLpoly combines the phenotypes of each individual in a mapped population with the genotype conditional probability distribution at each genomic position. Therefore, this software can be used to perform a variety of genetic analyses between phenotypes and genotypes including genomic selection and prediction. In selection of superior individuals for crossing and/or advancing in the breeding program, QTLpoly software can be extended beyond QTL analysis to predict genomic breeding values of clones (Da Silva et al. 2020).

We have conducted a number of studies using these new tools, some of which are published and summarized in Table 12.1. All these studies identified the genetic inheritance model of these traits as well as SNP markers that are linked to the traits. We are currently furthering this work by digging deeper into these identified QTL to identify low-cost usable markers for routine application in breeding programs.

Table 12.1 Breeding and genetic studies conducted using new breeding tools for polyploid sweetpotato

12.3.2 Genomic Selection

In modern times, the primary objective of most breeding programs is to predict the genetic value of un-phenotyped individuals, enabling the targeted combination of desirable alleles to enhance the performance of future generations. The utilization of genomic selection (GS) in breeding programs has been demonstrated to effectively enhance genetic gains per unit of time, leading to the rapid identification of superior genotypes and acceleration of the breeding cycles (Heffner et al. 2010; Crossa et al. 2017).

However, the effective implementation of GS in crop breeding requires the utilization of prediction models that can improve the accuracy of predictions across diverse trait-environment combinations. One crucial aspect of plant breeding programs involves conducting multi-environment trials (MET), which aim to evaluate the performance of candidate genotypes under different environmental conditions (Jarquín et al. 2020). Furthermore, the genotype-by-environment (GE) interaction is an essential component of genetic variability, and a good understanding of this phenomenon can assist in identifying stable genotypes or genotypes with specific adaptations (Crossa et al. 2011; Hu et al. 2023).

In breeding programs for clonally propagated species, such as sweetpotatoes, the impact of dominance on the effectiveness of GS implementation may be of critical importance, owing to the heterozygous nature of genotypes and the genetic value being a function of both additive and non-additive gene action (Gemenet et al. 2020b). Consequently, breeders are faced with the arduous task of increasing the additive value over time while simultaneously preserving the dominance value via the selection and recombination of parents (Werner et al. 2023). Batista et al. (2022) demonstrated in sweetpotato and sugarcane that if the trait has a high mean dominance degree and the population has a high frequency of heterozygous genotypes, the digenic dominance effects can significantly improve genomic prediction.

To identify superior parents for breeding programs or predict the potential of cross combinations, a comprehensive understanding of genetic architecture, accurate prediction of individual genetic values, and estimation of genetic parameters are indispensable. Yan et al. (2022) provided a comprehensive overview of the evolution of genetic and genomic tools for sweetpotato improvement, while Gemenet et al. (2020b) evaluated different strategies for genomic selection in this crop, highlighting their significant contributions to advancing genomic selection approaches in sweetpotato breeding. In this regard, a multiple-environment genomic best linear unbiased prediction (GBLUP) model that considers additive and dominance genetic effects can be helpful (Jarquín et al. 2014).

In hexaploid species, the additive matrix and digenic dominance matrices have been described by Batista et al. (2022). The relationship matrices can be calculated computationally using the R package AGHmatrix (Amadeu et al. 2016). The variance components, fixed and random effects in a statistical model are considered unknown quantities that need to be estimated and predicted. The restricted maximum likelihood (REML) method, as proposed by Patterson and Thompson (1971), can be used to estimate these variance components. The fixed effects represented as best linear unbiased estimates (BLUE), and random effects represented as best linear unbiased predictions (BLUP), can be estimated and predicted using mixed model equations (Henderson 1953). R packages such as ‘sommer’ (Covarrubias-Pazaran 2016) and ASReml (Butler et al. 2023) can be used to carry out these procedures.
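The steps in this paragraph can be sketched numerically. The example below is a minimal illustration in Python with NumPy, not the AGHmatrix, sommer, or ASReml implementation. It builds a VanRaden-style additive relationship matrix from hexaploid allele dosages and then solves Henderson's mixed model equations for y = Xb + Zu + e. Several things are assumptions made for illustration: the dosages and phenotypes are simulated, the variance ratio `lam` is treated as known rather than estimated by REML, only additive effects are modeled (no digenic dominance matrix), and a small `ridge` value is added to the diagonal so the centred matrix can be inverted.

```python
import numpy as np


def additive_g_matrix(dosage, ploidy=6):
    """VanRaden-style additive relationship matrix from allele dosages.

    dosage: (n_individuals, n_markers) array with entries in 0..ploidy.
    """
    dosage = np.asarray(dosage, dtype=float)
    p = dosage.mean(axis=0) / ploidy        # allele frequency per marker
    z = dosage - ploidy * p                 # centre each marker column
    scale = ploidy * np.sum(p * (1.0 - p))  # expected marker variance summed over loci
    return z @ z.T / scale


def gblup(y, X, Z, G, lam, ridge=1e-3):
    """Solve Henderson's mixed model equations for y = Xb + Zu + e.

    lam is sigma2_e / sigma2_u, assumed known here; in practice it would come
    from REML estimates. Returns (BLUE of b, BLUP of u).
    """
    # The centred G is singular, so a small ridge is added before inversion.
    G_inv = np.linalg.inv(G + ridge * np.eye(G.shape[0]))
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * G_inv]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return sol[:n_fixed], sol[n_fixed:]


rng = np.random.default_rng(1)
dosage = rng.integers(0, 7, size=(8, 50))  # simulated dosages: 8 clones x 50 SNPs
G = additive_g_matrix(dosage)
y = rng.normal(10.0, 2.0, size=8)          # simulated phenotypes
X = np.ones((8, 1))                        # overall mean as the only fixed effect
Z = np.eye(8)                              # one record per clone
b_hat, u_hat = gblup(y, X, Z, G, lam=1.0)
```

Extending the sketch to dominance would add a second random term with its own relationship matrix and variance ratio, which is where dedicated mixed model software becomes the practical choice.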

The significance of the model effects and the quality of the model fit can be evaluated to understand the factors that impact the phenotypic traits and their genetic architectures. The significance of fixed effects can be assessed using a Wald test, while the significance of random effects can be evaluated using a likelihood ratio test (Luke 2017). Additionally, measures of model performance such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) can be utilized to assess the goodness-of-fit of the model. Moreover, the success of genomic selection in breeding programs depends on the model’s ability to predict the genetic value of un-phenotyped individuals with their genotypes recorded. The predictive ability of the model can be evaluated through cross-validation procedures using previously described populations and measures of predictive ability such as the Pearson correlation between phenotypes and genome-estimated breeding values.
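The cross-validation procedure mentioned above can be sketched as follows. This is a minimal illustration in Python with NumPy under stated assumptions: the marker dosages and phenotypes are simulated, ridge regression on marker dosages is swapped in as a simple stand-in for a GBLUP model, and the shrinkage parameter `lam` is fixed rather than estimated. Predictive ability is taken as the Pearson correlation between observed phenotypes and predicted values in each held-out fold, averaged over folds.

```python
import numpy as np


def ridge_predict(M_train, y_train, M_test, lam=1.0):
    """Fit ridge regression on marker dosages and predict held-out phenotypes."""
    mu = y_train.mean()
    col_means = M_train.mean(axis=0)
    Zt = M_train - col_means
    # Dual form: solve an n x n system, convenient when markers outnumber individuals
    K = Zt @ Zt.T
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y_train - mu)
    return mu + (M_test - col_means) @ Zt.T @ alpha


def cv_predictive_ability(M, y, k=5, lam=1.0, seed=0):
    """Mean Pearson correlation between observed and predicted values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    cors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        y_hat = ridge_predict(M[train_idx], y[train_idx], M[test_idx], lam)
        cors.append(np.corrcoef(y[test_idx], y_hat)[0, 1])
    return float(np.mean(cors))


# Simulated data: 200 clones, 50 SNP dosages (0-6), 10 markers with true effects
rng = np.random.default_rng(42)
M = rng.integers(0, 7, size=(200, 50)).astype(float)
effects = np.zeros(50)
effects[:10] = 1.0
y = M @ effects + rng.normal(0.0, 1.0, size=200)
r = cv_predictive_ability(M, y, k=5)
```

Because the simulated trait here is highly heritable, the resulting correlation is far higher than would be expected for a complex field trait; the sketch shows the procedure, not a realistic accuracy.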

12.3.3 Other Omics Technologies

‘Omics’ can be defined as different approaches that aim to measure biological molecules at particular levels through large amounts of data. Next-generation sequencing (NGS) techniques allow rapid, high-throughput generation of nucleic acid data for different fields of biological research. The application of multiple ‘omics’ (multi-omics) techniques is essential for exploring the genetic roots of traits through genome composition (genomics), gene expression (transcriptomics), protein analyses (proteomics), and metabolite characterization (metabolomics). Several biological processes have been elucidated in different plant species through multi-omics (Yang et al. 2021).

Because sweetpotato is a highly heterozygous autohexaploid species with cross-incompatibility, its genetic analysis and molecular breeding are challenging. Compared to diploid and major polyploid crops, the ‘omics’ world of sweetpotato is lagging. The situation is mainly due to the complexity of the sweetpotato genome and the relatively small number of researchers working with the species (Yan et al. 2022). The utilization of multi-omics in sweetpotato genetic research could bypass the difficulties imposed by the complexity of the species and help advance sweetpotato molecular breeding.

12.3.3.1 Genomics

Genomics studies aim to uncover the full genetic content of an organism, employing DNA sequencing and bioinformatic methods to assemble the whole genome contents, identify genes, and determine their structures and functions. Analyses of a high-quality assembled genome can reveal genetic variations that affect desired phenotypes through the identification of sequence polymorphisms and chromosomal arrangements as well as understanding of how the regulation of gene expression affects phenotype expression (Yang et al. 2021). Several diploid species have their genomes fully sequenced and assembled; hence, usage and manipulation of their genomes are highly advanced. Unfortunately, polyploid species are relatively behind in those fields when compared to major diploid crops.

Autopolyploid genomes are usually hard to assemble due to the often high heterozygosity and high level of repetitive DNA in these genomes. For progressing with molecular breeding, a high-quality genome sequence is imperative. The first public draft genome for sweetpotato (Yang et al. 2017) generated 15 pseudochromosomes based on gene synteny with Ipomoea nil genome, and it is available at http://public-genomes-ngs.molgen.mpg.de/SweetPotato/. The assembly is very fragmented and contains a significant amount of redundancy and misassembly (Wu et al. 2018).

Due to the difficulties of assembling a polyploid genome, the generation of genomes from closely related diploid species is an alternative. There are five genomes from diploid Ipomoea species available. Three of these, from I. nil (Hoshino et al. 2016), I. purpurea (Gupta et al. 2023) and I. aquatica (Hao et al. 2021), have good overall assembly quality but are phylogenetically distant from sweetpotato, which hinders their use as reference genomes. On the other hand, I. trifida and I. triloba are closely related to I. batatas. In fact, I. trifida is considered to be a progenitor of sweetpotato (Roullier et al. 2013) and had its first tentative genome assembly performed by Hirakawa et al. (2015). The first chromosome-level reference genome for I. trifida, together with the I. triloba genome, was published by Wu et al. (2018). The authors used a combination of genome sequencing, RNA sequencing (RNA-seq), molecular mapping, comparative genomics and predicted proteomes from different plant species to fully characterize the assemblies. The alignment of sweetpotato genomic sequences against both diploid genomes showed them to be good references for I. batatas, with more than 90% of reads successfully aligned. Both genome sequences are available at https://sweetpotato.uga.edu/. A high-quality genome assembly was also constructed for a storage root forming I. trifida genotype (Y22). The analysis of the genome provided evidence of natural horizontal gene transfer from Agrobacterium tumefaciens to I. trifida. The A. tumefaciens sequence is also present in the sweetpotato genome, suggesting that sweetpotato might have inherited the sequence from diploid I. trifida (Lee et al. 2019).

Due to the quality of the I. trifida genome assembly and its similarity to the sweetpotato genome, the I. trifida genome sequence is being used as a reference genome for sweetpotato in a variety of research areas such as development and validation of genetic tools for polyploids (Wadl et al. 2018; Mollinari et al. 2020; Yamakawa et al. 2021), linkage and QTL mapping (Da Silva et al. 2020; Oloka et al. 2021) and identification of candidate genes (Bararyenya et al. 2020; Gemenet et al. 2020b). While efforts to develop a high-quality assembly of the sweetpotato genome are ongoing, the availability of the I. trifida genome has enabled advances in sweetpotato genomic research.

12.3.3.2 Transcriptomics

The complete set of RNA transcripts produced in cells and tissues is called the transcriptome, and transcriptomics aims to characterize those transcript sets. Transcriptomics allows identification of putative genes, gene-targeted molecular markers, and differential gene expression profiles in response to different stimuli, time periods, or developmental stages, with RNA-seq being the most popular technique for transcriptome studies (Yang et al. 2021).

The first high-throughput sequencing of sweetpotato RNAs was performed by Schafleitner et al. (2010). A sweetpotato gene index was generated, and 24,657 putative unique genes were identified. The sequences were further mined for simple sequence repeats (SSRs), and 195 SSR markers were developed, which are used in sweetpotato breeding populations. Since then, transcriptome studies in I. batatas have increased noticeably, and transcriptomics is now the most widely used approach in sweetpotato molecular research. It has been used to characterize gene expression in sweetpotato roots, primarily to understand storage root formation (Wang et al. 2010; Firon et al. 2013; Ponniah et al. 2017) and the transcriptional changes underlying the defense response against root-knot nematode disease (Lee et al. 2019, 2021), as well as to profile gene expression in sweetpotato leaves under abiotic stress (Arisha et al. 2020a, b; Kitavi et al. 2023).

RNA-seq data from different tissues and organs of I. trifida and I. triloba were used for quality assessment and annotation of both species’ genome assemblies (see above). In addition, the transcriptome of the orange-fleshed cultivar ‘Beauregard’ was generated, and genes involved in storage root formation and β-carotene content were shown to be regulated differently from their counterparts in the diploid species (Wu et al. 2018). Using a combination of differential gene expression analysis, QTL mapping, and genome annotation data, Gemenet et al. (2020a) analyzed the negative association between β-carotene and starch accumulation, which results from competition between the starch and carotenoid biosynthesis pathways for available carbon. Their results indicated that the physical proximity of the sucrose synthase (SuSY) and phytoene synthase (PSY) genes affects the balance between the pathways, while the Orange (Or) gene regulates PSY expression, acting as a molecular switch for carotenoid accumulation.

12.3.3.3 Proteomics

Proteomics profiles the total protein expression of an organism and analyzes the amino acid sequences, molecular structures, and functional activities of its proteins. Ongoing improvements in protein extraction and purification protocols have facilitated advances in the field. Protein data are generated by several methods, such as high-performance liquid chromatography (HPLC), crystallization, X-ray diffraction of protein crystals, and yeast two-hybrid systems. In recent years, advanced high-throughput techniques that combine labeled amino acids with nuclear magnetic resonance (NMR) and mass spectrometry have been developed (Yang et al. 2021). In sweetpotato, proteomics has been used to compare protein expression profiles among genotypes of different flesh colors (Lee et al. 2012; Shekhar et al. 2016), to detect increased antioxidant enzyme activity and soluble sugar content in storage roots kept at low temperature (Cui et al. 2020), and to identify proteins involved in the defense response to simultaneous drought and heat (Tang et al. 2023).

Comparative proteomics was applied to pencil and storage roots to identify proteins that were up-regulated and/or uniquely expressed in each root type. Pencil roots overexpressed proteins related to the cell wall, the phenylpropanoid pathway, and antioxidant defense responses, indicating a higher rate of lignin biosynthesis and stress-related responses. In contrast, proteins involved in development (maturity) and defense response against insects were up-regulated in storage roots. mRNA levels agreed with the protein expression results, and lignin accumulation was observed only in the pencil roots. The authors argue that the shift in carbon flow from the phenylpropanoid pathway to carbohydrate metabolism is of major importance in storage root formation (Lee et al. 2015).

A broad profiling of the leaf and root proteomes of the orange-fleshed cultivar ‘Beauregard’ was performed. In total, 74,255 peptides matching 4321 non-redundant proteins were identified and compared to the predicted proteins of available Ipomoea species and to sweetpotato transcriptome and genome sequences. More than 700 new coding regions were identified, and the analysis showed that approximately 2000 loci might be misannotated, which underlines the importance of using different methods to obtain reliable molecular information. Additionally, proteins unique to each organ were detected. Leaf proteins were mainly associated with primary metabolism and translation, consistent with active growth and the generation of metabolites and energy for the plant. The storage root was enriched for proteins involved in primary metabolism, intracellular transport, and protein localization, indicating the role of the organ as a nutrient sink (Al-Mohanna et al. 2019).

12.3.3.4 Metabolomics

Metabolites are small molecules, such as amino acids, lipids, and sugars, that are intermediates or end products of metabolic processes. These molecules act as cellular structural constituents, building blocks for larger molecules, substrates for enzymatic reactions, and signals in diverse signaling pathways (Baker and Rutter 2023). Metabolomics analyzes the metabolites involved in the cellular processes of an organism; the whole set of metabolites is called the metabolome. Each group of metabolites has its own chemical and physical characteristics; therefore, analytical methods differ according to the metabolic profile of interest. The most widely used techniques are NMR and a combination of gas or liquid chromatography with mass spectrometry. This field of research is especially important for plants, which produce more metabolites than any other group of organisms (Yang et al. 2021). Metabolomics studies in sweetpotato have been reported only in the last decade.

Metabolomics, together with proteomics, was used to characterize the drought stress response in sweetpotato leaves. A drought-resistant cultivar showed up-regulation of proteins involved in photosynthesis, reactive oxygen species (ROS) metabolism, and energy generation, as well as accumulation of carbohydrate, amino acid, flavonoid, and organic acid metabolites. Proteins and metabolites related to the phenylpropanoid biosynthesis pathway were highly abundant and correlated, indicating co-regulation among them (Zhou et al. 2022).

Carotenoid, flavonoid, anthocyanin, and phenolic acid metabolites were profiled in white-, orange-, and purple-fleshed sweetpotato roots. Orange-fleshed sweetpotatoes had the highest level of carotenoids, while white-fleshed genotypes had the lowest. Flavonoid concentration was highest in purple-fleshed sweetpotato, followed by orange- and white-fleshed varieties. Anthocyanins were present almost exclusively in the purple-fleshed genotypes, in which the levels of six phenolic acids were tenfold higher than in white- and orange-fleshed samples. In addition, orange- and purple-fleshed sweetpotatoes had higher concentrations of sugars and sugar alcohols (Park et al. 2016; Wang et al. 2018).

A joint analysis of transcriptome and metabolome data from sweetpotatoes of different flesh colors showed that the regulatory network of anthocyanin production in sweetpotato roots involves not only specific anthocyanin biosynthetic genes but also the flux allocation and modification of metabolites, which strongly regulate the process. In addition, the flavonol synthase (FLS) gene was shown to be crucial for the regulation of anthocyanin biosynthesis. Purple-fleshed sweetpotatoes have lower expression of FLS than white- and orange-fleshed roots, which leads to the different pigmentation among the cultivars (Xiao et al. 2023).

Understanding how metabolite content changes during processing is also important, since sweetpotato roots are usually cooked before consumption. Heating raised the polyphenol content of the roots, and antioxidant activity was higher in cooked than in raw samples. Antioxidant activity was highly correlated with chlorogenic acid content, which was also higher in the cooked sweetpotatoes (Franková et al. 2022).

In addition, cooking promotes the saccharification of sweetpotato roots, a process in which sugars are produced through hydrolysis or acidolysis of starch or cellulose. Maltose is the primary contributor to sweetpotato sweetness after cooking, since its content increases sharply while the content of other sugar metabolites shows no significant change. Using transcriptome and metabolome information, Lee et al. (2021) identified starch synthase (SS), granule-bound starch synthase (GBSS), and branching enzyme (GBE) genes as regulators of the transformation of starch to maltose, probably through their effect on starch structure. The metabolites identified were enriched in starch and sucrose metabolic pathways. Nevertheless, some of the metabolites had no annotation, even though they were correlated with annotated genes.

12.4 Final Remarks

Because of the polyploid nature and genetic complexity of sweetpotato, genomics-assisted breeding is important for accelerating its breeding process. We have developed and applied new genetic tools to identify QTL and to understand the genetic architecture of a number of important traits in sweetpotato. These new tools include the fully sequenced and annotated diploid lines of I. trifida and I. triloba, which serve as reference genomes for hexaploid sweetpotato. They are steps in the right direction toward realizing faster genetic gains in sweetpotato and delivering those gains to farmers’ fields in the form of improved, resilient, and nutritious varieties.

There are a few technical challenges that we will need to consider and overcome to fully exploit the benefits of these tools, including the complex family structure of polyploid breeding populations. Typically, this family structure has taken the form of multiple, partially inter-related half-sib families derived from a polycross block of about 20 parents. In these populations, at best only the female parent is known, because there are instances where seeds from polycross nurseries are bulked. This family structure has been used by the sweetpotato community for decades, in part because of the high levels of cross-incompatibility in sweetpotato, the difficulty of identifying superior individuals in early breeding generations, and the challenge of generating enough seed through targeted crosses.

Genomics-assisted breeding will not remove all these existing hurdles, but it will considerably reduce the need for making multiple ‘blind’ crosses among many parents with unknown genetic backgrounds. It will also eliminate the need to evaluate thousands of early-generation clones in multiple environments before realizing their potential as either parents or new varieties. When used hand in hand with advanced statistical methodologies and analytics, GS in sweetpotato would allow individuals predicted to have high breeding values for traits of importance to be evaluated in multiple environments at earlier generations. This will greatly accelerate breeding progress and, in the long term, lower the cost of releasing new varieties to replace old ones in farmers’ fields, thereby increasing genetic gain for traits of interest.
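To make the idea of predicting breeding values from markers concrete, the following sketch fits marker effects by ridge regression on a training population and then predicts genomic estimated breeding values for unphenotyped candidates. It is a simplified illustration under stated assumptions, not the method used in the studies cited in this chapter: the data are simulated, allele dosages are coded from 0 to 6 as expected for a hexaploid, the shrinkage parameter lam is fixed arbitrarily, and all function and variable names are hypothetical.

```python
import numpy as np

def ridge_marker_effects(X, y, lam=1.0):
    """Estimate marker effects by ridge regression on centered data."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - mean(y)) for the effects.
    A = Xc.T @ Xc + lam * np.eye(X.shape[1])
    beta = np.linalg.solve(A, Xc.T @ (y - y_mean))
    return beta, x_mean, y_mean

def predict_breeding_values(X_new, beta, x_mean, y_mean):
    """Predict breeding values for candidates from their marker dosages."""
    return y_mean + (np.asarray(X_new, dtype=float) - x_mean) @ beta

# Simulated data (an assumption, for illustration only).
rng = np.random.default_rng(0)
n_train, n_cand, n_markers = 200, 50, 100
# Hexaploid allele dosage: 0 to 6 copies of the alternative allele.
X = rng.integers(0, 7, size=(n_train + n_cand, n_markers))
true_effects = rng.normal(0.0, 0.5, n_markers)
y = X @ true_effects + rng.normal(0.0, 2.0, n_train + n_cand)

# Train on phenotyped individuals, predict the unphenotyped candidates.
beta, x_mean, y_mean = ridge_marker_effects(X[:n_train], y[:n_train], lam=10.0)
gebv = predict_breeding_values(X[n_train:], beta, x_mean, y_mean)

# Prediction accuracy: correlation with the simulated true genetic values.
accuracy = np.corrcoef(gebv, X[n_train:] @ true_effects)[0, 1]
```

In practice the shrinkage parameter would be estimated from the data, for example by cross-validation or from variance components, and the model would need to account for the family structure and genotype-by-environment effects discussed above.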