Introduction

Differences of sex development (DSD) is a group of congenital conditions that are defined by a discordance of the chromosomal, gonadal or phenotypic features of the internal and/or external genitalia [1]. DSD is a significant health concern among mammals, including pigs, goats, dogs, and even humans [2,3,4,5]. Based on the sex chromosome constitution, DSD can be classified into three categories: sex chromosome DSD, XX DSD, and XY DSD. In pigs, XX DSD (SRY-negative) is the most common type, which were also called intersexes or sex reversal in the past. Typical features of this condition include 38, XX karyotype, the intersex phenotype of genitalia, and the testis-like gonads or ovotestis [4]. XX DSD pigs are infertile, more susceptible to urogenital infections, and have reduced growth and survival rates. In addition, they exhibit a strong boar taint which diminishes carcass quality and may display aggressive behavior, ultimately compromising the economic returns of pig breeders [6]. The heritability estimates for XX DSD pigs are high, ranging from 0.72 to 0.81 [7, 8]. The incidence of this condition in various pig breeds ranges from 0.08 to 0.75%, but in certain inbred populations, it can be as high as 20%. XX DSD pigs seem to have a strain-dependent predisposition because, in most reported cases, the affected individuals belong to the Landrace pig or Large White pig or their hybrid [7, 9].

The molecular background of XX DSD (SRY-negative) has been extensively studied in humans. Multiple genetic variants have been linked to 46, XX testicular or ovotesticular DSD, including RSPO1 [10, 11], WNT4 [12], NR5A1 [13, 14], NRF2 [15, 16], WT1 [17, 18], SOX3 [19], SOX9 [20, 21], and SOX10 [22]. Despite progress in understanding the etiology of 46, XX testicular or ovotesticular DSD, the etiology of some cases is still unknown [23]. However, the causative variants for XX DSD have only been identified in goats among domestic mammals. In goats, the causative variant for the polled intersex syndrome is a complex structural variant resulting from a 10.1 kb fragment deletion and a duplicated 480 kb segment insertion, which affects the expression of FOXL2 [7, 24]. The molecular background of XX DSD pigs remains incompletely understood. It had been suggested that XX DSD pigs are caused by an unknown autosomal recessive mutation [25, 26]. The SOX9 was ever considered the primary candidate gene for investigating the molecular background of 38, XX DSD (SRY-negative). Rousseau et al. [8] conducted a genome-wide association analysis, and nine SNPs were identified in XX DSD pigs, with the rs81341093 (M1GA0024789) haplotype being the most significant. The authors also detected polymorphic loci in SOX9 (10 SNPs and four indels) and the TESCO enhancer (14 SNPs and one indel), but none of these were co-segregated with the XX DSD phenotype. Brenig et al. [27] discovered an 18 bp deletion (Δ18, ss2137497745) in the 5’ UTR region of the SOX9 gene and suggested that this deletion decreased SOX9 transcriptional and translational activities. However, subsequent research by Stachowiak et al. [28] failed to find a correlation between Δ18 and the XX DSD phenotype. Instead, these researchers revealed a copy number variation (CNV) region downstream of SOX9 (~ 500 kb) in four XX DSD pigs using fluorescence in situ hybridization (FISH), which was also observed in another two XX DSD pigs [29], this means it is likely a molecular marker for the XX DSD pig, but further validation from a larger cohort is needed.

Therefore, it is imperative to identify the causative gene responsible for XX DSD pigs and subsequently develop precise molecular diagnostic techniques based on this gene to identify and manage carriers. This will optimize the germplasm resources of breeding pigs. Furthermore, studying XX DSD pigs could provide crucial insights into the etiology of human DSD and enhance our comprehension of mammalian sex differentiation. Previous studies suggested that whole-genome sequencing and de novo assembly can detect more SNPs and SVs while improving the accuracy of variant detection [30, 31]. In this study, we performed whole-genome de novo sequencing on normal female pigs and XX DSD pig using 10× Genomics sequencing technology. Candidate variants obtained from genomic variant analysis were validated in a larger group of XX DSD pigs. These variants are expected to be molecular breeding markers for XX DSD pigs and provided a theoretical basis for elucidating the molecular background of XX DSD pigs.

Materials and methods

Animals

After euthanasia, the reproductive organs of XX DSD, normal female, and normal male pigs were dissected and compared. We collected partial gonadal tissues from 3 XX DSD pigs, 3 normal female pigs, and 3 normal male pigs. These tissues were rapidly frozen in liquid nitrogen and stored at -80 °C for further experiments. Additionally, some gonadal tissues were fixed in 10% neutral buffered formaldehyde, embedded in paraffin, and sectioned at a thickness of 5 μm for histological analysis. The sections were stained with hematoxylin and eosin (HE). Primers specific for internal reference gene GAPDH were designed using Primer 5 (Table S1). The previously published study provided us with SRY-specific primers [32]. The duplex PCR reaction was performed using the Multiplex PCR Kit (Vazyme Biotech, Nanjing, China, PM101) according to the manufacturer’s instructions. After PCR detection (lack of SRY gene), we selected one normal female pig (NF) and two XX DSD pigs (D1 and D2) at one month of age for whole-genome sequencing using genomic DNA.

We randomly collected 32 normal female pigs and 32 XX DSD pigs (Fig. S1) for the validation experiments of candidate genomic variant loci and immediately froze their ear tissues in liquid nitrogen. Subsequently, we stored the samples in a -80 °C freezer until DNA isolation.

10× Genomics library construction and sequencing

Using 10× Genomics sequencing technology, we perform WGS and de novo assembly on three Yorkshire pigs (NF, D1 and D2). To perform 10× Genomics sequencing, high-molecular-weight (HMW) genomic DNA was extracted from blood, indexed and barcoded according to the Genome Reagent Kit Protocol (10× Genomics). Briefly, in the 10× Genomics Chromium microfluidic Genome Chip, approximately 1 ng of HMW DNA was combined with master mix and partitioning oil to create Gel Bead-in-Emulsions (GEMs). During the GEM incubation, Read 1 sequence, 10× Genomic barcode, and random primer sequence were added. P5 and P7 primers, Read 2, and sample index were added during library construction after end repair, A-tailing, and adaptor ligation and amplification. Subsequently, the library was sequenced using 150PE mode on the Illumina HiSeq 2500 platform.

Genome assembly and evaluation

Raw reads were first processed using Trimmomatic version 0.38 for quality control to obtain clean reads [33]. Clean reads were de novo assembled with SOAP denovo (version v2.04) software [34], followed by gap filling. The assembly procedures comprised the following steps: splitting up the reads into k-mers, constructing a de Bruijn graph, producing contig sequences, realigning reads to contig sequences to construct scaffolds, and using GapCloser (version 1.12) to fill gaps within these scaffolds. The integrity of the assembled genome was evaluated using BUSCO version 5.2.2 based on the single-copy homologous gene set in OrthoDB (http://cegg.unige.ch/orthodb).

SNPs mining and validation

To identify SNPs, we aligned the assembled sequence to the reference genome (Sus scrofa 11.1) with LASTZ software (http://www.bx.psu.edu/~rsharris/lastz/). To exclude assembly errors, validation of SNPs was assisted by aligning reads to the reference and assembled genomes, respectively. Variant annotation was performed with Annovar (http://www.openbioinformatics.org/annovar/). SNPs were screened according to the autosomal recessive inheritance pattern with the following criteria: (A) Screening for homozygous nonsynonymous mutations only present in XX DSD pigs; (B) Screening for homozygous nonsynonymous mutations only present in normal female pig; (C) Screening for nonsynonymous mutations that are homozygous in XX DSD pigs and heterozygous in normal female pig; (D) Screening for heterozygous nonsynonymous mutations only present in normal female pig. The genes harboring the screened SNPs were subjected to Gene Ontology (GO) functional annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis using the clusterProfiler R package. We screened candidate SNPs based on gene information from public databases and from the literature.

For validation, we designed specific primers against the candidate SNPs and performed PCR amplification (Table S2). The PCR amplification products were then sent to Sangon Biotech (Shanghai, China) for Sanger sequencing. For gene expression analysis, total RNA was extracted from gonadal tissue using the FastPure Cell/Tissue Total RNA Isolation Kit V2 (RC112-01, Vazyme, Nanjing, China). RNA was isolated from a total of nine animals, divided into three groups: normal females (n = 3), XX DSDs (n = 3), and normal males (n = 3). cDNA was generated using the HiScript III All-in-one RT SuperMix Perfect for qPCR (R333-01, Vazyme, Nanjing, China). For quantitative real-time PCR (qRT-PCR), cDNA, primers, and the ChamQ Universal SYBR qPCR Master Mix (Q711, Vazyme, Nanjing, China) were mixed following the manufacturer’s instructions. Each qRT-PCR reaction was performed in triplicate for each individual. The relative expression levels of target genes were calculated using the delta-delta Ct method in Microsoft Excel. Primer sequences are provided in Table S3.

SVs mining and validation

SVs were identified using LASTZ software, following the same alignment method used for SNPs. SVs were screened using the following two steps: (A) Specific SVs unique to D2 when compared to NF were screened; (B) The SVs screened in the first step were further compared with those of D1, and only SVs with the same variant gene positions and types were selected. Then, we performed GO and KEGG enrichment analysis of the genes harboring SVs to screen for candidate SVs. The candidate SVs were validated by CNVplex assays according to the manufacturer’s instructions (GENESKY, China) [35]. The Actin beta gene (ACTB), Collagen type X alpha 1 chain gene (COL10A1), and glucagon gene (GCG) were served as internal reference controls for CNVplex assays. All specific primers of each candidate SVs were listed in Table S4. Briefly, the ligation reaction was performed in a 20 µl volume for each of the 32 control females and 32 XX DSD pigs, containing 10× ligase buffer (2 µl), ligase (0.5 µl), probe mix (1 µl), ddH2O (7.5 µl), and 120 ~ 200 ng genomic DNA. The program of ligation reaction was as follows: 4 cycles of 94 ºC for 1 min and 60 °C for 4 h; 94 ºC for 2 min; hold at 72 °C until 20 µl of 2 × Stop Buffer was added in. Then the ligation products were subjected to a multiplex fluorescence PCR amplification. The PCR reaction was performed in 20 µl for each sample, containing 2× PCR Master Mix (10 µl), probe mix (1 µl), ligation product (1 µl), and ddH2O (8 µl). The PCR program was as follows: 95 °C for 2 min; 30 cycles of 94 °C for 20 s, 57 °C for 40 s and 72 °C for 1.5 min; 60 °C for 60 min; 4 °C forever. PCR products were diluted 15-fold before being loaded on the ABI3730XL sequencer. Raw data were analyzed by GeneMapper 5.0 (Applied Biosystems, USA). The relative copy number of candidate SVs was determined by the ratio of the copy number of the target segment to the average copy number of the three reference genes as previously described [35].

In order to confirm the 70 bp deletion of the WWOX gene, we selected a pair of specific PCR primers (Forward: ATCCTCGCAGGACACAGGAG, Reverse: TGTGTAGCGGCCTCCAGAAG) to cover the 70 bp deletion for Sanger sequencing. The annealing temperature was 60 ℃ and the PCR products were separated into wild-type (298 bp) and deletion (228 bp) bands on a 2% agarose gel. The gonad tissue sections underwent deparaffinization, rehydration, and sequential heat-mediated antigen retrieval using EDTA buffer (pH = 9.0). Subsequently, the sections were blocked with 3% bovine serum albumin (BSA) in TBST for 30 min at room temperature. After blocking, primary antibodies were incubated overnight at 4 °C. The primary antibody used was rabbit anti-WWOX (15299-1-AP; Proteintech, Wuhan, China, 1:300). The next day, the sections were washed and incubated with the corresponding secondary antibodies for 50 min at room temperature. The secondary antibody utilized was goat anti-rabbit IgG labeled with Alexa Fluor 488 (GB25303; Servicebio, Wuhan, China, 1:400). Following incubation with the secondary antibodies, DAPI solution was applied to stain the nuclei. Finally, immunofluorescence images were captured using Fluorescent Microscopy (NIKON ECLIPSE C1, NIKON).

Results

Characteristics of XX DSD pigs and de novo genome assembly

XX DSD pigs display both boar and sow reproductive organ characteristics. Visual examination of XX DSD pigs revealed two scrotal masses resembling testes and an enlarged, malformed vulva (Fig. 1A). Anatomical examination of XX DSD pigs revealed the presence of Miillerian and Wolffian ducts derivatives in their internal genitalia. They had two oval-shaped gonads resembling testes and curly uterine horn (Fig. 1B). HE staining revealed the presence of primary oocytes in normal ovaries and seminiferous tubules in normal testes. Additionally, seminiferous tubules were observed in the testis-like gonads of XX DSD pigs. However, these XX DSD pigs had a lack of germ cells and a significant presence of empty vacuoles in the ducts (Fig. 1C). Molecular biology tests confirmed the absence of SRY gene in XX DSD pigs (Fig. 1D).

Fig. 1
figure 1

Characteristics of 1-month-old XX DSD pigs. (A) External genitalia characteristics of XX DSD pig. (B) Internal genitalia characteristics of XX DSD pig. Red arrow stands for ovary; yellow arrows stand for uterine horns; blue arrows stand for the vulva; purple arrows stand for testes; green arrows stand for the penis. (C) Histological analysis of the gonads in XX DSD pigs. HE staining was used to analyse the histological characteristics of the ovaries of normal female pigs, the testes of normal male pigs and the testis-like gonads of XX DSD pigs. Black stars indicate primary follicles; black triangles indicate the lumen of the seminiferous tubules. (D) Duplex PCR for SRY gene detection using GAPDH as an internal reference gene; M stands for the marker, RM stands for reference male, RF stands for reference female, NF stands for normal female pig, and D1 and D2 stand for XX DSD pigs

To identify genomic variants in XX DSD pigs, we constructed de novo genome assemblies using 10× Genomics linked-read technology and evaluated the assembly results. Table 1 shows the statistics of the de novo genome assemblies of each of the three individuals. The clean sequencing data for each individual exceeded 160 Gb, with an estimated average sequencing depth of > 60× based on pig genome sizes. The clean data of normal female (NF) was assembled into a genome of 2.46 Gb with a contig N50 length of 92.3 kb and scaffold N50 length of 5.46 Mb. The XX DSD pigs (D1 and D2) genomes were assembled with sizes of 2.47 Gb and 2.46 Gb, respectively. The contig N50 lengths for D1 and D2 were 82.2 kb bp and 93.3 kb bp, while their scaffold N50 lengths were 7.83 Mb and 7.30 Mb, respectively. Next, we evaluated assembly quality using BUSCO, with each sample exhibiting > 90% complete single-copy BUSCOs (Table 1). These results indicate high accuracy and completeness of the genome assembly. We used these genomes for the subsequent analysis of genomic variant.

Table 1 Summary of genome assembly results and evaluations

SNPs analysis

The genomes of NF, D1, and D2 were aligned with the reference genome to identify 11,549,414, 11,331,915, and 1,203,772 SNPs, respectively. Chromosomal distribution statistics showed similar patterns of SNPs among the three individuals (Fig. 2A). Analysis of SNP mutation types revealed that more than 95% of SNPs were located in intronic and intergenic regions (Fig. 2B). Among the SNPs located in the exonic regions, the majority were synonymous mutations (56.53%), followed by nonsynonymous mutations (42.76%) (Fig. 2C). According to the literature, we screened SNPs for nonsynonymous mutations using a recessive inheritance pattern, and a total of 5140 SNPs were screened (Fig. 3). It is worth mentioning that none of these SNPs were found to be located in the SOX9 gene. Next, the genes harboring these SNPs were analyzed using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). These genes were enriched for the Steroid hormone biosynthesis signaling pathway and GO terms related to enzyme activity (Fig. S2). Candidate SNPs were carefully screened using GO and KEGG analyses, gene information from public databases, and gene functions and diseases associated with sexual developmental processes in the literature. After that, a total of 27 SNPs derived from 11 genes were ultimately identified as candidate loci, and their details are presented in Table 2.

Fig. 2
figure 2

Distribution characteristics of SNPs on chromosomes in three Yorkshire pigs (NF, D1, and D2). (A) The number of SNP distribution on chromosomes; (B) Distribution of SNPs on chromosomes. (C) Number of SNPs with different mutation types in the exon region

Fig. 3
figure 3

SNPs screen according to recessive inheritance pattern. (A) Screening for homozygous nonsynonymous mutations only present in XX-DSD pigs; NF_all stands for all non-synonymous SNPs in normal female pigs; D1_hom and D2_hom stand for homozygous SNPs with non-synonymous mutations in D1 and D2, respectively (same as below). (B) Screening for homozygous nonsynonymous mutations only present in normal females. NF_hom stands for homozygous SNPs with non-synonymous mutations in normal female pigs; D1_all and D2_all stand for all non-synonymous SNPs in D1 and D2, respectively (the same as below). (C) Screening for nonsynonymous mutations that are homozygous in XX-DSD pigs and heterozygous in normal female. NF_het stands for heterozygous SNPs with non-synonymous mutations in normal female pigs (the same as below). (D) Screening for heterozygous nonsynonymous mutations only present in normal female pigs

Table 2 Information on the 27 candidate SNPs used for the XX DSD pig cohort validation

The candidate SNPs were validated on larger cohorts (32 normal female pigs and 32 XX DSD pigs) by Sanger sequencing to further identify SNPs with genotype frequencies that differed between normal female pigs and XX DSD pigs. Validation results revealed the presence of heterozygous mutations in the XX DSD pig cohort despite our screening for SNP loci in a recessive inheritance pattern (Table S5). It is noteworthy that the SNP (c.218T > C) in the IFITM1 and the SNP (c.1043 C > G) in the NOBOX were not evenly distributed between the affected and control animals (Table 3). The TT and TC genotypes of the SNP in the IFITM1 and the CC and CG genotypes of the SNP in the NOBOX were exclusively found in XX DSD pigs. Here, the SNP of IFITM1 gene changed the polar amino acid threonine (Thr) to the nonpolar amino acid isoleucine (Ile). In addition, the hydrophobic alanine (Ala) changed to hydrophilic glycine (Gly) due to the SNP of NOBOX (Fig. 4AB). To further understand the expression levels of IFITM1 and NOBOX genes in gonadal tissues, we used qRT-PCR to determine their relative expression levels. A significant difference existed in the expression levels of these two genes between affected and control gonads. The expression of IFITM1 mRNA in the gonads of XX DSD pigs was significantly higher than that in normal female and normal male pigs (Fig. 4C). Additionally, NOBOX mRNA expression in the gonads of XX DSD pigs was significantly lower compared to normal female pigs. Notably, NOBOX mRNA was scarcely expressed in the gonads of XX DSD pigs and normal male pigs. For the other genes suggested by WGS in XX DSD, no significant associations were found in the larger cohort validation because of the similar distribution of SNP variants in normal female pigs and XX DSD pigs (Table S5).

Table 3 Genotype and SNP variant distribution of IFITM1 and NOBOX obtained by Sanger sequencing
Fig. 4
figure 4

Comparison of SNPs and relative mRNA expression of candidate genes. (A) The SNP (c.218T > C) in the IFITM1. (B) The SNP (c.1043 C > G) in the NOBOX. (C) The relative expression of IFITM1 and NOBOX mRNA was detected by qRT-PCR (n = 3 for each group; ** P < 0.01, ns P > 0.05)

SV analysis

The assembled genomes of NF, D1, and D2 identified 10,756, 11,364, and 10,223 SVs, respectively. Statistical analysis indicated that the distribution pattern of SVs on chromosomes was similar to that of SNPs, with deletions and duplications being the most common SV types and inversions the least common (Fig. 5). NF served as the control and was compared with D1 and D2 to identify SVs with the same location and variant type. GO and KEGG analyses were performed on the 1,474 genes where these SVs were located (Fig. S3). The GO analysis showed that genes harboring SVs were mainly enriched in signaling and signaling receptor activities. The KEGG analysis results showed that these genes were enriched in steroid hormone biosynthesis, cAMP signaling pathway, WNT signaling pathway, and MAPK signaling pathway. Similar to the screening method for candidate SNPs, 14 candidate SVs were finally screened for the further study (Table 4).

Fig. 5
figure 5

Characteristics analysis of SVs in three Yorkshire pigs (NF, D1, and D2). (A) Distributions of SVs on chromosomes. (B) The proportion of different SVs

Table 4 Information on the 14 candidate SVs used for the XX DSD pig cohort validation

We used CNVplex assay to validate candidate SVs in a cohort containing 32 normal female pigs and 32 XX DSD pigs. The validation results demonstrated that six SVs displayed altered copy numbers in the XX DSD pig cohort (Fig. 6A). However, the copy numbers of these six SVs remained unchanged in the vast majority of normal female pigs (Table S6). Specifically, 62.5% of XX DSD pigs had a 70 bp deletion in intron 5 of the WWOX gene, either with 0 or 1 copy of the deletion, while 18.75% of XX DSD pigs exhibited CNVs in the downstream region of the SOX9 gene (~ 70 kb). Additionally, 21.88% of XX DSD pigs showed copy number increase or decrease on the SV of chromosome 3 (231801–479100 bp), covering the genes PDGFA, PRKAR1B, and DNAAF5. Interestingly, in contrast to the WGS results, 31.25% of XX DSD pigs showed an increase in the copy number of the pituitary homeobox paired homeodomain transcription factor 1 gene (PITX1). Among the XX DSD pig cohort, 22 individuals had only one SV, while another 10 individuals had two or more SVs. (Fig. 6B). Specifically, 14 of these individuals presented just the WWOX deletion, while four individuals exclusively had the PITX1 gene variant. Moreover, two individuals harbored solely the CNV of Src homology 2 domain containing transforming protein l (SHC1), and another two individuals possessed only the SV of chromosome 3 (located at 231801–479100 bp). However, XX DSD pigs with WNT6 or the upstream region of the SOX9 gene variants usually carried additional variants.

Fig. 6
figure 6

Validate candidate SVs in the XX DSD pig cohort. (A) The six SVs displayed altered copy numbers in the XX DSD pig cohort. (B) A total of 22 individuals with only one SV were detected in the XX DSD pig cohort. SOX9 stands for CNV in the region upstream of the SOX9 gene

Fig. 7
figure 7

PCR and Sanger sequencing validated the 70 bp deletion of the WWOX gene. (A) Gel electrophoresis of PCR products derived from normal female pig exclusively displayed wild-type bands (referred to as band A, 298 bp). Conversely, in XX DSD pigs, discernible deletion bands surfaced, encompassing homozygous (referred to as band B, 228 bp) as well as heterozygous (228 bp and 298 bp) deletions. (B) The sanger sequencing results showed a 70 bp deletion between the wild type and homozygous type. The sequencing results of the heterozygous bands were manually corrected to show two strands with a 70 bp difference. Black boxes indicate the 70 bp deletion. WT indicates wild-type, Het indicates heterozygous, and Hom indicates homozygous

We futher employed PCR and Sanger sequencing techniques to validate the 70 bp deletion of the WWOX gene. The PCR product from normal female pigs showed a single band (band A, wild type). However, in XX DSD pigs, either one band (band B, homozygous deletion) or two bands (band A and band B, heterozygous deletion) were discerned (Fig. 7A). The outcomes obtained through Sanger sequencing exhibited a disparity of 70 base pairs between the sizes of band A and band B (Fig. 7B). This discrepant size firmly substantiates the presence of the WWOX gene deletion in XX DSD pigs. We proceeded to examine the expression of WWOX in gonadal tissues using the immunofluorescence staining method. The results revealed high expression of WWOX in the cytoplasm of ovarian follicular cells in normal female pigs. However, in the gonadal tissues of XX DSD pigs and normal male pigs, the expression of WWOX was almost imperceptible (Fig. 8). This finding implies that the WWOX gene plays a pivotal role in ovarian development, and the low expression of WWOX in XX DSD pigs is likely related to a 70 bp deletion in intron 5 of the WWOX gene.

Fig. 8
figure 8

Comparison of WWOX immunofluorescence staining in gonads of XX DSD, normal female and normal male pigs. Bar = 50 μm

Discussion

Sex development relies on the sex-specific action of gene networks to differentiate the bipotential gonads of the growing fetus into testis or ovaries. DSD arises from congenital alterations during any of these processes; thus, elucidating the molecular background of DSD is a very challenging task [3]. Whole-genome sequencing is a comprehensive method for detecting disease-related genetic variants at the whole-genome level, including SNPs and SVs. It has been recommended for diagnostic studies of human DSD [36]. The second-generation sequencing lacks sensitivity and accuracy for detecting SV due to its short read length. The linked read sequencing library preparation platform by 10× Genomics produces barcoded sequencing libraries, which are subsequently sequenced using the Illumina short read sequencing technology. Large DNA fragments are partitioned into independent micro-reactions, where the same index sequence is added into each sequencing fragment derived from a specific long fragment. De novo assembly of the sequencing data obtained by this method results in highly accurate genomic sequences with longer read lengths, improving accuracy and performance in detecting SVs [37, 38]. In our study, we used 10×Genomics technology to sequence the genomes of one normal female pig and two XX DSD pigs, obtaining a total of 517 Gb of clean data with an average sequencing depth greater than 60×. After de novo assembly, the genome size of all three samples exceeded 2.46 Gb, with a complete single-copy BUSCO score of more than 90%. Additionally, Jiang et al. [39] reported that the resequencing of Yorkshire pigs achieved over 99% genome coverage when the sequencing depth was 10×. These results indicate that the assembled genomes have high coverage and assembly quality, sufficient for subsequent genomic variant analysis.

A recessive inheritance pattern was applied to examine SNPs for nonsynonymous mutations, followed by screening candidate SNPs for validation in a cohort of XX DSD pigs. Our findings demonstrated that two SNPs, namely IFITM1 (c.218T > C) and NOBOX (c.1043 C > G), were associated with XX DSD, yet both SNPs exhibited heterozygous mutations in the XX DSD pig cohort. Thus, the inheritance pattern of XX DSD pigs warrants further investigation through family populations. Additionally, these two SNPs only appear in XX DSD pigs and are promising molecular markers for the phenotype of XX DSD pigs.

The Interferon-induced transmembrane protein family (IFITMs) have been found to play a significant role in inhibiting cell proliferation, promoting homotypic cell adhesion, and mediating germ cell development [40]. Mouse IFITM1, which is expressed in primordial germ cells (PGCs), is implicated to have roles in the migration and development of PGCs [41]. The IFITM1 gene represents a downstream target of the WNT/β-catenin signaling pathway, and interference with IFITM1 can cause embryonic deformities [42]. WNT/β-catenin signals are necessary for normal ovarian development from the embryonic XX gonad [3]. HE staining in this study showed a lack of germ cells in the hypoplastic testes of XX DSD pigs. Previous studies have also shown that the hypoplastic testicular cords of XX DSD pigs contain germ cells at fetal stages; however, these germ cells all die within a few weeks after birth [7]. The aforementioned postnatal germ cell death phenomenon has also been seen in female mice with the NOBOX gene knockout [43]. Furthermore, several studies have identified NOBOX gene mutations as a significant cause of primary ovarian insufficiency [44, 45]. Qin et al. [46] discovered that the homozygous p.R355H mutation in the NOBOX gene disrupts NBE binding and affects NOBOX expression. NOBOX gene mutations inhibit the expression of downstream genes such as Octamer binding protein 4 (OCT4) and Growth differentiation factor 9 (GDF9), which are linked to germ cells’ development [47, 48]. Li et al. [49] discovered a homozygous truncating mutation in the NOBOX gene using exon sequencing, which caused G2/M phase arrest in 293FT cells. The results of this study showed that IFITM1 mRNA was significantly highly expressed in the gonads of XX DSD pigs, whereas NOBOX mRNA was hardly expressed. Therefore, we hypothesise that SNPs in IFITM1 (c.218T > C) and NOBOX (c.1043 C > G) may be involved in the gradual death of germ cells in XX DSD pigs by influencing gene expression. It is noteworthy that sex determination occurs during embryonic development, involving the bipotential gonad dependent on gene expression and suppression to differentiate into either testes or ovaries. Therefore, further research is needed to investigate the expression patterns of potential pathogenic genes during embryonic sex determination, in order to elucidate the relationship between gene variants and the phenotype of XX DSD pigs.

A CNV region was identified in the SHC1 gene of XX DSD pigs by WGS analysis. As one of the transcription factors of the PI3K/AKT signaling pathway, SHC1 is involved in oocyte maturation [50] and insulin/IGF-1-induced Ras-dependent oocyte maturation [51]. SVs validation confirmed an increased copy number of SHC1 in only two XX DSD pigs. Further investigation is needed to understand the relationship between this CNV region and XX DSD development. The WNT6 can activate the WNT/β-catenin signaling pathway and is involved in the growth and development of various organs and tissues, including embryonic cartilage [52], uterine [53], and spermatogenesis [54, 55]. Whole-genome sequencing detected a deletion region in WNT6. SVs validation showed that only two XX DSD individuals had the deletion of WNT6, but they also carried other SVs, including either a deletion of intron 5 of the WWOX gene or a deletion on chromosome 3 (spanning 231801–479100 bp). Hence, it is uncertain whether the deletion region of WNT6 is involved in porcine DSD development. In XX DSD pigs, a deletion involving PDGFA, PRKAR1B, and DNAAF5 on chromosome 3 spans 247,300 bp. Validation of SVs showed that 21.88% of individuals had SV polymorphisms in this region. However, the effect of this deletion remains unclear.

SOX9 is one of the most significant genes in the XX DSD in mammals. Activated downstream of SRY, SOX9 is a key player in the testiculogenesis pathway. In humans, the sex determining region (RevSex) located upstream of the SOX9 gene has two sex reversal enhancers, Sex reversal enhancer-A and Sex reversal enhancer-B. Studies indicate that mutations in these two enhancers can lead to DSD [20]. Several studies have investigated the role of SOX9 in pigs, to explore the molecular basis of XX DSD [8, 27, 28]. WGS analysis of XX DSD pigs identified a CNV region (8562001 ~ 8572600 bp) located downstream of the SOX9 gene (~ 70 kb). However, validation of the SVs revealed that only 18.75% of the XX DSD pigs have this CNV, and this CNV appears in different copy numbers in the cohort. It should be pointed out that our data did not find the CNV marker region of XX DSD pigs mentioned in previous studies, which is located downstream of the SOX9 gene (~ 500 kb) [28]. The CNV located upstream of the SOX9 gene has been considered to cause some XX DSD cases in humans [20] and dogs [56]. Parma et al. [57] suggested that various mutations in the SOX9 gene might lead to XX DSD in pigs. However, given that XX DSD pigs harboring the downstream CNV (8562001 ~ 8572600 bp) of SOX9 also harboring other SVs, this CNV region alone is still insufficient to explain the pathogenesis of XX DSD pigs and needs to be validated further.

In pituitary gonadotropin cells, PITX1 regulates the synthesis of reproductive hormones and the expression of related genes, including LHβ, FSHβ, and GNRHR [58]. WGS sequencing results revealed a 8700 bp deletion of PITX1. However, the validation of SVs showed the copy number of this SV had varying degrees of increase, suggesting that PITX1 variant is related to individual heterogeneity. Furthermore, PITX1 activates the SOX9 promoter through a unique binding motif that induces astrocyte differentiation [59]. The synergy between PITX1 and SOX9 can also regulate skeletal development [60]. However, the regulatory role of PITX1 on SOX9 during sex development requires further investigation.

The WWOX gene encodes a WW-domain-containing oxidoreductase that acts as a transcriptional regulator and plays an essential role in various biological processes, including tumor suppression, cell proliferation, apoptosis induction, steroid metabolism, and central nervous system development [61]. WWOX expression is particularly high in the pituitary and gonads, where it is involved in the synthesis of gonadotropins and steroidal sex hormones [62, 63]. Studies on WWOX knockout mice have shown that WWOX abnormalities lead to gonadal dysfunction [64, 65], indicating its fundamental role in gonadal development. Mahmud et al. [66]also suggested a potential involvement of WWOX in testicular development and spermatogenesis through interaction with Golgi proteins in the Golgi apparatus of testicular cells. The association of WWOX with XY/XX DSD has been widely researched in humans and dogs. In humans, a heterozygous deletion spanning WWOX exons 6 to 8 was found to be associated with 46, XY DSD [67]. Additionally, another study documented the presence of SNP or CNV in WWOX exons among 46 XY DSD individuals, with SNP being identified in 46 XX DSD individual [68]. In dogs, copy number variants in exon 4 of WWOX were identified in XY DSD [2]. In our study, we found a heterozygous or homozygous deletion of 70 bp in intron 5 of WWOX gene in 62.5% of XX DSD pigs. Moreover, WWOX was highly expressed in the ovaries of normal female pigs and almost absent in the gonads of XX DSD pigs. This suggests that the deletion may be involved in abnormal gonadal development of XX DSD pigs. It has been previously speculated that XX DSD pigs may result from polygenic variants [6, 7]. Our SVs validation results found that of the 32 XX DSD pigs, 22 individuals contained only one SV and 10 individuals contained two or more SVs, suggesting that XX DSD may result from multiple mutations. It is noteworthy that the deletion of intron 5 of the WWOX gene can be used as one of the molecular markers for XX DSD pigs. We will further investigate the relationship between the 70 bp deletion in intron 5 of WWOX and XX DSD pigs.

Conclusions

In sum, we found two specific SNPs, one for the IFITM1 gene (c.218T > C) and another for the NOBOX gene (c.1043 C > G), found only in XX DSD pigs. It is hypothesized that these SNPs may be linked to germ cell apoptosis in XX DSD pigs. Additionally, we screened 14 candidate SVs associated with sex development, using normal sows as controls. Further validation indicated that 62.5% of the XX DSD pigs carried a 70 bp deletion fragment in intron 5 of the WWOX gene, which is presumed to be associated with abnormal gonadal development in XX DSD pigs. Further investigations will be carried out to delve more deeply into these findings. This study lays the foundation for studying the molecular background of XX DSD pigs and provides important reference material for studying DSD in other mammals.