Introduction

Hyperpigmentation in animals has long piqued the curiosity of evolutionary and developmental biologists, offering valuable insights into melanocyte development and hyperpigmentation-related disease research [1,2,3]. Recently discovered in Lanping County, China, Lanping black-boned sheep (LPB) exhibit striking hyperpigmentation in their skin and internal tissues, resembling the world-renowned silky fowl [4]. They are unique among mammals, displaying extensive pigmentation throughout their bodies, including the eye conjunctiva, muscles, periosteum, kidney, heart, lung, liver, and trachea. This distinctive trait makes them highly prized for their superior meat quality, particularly appealing to oriental preferences. LPB emerged from Lanping normal sheep (LPN), which, like most sheep breeds, lack hyperpigmentation in their skin and internal organs. However, the exact genetic lineage of LPB, its relationship with LPN, and the genetic mechanisms responsible for hyperpigmentation in LPB remain poorly understood.

Melanocytes and the melanin they produce play critical biological roles, including protection against UV radiation, thermal regulation, and camouflage [5]. Dysfunction of melanocytes can lead to various diseases such as albinism, melanoma, microphthalmia, and Waardenburg syndrome [6]. Melanocytes produce two types of melanin: eumelanin and pheomelanin. Melanin synthesis involves a series of complex enzymatic and biochemical reactions. The key enzymes in melanin production are tyrosinase (TYR), tyrosinase-related protein 1 (TYRP1), and tyrosinase-related protein 2 (TYRP2, also known as DCT), with TYR playing a particularly crucial role. Melanin synthesis begins with the oxidation of L-tyrosine and/or L-dihydroxyphenylalanine (L-DOPA) to dopaquinone (DQ), which serves as the substrate for both eumelanin and pheomelanin. Dopaquinone undergoes intramolecular cyclization to form colorless leucodopachrome. Leucodopachrome is highly unstable and is rapidly oxidized to dopachrome by another dopaquinone molecule. Under the action of TYRP2, dopachrome is tautomerized to 5,6-dihydroxyindole-2-carboxylic acid (DHICA); otherwise, it is spontaneously decarboxylated to 5,6-dihydroxyindole (DHI). Subsequently, DHICA is oxidized to 5,6-indolequinone-2-carboxylic acid by TYRP1, and DHI is oxidized by TYR to 5,6-indolequinone; both quinones ultimately contribute to eumelanin formation. In the presence of cysteine or other thiol-containing compounds, dopaquinone reacts with cysteine to form cysteinyldopa, which is further oxidized to pheomelanin. Among these enzymatic reactions, TYR acts as the rate-limiting enzyme in melanin synthesis, as all other reactions can occur under physiological pH conditions [7]. The process of melanin synthesis in melanocytes is regulated by autocrine and paracrine factors.
During this process, the microphthalmia-associated transcription factor (MITF) serves as a central regulator by upregulating the two key enzymes, TYR and TYRP1, to promote melanin production. Studies have shown that paracrine factors secreted by keratinocytes, such as MC1R, EDN, and SCF, are major regulators of melanin synthesis. In the skin, in addition to keratinocytes, fibroblasts and immune cells in the dermis also regulate melanin synthesis in melanocytes through paracrine factors [8]. Research on melanocytes has been continuous over the past few decades. To date, studies have primarily focused on the formation, localization, and melanin synthesis of melanocytes, as well as the key cytokines and signaling pathways involved at different stages. These investigations have progressively elucidated the molecular mechanisms underlying melanocyte development and their biological functions, providing a theoretical basis for related disease research and valuable insights for studies on economically significant traits associated with pigmentation in animals.

To date, more than 688 genes associated with pigmentation have been identified [9] (last update: 8 December 2022). These genes encompass melanocyte development-related genes such as MITF, SOX10, PAX3, WNT, EDN3, EDNRB, KIT, and MC1R, and melanogenic genes including TYR, TYRP1, DCT, and PMEL [6, 10]. In several studies, duplication of the endothelin 3 gene (EDN3) has been identified as the cause of hyperpigmentation in silky fowl [1, 3] and other black-boned chicken breeds, such as the Ayam Cemani in Indonesia [11]. However, a study by Darwish et al. [12] found that EDN3 duplication is not linked to hyperpigmentation in LPB. Notably, pigmentation in LPB and silky fowl followed a similar pattern of progressively darkening with age. While no pigmentation was observed at birth, it became visible at around one year of age and pronounced at approximately two years of age. Interestingly, melanin content in the liver of LPB was higher than in other tissues, including the trachea, lung, heart, kidney, and skin, whereas it was less pronounced and sometimes undetectable in the liver tissue of silky fowl [5, 13]. This points to distinct mechanisms underlying hyperpigmentation in LPB compared to silky fowl. Although several pigmentation-related genes, such as TYR [14], TYRP1 [15], TYRP2 [16], and MC1R [17], have been explored for their potential involvement in LPB hyperpigmentation, no conclusive links have been established.

In this study, we aimed to provide a comprehensive understanding of the phylogeny of LPB and the genetic basis of its hyperpigmentation. We conducted whole-genome sequencing of 100 LPB and 50 LPN and integrated these data with a dataset of 421 individuals representing seven wild and 64 domestic sheep breeds from diverse geographic regions. Additionally, RNA-Seq analysis of liver tissues at two developmental stages was employed to investigate genes associated with melanocyte development and melanogenesis. By combining genomic and transcriptomic analyses, we identified several candidate genes potentially implicated in hyperpigmentation.

Materials and methods

Sample collection

LPB and LPN are both local sheep breeds from Lanping County, Nujiang Prefecture, Yunnan Province, China. They share similar body size and appearance, with no apparent differences. They are distinguished by the black traits of LPB, such as black bones and black meat, which LPN lack. Both populations are raised under consistent conditions and have no genetic relationship.

From the 5 local sheep farms in Lanping County, a total of 100 unrelated LPB (half male and half female) with particularly obvious black-bone traits were selected from the 443 adult LPB, while 50 unrelated LPN (half male and half female) were selected from the 298 Lanping indigenous sheep. Each sheep was verified to meet the breed standards. Blood samples were drawn from the jugular vein and promptly frozen at -20 °C for further analysis.

The LPB is currently the only known mammal with the black trait, which is absent in other sheep breeds. To improve the efficiency of screening functional genes related to the black trait, we additionally included 421 sheep of diverse backgrounds. These data encompassed 37 wild sheep individuals from seven different breeds, including Mouflon, Urial, Argali, Bighorn, Thinhorn, and Barbary. Furthermore, there were 384 domestic sheep samples, comprising 247 individuals from 14 different breeds in Tibet, Northern and Eastern China, and 137 individuals from 50 diverse breeds across South Asia (SA), Europe, Africa, the Middle East (ME), and Oceania. These additional samples comprised 227 publicly available datasets from the NCBI database and 194 datasets shared by Dr. Yu Jiang from Northwest A&F University (Supplementary table S1).

For transcriptome analysis, a total of 23 sheep were slaughtered, including 6 LPB (3 male and 3 female) and 5 LPN (2 male and 3 female) at 2 months of age, and 6 LPB (3 male and 3 female) and 6 LPN (4 male and 2 female) at about 2 years of age. Liver tissue from each individual was sampled for RNA-Seq.

Whole-genome sequencing and SNP calling

Total genomic DNA from 150 blood samples was extracted using the Takara Blood Genome DNA Extraction Kit following the manufacturer’s guidelines. DNA quality and quantity were examined using a NanoDrop device and 1% agarose gel electrophoresis. Paired-end sequencing libraries were generated with an insert size of about 350 bp, according to the manufacturer’s instructions, and sequenced on the Illumina HiSeq X Ten platform to an average raw read sequence coverage of ~10X. After removing the low-quality raw reads (N content ratio > 10%, low-quality base ratio > 50%), high-quality reads were mapped to the sheep reference genome assembly Oar_v4.0 using Burrows-Wheeler Aligner (BWA v0.7.15). Picard tools (v1.119) was used to identify and remove duplicate reads from the BAM file. After mapping, SNPs were called by GATK (v3.7.0) [18] from the BAM file, and the output SNPs were further filtered using vcftools (v4.2) [19]. Raw SNPs were filtered using the following criteria: Quality by Depth (QD) < 2.0, root mean square of Mapping Quality (MQ) < 40.0, Fisher Strand (FS) > 60.0, HaplotypeScore > 13.0, and MQRankSum ≤ 12.5; only biallelic SNPs with a missing rate of no more than 10% were retained. Filtered SNPs were further annotated by ANNOVAR [20] based on the gene annotation of the sheep reference genome Oar_v4.0 and then classified as variations in exonic, intronic, intergenic, upstream, downstream, 5’UTR, 3’UTR, and splicing sites.
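The site-level filtering described above can be illustrated with a minimal Python sketch. It is not the GATK or vcftools implementation; the function names and the dictionary-based record layout are hypothetical. The MQRankSum cutoff is exposed as a parameter and defaults to the commonly documented GATK hard-filter value of -12.5, which is an assumption and differs from the value printed in the text.

```python
from typing import Mapping, Optional, Sequence


def fails_hard_filter(info: Mapping[str, float],
                      mq_rank_sum_min: float = -12.5) -> bool:
    """Return True if any available annotation violates its threshold.

    The QD, MQ, FS and HaplotypeScore thresholds follow the text. The
    MQRankSum default is an assumption (see the lead-in paragraph).
    """
    checks = [
        ("QD", lambda v: v < 2.0),
        ("MQ", lambda v: v < 40.0),
        ("FS", lambda v: v > 60.0),
        ("HaplotypeScore", lambda v: v > 13.0),
        ("MQRankSum", lambda v: v < mq_rank_sum_min),
    ]
    # Annotations absent from a record are skipped, not treated as failures.
    return any(name in info and is_bad(info[name]) for name, is_bad in checks)


def passes_site_filters(info: Mapping[str, float],
                        alt_alleles: Sequence[str],
                        genotypes: Sequence[Optional[str]],
                        max_missing: float = 0.10) -> bool:
    """Keep biallelic sites with at most 10% missing genotypes (None = missing)."""
    if len(alt_alleles) != 1 or not genotypes:
        return False
    missing_rate = sum(g is None for g in genotypes) / len(genotypes)
    return missing_rate <= max_missing and not fails_hard_filter(info)
```

A record with one alternate allele, one missing genotype out of ten, and QD = 10 would be retained under these assumptions, whereas any record with QD below 2.0 would be discarded.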

Population structure analysis

Principal component analysis (PCA) was performed based on autosomal SNPs for 571 individuals using GCTA (v1.92.4 beta) [21]. PLINK (v1.9) [22] was used to prune SNPs in linkage disequilibrium in a sliding window of 100 SNPs with a step of 10 SNPs. After pruning, 302,088 representative SNPs of 571 sheep were extracted for NJ-tree construction and structure analysis. The NJ-tree was constructed based on a matrix of pairwise genetic distances using PHYLIP (v3.698) [23] and visualized using the online software iTOL (https://itol.embl.de/). Admixture (v1.2.3) [24] was used to assess the admixture proportions with the default setting; the number of possible genetic clusters K ranged from 2 to 8.
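The text does not state which distance measure was used to build the pairwise matrix passed to PHYLIP. As an illustration only, the sketch below computes a simple allele-sharing distance from genotypes coded as alternate-allele counts (0, 1, 2, or None for missing). The metric, the coding, and the function names are assumptions rather than the authors' procedure.

```python
from typing import Dict, List, Optional, Sequence

Genotype = Optional[int]  # 0, 1 or 2 alternate alleles; None = missing call


def allele_sharing_distance(a: Sequence[Genotype], b: Sequence[Genotype]) -> float:
    """Mean per-site allele difference (scaled to 0..1) over jointly called sites."""
    diffs = [abs(x - y) / 2 for x, y in zip(a, b)
             if x is not None and y is not None]
    if not diffs:
        raise ValueError("no jointly genotyped sites")
    return sum(diffs) / len(diffs)


def distance_matrix(samples: Dict[str, Sequence[Genotype]]) -> List[List[float]]:
    """Symmetric matrix in the iteration order of `samples`, zero on the diagonal."""
    names = list(samples)
    n = len(names)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = allele_sharing_distance(samples[names[i]], samples[names[j]])
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

A matrix of this shape is the kind of input a neighbor-joining program expects; for hundreds of individuals and hundreds of thousands of SNPs a vectorized implementation would be preferable.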

Introgression analysis

Based on Patterson’s D-statistic (ABBA-BABA test) [25], we used the Python script ABBABABAwindows.py [26] to quantify introgression between the two populations (LPB and LPN). Windows with absolute D-statistic values close to zero (e.g., the lowest 1% of windows) were regarded as the genomic regions least affected by gene flow, whereas those with the highest D-statistic values (e.g., the top 1% of windows) were regarded as the regions most affected by gene flow.
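For readers unfamiliar with the statistic, the following sketch shows how Patterson's D can be computed from derived-allele frequencies in a four-population arrangement (P1, P2, P3, outgroup) and summarized per window. It is a simplified illustration, not a reimplementation of ABBABABAwindows.py; the choice of P3 and outgroup, the window size, and the non-overlapping window scheme are not given in the text and are left as assumptions or parameters here.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Derived-allele frequencies at one site in P1, P2, P3 and the outgroup.
Site = Tuple[float, float, float, float]


def patterson_d(sites: Iterable[Site]) -> Optional[float]:
    """D = (sum ABBA - sum BABA) / (sum ABBA + sum BABA); None if undefined."""
    abba_sum = 0.0
    baba_sum = 0.0
    for p1, p2, p3, p4 in sites:
        abba_sum += (1 - p1) * p2 * p3 * (1 - p4)
        baba_sum += p1 * (1 - p2) * p3 * (1 - p4)
    total = abba_sum + baba_sum
    return (abba_sum - baba_sum) / total if total > 0 else None


def windowed_d(positions: List[int], sites: List[Site],
               window_size: int) -> Dict[int, Optional[float]]:
    """Group sites into non-overlapping windows (index = position // window_size)."""
    buckets: Dict[int, List[Site]] = {}
    for pos, site in zip(positions, sites):
        buckets.setdefault(pos // window_size, []).append(site)
    return {idx: patterson_d(group) for idx, group in sorted(buckets.items())}
```

Positive values indicate an excess of ABBA patterns (allele sharing between P2 and P3), negative values an excess of BABA patterns (sharing between P1 and P3), and values near zero indicate no detectable asymmetry.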

Selective sweep analysis

We calculated the genome-wide distribution of FST and LSBL values between LPB and the non-black-boned domestic sheep population with a 20 kb sliding window using vcftools (v4.2) [19], and the Cross Population Extended Haplotype Homozygosity (XPEHH) values were calculated using Selscan (v2.0.0) [27]. The top 1% of values for FST, LSBL, and XPEHH were considered candidate regions under strong selective sweeps and were visualized in R. All the candidate regions were assigned to corresponding SNPs and genes using in-house scripts. We further performed KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and GO (Gene Ontology) term analysis using the online software g:Profiler [28], and p < 0.05 was used as the threshold for significantly enriched pathways and GO terms [29].
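The window scan and outlier selection can be sketched as follows. The LSBL formula shown is the standard definition from pairwise FST values among three populations, with A as the target population; the 10 kb step is taken from the Results. The helper names are hypothetical, and the in-house scripts used in the study are not reproduced here.

```python
import math
from typing import Dict, Hashable, Iterator, List, Tuple


def sliding_windows(chrom_length: int, size: int = 20_000,
                    step: int = 10_000) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) windows that fit entirely on the chromosome."""
    start = 0
    while start + size <= chrom_length:
        yield start, start + size
        start += step


def lsbl(fst_ab: float, fst_ac: float, fst_bc: float) -> float:
    """Locus-specific branch length for population A: (AB + AC - BC) / 2."""
    return (fst_ab + fst_ac - fst_bc) / 2


def top_fraction(scores: Dict[Hashable, float],
                 fraction: float = 0.01) -> List[Hashable]:
    """Return the window keys with the highest scores (empirical top 1% by default)."""
    if not scores:
        return []
    k = max(1, math.ceil(len(scores) * fraction))
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

For example, with 100 scored windows the default cutoff keeps the single highest-scoring window; genes overlapping the retained windows would then be taken forward as candidates.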

RNA-Seq analysis

RNA-Seq was performed on the liver tissues of 6 LPB and 5 LPN at the pigmentation-not-visible stage, and of 6 LPB and 6 LPN at the pigmentation-evident stage. The total RNA of the 23 liver samples was isolated using Trizol RNA Reagent (Takara), according to the manufacturer’s instructions. The integrity of total RNA was assessed by agarose gel (1%) electrophoresis, prepared in the same manner as for DNA detection. Electrophoresis was conducted at 100 V for 15 min. For concentration and purity analysis, 1 µL of total RNA solution was diluted with 99 µL of sterile water, using sterile water as a blank control. Measurements were performed using a UV-visible spectrophotometer. Then, mRNA libraries were constructed using a NEBNext Ultra Directional RNA Library Prep Kit. The mRNA libraries were then sequenced on an Illumina HiSeq X Ten platform with 100 bp paired-end reads. The raw reads were filtered by removing the low-quality reads and then mapped to the sheep reference genome Oar_v4.0 using TopHat (v2.0.9) [30]. The expression abundance of each assembled transcript was measured as Fragments per Kilobase of exon model per Million mapped reads (FPKM). edgeR (v4.0) was used to test the significance of between-group differences. Genes that exhibited |log2(fold change)| ≥ 1 and adjusted p ≤ 0.05 in the comparisons between LPB and LPN individuals at the two pigmentation stages were considered differentially expressed genes (DEGs). KEGG pathway and GO term enrichment analysis of the DEGs was performed using the online OmicShare tools.
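The two quantitative rules in this paragraph, the FPKM normalization and the DEG thresholds, can be written out as a short sketch. It only restates the arithmetic; it does not reproduce the statistical testing done by edgeR, and the function names are illustrative.

```python
def fpkm(fragments: float, transcript_length_bp: float,
         total_mapped_fragments: float) -> float:
    """Fragments per kilobase of exon model per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling factors.
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_deg(log2_fold_change: float, adjusted_p: float,
           min_abs_log2fc: float = 1.0, max_adjusted_p: float = 0.05) -> bool:
    """Apply the thresholds stated in the text: |log2FC| >= 1 and adjusted p <= 0.05."""
    return abs(log2_fold_change) >= min_abs_log2fc and adjusted_p <= max_adjusted_p
```

For instance, 100 fragments on a 2,000 bp transcript in a library of one million mapped fragments gives an FPKM of 50, and a gene with log2 fold change of -1.0 and adjusted p of 0.05 sits exactly on the inclusive thresholds and is counted as a DEG.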

qRT-PCR

Total RNA was isolated from liver, spleen, and kidney tissues of 6 LPB and 6 Diqing sheep using TRIzol reagent (Thermo Scientific). cDNA was synthesized with the PrimeScript™ RT Reagent Kit with gDNA Eraser (Takara) by Sangon Biotech (Shanghai) Co., Ltd. A uniform cDNA concentration of 1 µg/µL was used during reverse transcription. Quantitative RT-PCR (qRT-PCR) was conducted with TB Green® Premix Ex Taq™ (Takara). The cycling program consisted of pre-denaturation (95 °C, 3 min), denaturation (95 °C, 30 s), annealing (59 °C, 30 s), and extension (72 °C, 30 s), followed by a melt curve stage (ramp rate: +0.5 °C/5 s). Amplification efficiency (E) was in the range of 90%-110%. Raw data were processed with the ∆∆Ct method (Livak & Schmittgen, 2001), where ∆Ct = Ct (gene of interest) – Ct (housekeeping gene) and ∆∆Ct = ∆Ct (treated sample) – ∆Ct (untreated sample). Primer sequences (5’-3’) were as follows: ERBB4, forward CGGAACAGTGTGATGGCAGA and reverse TGTGGGCACTTCTTGACACA; ROR1, forward ACAGAGTTGTCAGTCACTGCT and reverse CTAATCCGCAGTCGAGAGCC; and the reference gene GAPDH, forward GCCCTCAACGACCACTTTGT and reverse TCGGGAGATTCTCAGTGTGG.
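The ∆∆Ct calculation defined above can be made concrete with a small sketch. It assumes an amplification efficiency close to 100% (so that each cycle corresponds to a doubling), which is consistent with, but not identical to, the 90%-110% range reported. Treating one breed as the "sample" and the other as the "control" group is an illustrative assumption; the text describes the groups only as treated and untreated.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Delta Ct = Ct(gene of interest) - Ct(housekeeping gene)."""
    return ct_target - ct_reference


def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change by the 2^(-ddCt) rule, assuming ~100% amplification efficiency."""
    ddct = (delta_ct(ct_target_sample, ct_ref_sample)
            - delta_ct(ct_target_control, ct_ref_control))
    return 2.0 ** (-ddct)
```

With Ct values of 22 (target) and 18 (reference) in the sample and 25 and 18 in the control, ∆Ct is 4 and 7 respectively, ∆∆Ct is -3, and the relative expression is 2^3 = 8-fold.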

Results

Genome sequencing and variations

We conducted genome sequencing for a cohort of 100 LPB sheep and 50 LPN sheep (Supplementary table S1), resulting in a substantial dataset comprising 27.69 billion raw paired-end reads. This dataset yielded an average sequencing depth of 9.30X per individual, with an average genome coverage of 96.47% (Supplementary table S2). To ensure a comprehensive genomic comparison, we augmented our dataset by merging it with a larger collection of whole-genome sequences. This expanded dataset included individuals from seven wild sheep breeds (n = 37 sheep) and 64 domestic sheep breeds (n = 384 sheep), originating from diverse regions, including Tibet, Northern and Eastern China, South Asia, the Middle East, Africa, Europe, and Oceania [31, 32] (Supplementary table S1). The augmented dataset exhibited an average coverage depth of 9.70X per individual.

Following stringent variant filtering criteria, we identified a total of 97,944,357 high-quality single nucleotide polymorphisms (SNPs) across the entire cohort of 571 sheep (Supplementary table S3). These SNPs were categorized into various functional classes, with the majority falling into intergenic regions (65.15%) and intronic regions (32.71%) (Supplementary figure S1). Notably, LPB displayed a lower SNP density (11.81 SNPs per 1000 bp) and nucleotide diversity (π) (2.50 × 10⁻³) in comparison to non-black-boned sheep, which encompassed LPN and 64 domestic sheep breeds (19.88 SNPs per 1000 bp and a nucleotide diversity of 2.94 × 10⁻³) (Supplementary table S4).

Genetic relationship of LPB with LPN and worldwide sheep breeds

We delved into the genetic relationships of LPB with other sheep breeds in a global context. To accomplish this, we employed Principal Component Analysis (PCA) (Fig. 1A), Neighbor-Joining (NJ) phylogeny (Fig. 1B), and ADMIXTURE analysis (Fig. 1C), all of which were based on the extensive dataset of whole-genome SNPs.

The PCA plot and NJ-tree unveiled a distinct separation of Lanping local sheep, comprising both LPB and LPN, from domestic and wild sheep breeds. However, they exhibited a closer genetic affinity with Chinese sheep populations, particularly Tibetan sheep, and those from Northern and Eastern China. This observation suggests that Lanping sheep can be regarded as a sub-population within the broader Chinese sheep genetic landscape. Tibetan sheep occupied an intermediate position between Lanping sheep and Northern and Eastern Chinese sheep, and have strong genetic affinity with the latter two. ADMIXTURE analysis further substantiated this relationship, revealing that Tibetan and Northern and Eastern Chinese sheep exhibited higher levels of shared ancestral components than they did with LPB and LPN when analyzed at K = 4. This aligns with a prior study, which reported a closer genetic connection between Tibetan and Northern and Eastern Chinese sheep in contrast to the relationship between Tibetan sheep and Yunnan-Kweichow sheep populations [31].

Fig. 1

Population genetic analysis of LPB and 9 subgroups of 571 domestic and wild sheep individuals from around the world. A: Principal components 1 and 2 for 571 individual sheep. B: Neighbor-joining phylogenetic tree of 571 individual sheep. C: Population structure of the 571 individuals with K = 2–8. The abbreviations and subgroups of the breeds and individuals are shown in Supplementary table S1

The Principal Component Analysis (PCA) plot did not exhibit a clear demarcation between LPB and LPN populations. However, the Neighbor-Joining (NJ) phylogenetic tree revealed that while LPB and LPN clustered together in one clade, they further separated into two distinct sub-clusters. Notably, within this sub-cluster delineation, ten LPB individuals displaying hyperpigmentation traits were found to be nested within the LPN branch (Fig. 1B). This intriguing observation raises the possibility that these ten individuals may represent hybrid progeny resulting from interbreeding between LPB and LPN.

ADMIXTURE analysis also supported the results of the PCA and phylogenetic tree described above (Fig. 1C), and the admixture levels and genetic components indicated a high ancestral coefficient and a strong genetic resemblance between the two populations. Nevertheless, a more nuanced perspective emerged when assessing genetic divergence through pairwise fixation index (FST) calculations. The FST value of 0.0210 between LPB and LPN surpassed the genetic differentiation observed between three separate breeds of Tibetan lineage, with FST(TIB/OLA) = 0.0150, FST(TIB/PRT) = 0.0184, and FST(PRT/OLA) = 0.0154 (Table 1). Therefore, while LPB and LPN displayed robust genetic similarities, the notable genetic divergence indicated by the FST value suggests that LPB is derived from LPN and has evolved into a distinct breed.

Table 1 Population divergence measured as FST

Genomic signatures associated with hyperpigmentation in LPB

To identify genomic signatures associated with hyperpigmentation in LPB, we conducted a genome-wide scan. We assessed allele frequency differentiation by calculating FST between the LPB population and non-black-boned domestic sheep. Using a window size of 20 kb and a step size of 10 kb, we detected a total of 576 putatively positively selected genes (PSGs) based on empirical distributions (top 1% cutoff) (Fig. 2, Supplementary table S5).

Gene Ontology (GO) enrichment analysis of these PSGs revealed an enrichment in processes related to development. Additionally, a significant Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, the MAPK signaling pathway, was identified, which is crucial for melanocyte development and melanogenesis. This pathway includes 16 genes: MAP3K1, ANGPT1, DUSP1, TGFB3, NFKB1, FGF7, ERBB4, KIT, NF1, MAPKAPK5, FGFR4, PTPN5, TP53, MAP3K12, FGFR1, MAP4K4 (Supplementary table S6 and S7).

Furthermore, we employed locus-specific branch length (LSBL) statistics to analyze LPB, non-black-boned sheep, and wild sheep, using LPB as the target population. By applying a window size of 20 kb and a step size of 10 kb, we identified 504 genes as PSGs based on empirical distributions (top 1% cutoff) (Fig. 2, Supplementary table S8). KEGG and GO term analysis of these PSGs revealed 4 pathways and 39 terms that were over-represented (Supplementary table S9 and S10). The MAPK signaling pathway, which includes 14 genes (ANGPT1, IL1R1, TGFB3, RASGRF2, PAK1, MECOM, ERBB4, KIT, NF1, MAPKAPK5, FGFR4, PAK2, MAP3K7, MAP4K4), was one of the pathways identified.

Finally, we employed the XPEHH test to analyze the LPB and non-black-boned domestic sheep populations. Based on the empirical distribution (top 1%), we identified 1193 genes as PSGs (Fig. 2, Supplementary table S11). According to the statistical results, a total of 151 genes have been strongly subjected to selection pressure across the FST, LSBL, and XPEHH methods (Supplementary figure S2). KEGG and GO term analysis of these PSGs revealed that 41 pathways and 2894 terms were significantly overrepresented (Supplementary tables S12 and S13). The MAPK signaling pathway included 28 genes (TNIK, NOTCH2, SPAG9, PDE6G, SLC30A10, PTPRC, NRP1, PLCB1, PLCG2, FBXW7, PTPN11, LIF, CSPG4, EGFR, ERBB4, PTK2B, ADRB3, FGFR1, INSR, FLT4, FGFR4, ADRA1B, ADRA2C, KIT, FERMT2, BMP4, TPD52L1, SASH1). A total of 12 genes (ABCA12, CSNK1A1, KIT, ADAM10, RB1, CEP131, CDH11, PEPD, PTPN11, IFT122, TBX10, and STAR) were detected by all three of the FST, LSBL, and XPEHH analyses. These results provide insights into the genetic mechanisms associated with hyperpigmentation in LPB and point to specific genes and pathways involved in this unique trait.
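Identifying genes supported by more than one selection test amounts to intersecting the per-test gene lists. A minimal sketch is shown below; the function name and the placeholder gene labels in the usage note are hypothetical and are not taken from the supplementary tables.

```python
from typing import Iterable, Set


def shared_genes(*gene_lists: Iterable[str]) -> Set[str]:
    """Return the genes present in every supplied list (empty set if none given)."""
    if not gene_lists:
        return set()
    return set.intersection(*(set(genes) for genes in gene_lists))
```

For example, `shared_genes({"G1", "G2", "G3"}, {"G2", "G3", "G4"}, {"G3", "G2", "G5"})` keeps only the two placeholder genes found by all three hypothetical tests.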

Fig. 2

Genomic regions with strong selective signals (FST, LSBL, and XPEHH) in LPB identified by comparing genomes of refined groups of LPB and the non-black-boned domestic sheep population based on population structure analysis. The population genetic differentiation FST values and the LSBL values were plotted in 20 kb genomic bins with a 10 kb step. The significance threshold of the selection signature was arbitrarily assigned to the top 1% outliers for each individual test and is indicated with a red horizontal solid line

DEGs in two pigmentation developmental stages between LPB and LPN

Our previous studies demonstrated that hyperpigmentation in LPB results from the ectopic distribution of melanocytes in skin and internal tissues and the subsequent synthesis of melanin in these melanocytes (unpublished data). To understand the differential gene expression associated with pigmentation development in LPB and LPN, RNA-Seq was performed on liver tissues at two developmental stages: when pigmentation is not yet visible (2 months old, stage 1) and when pigmentation becomes evident (2 years old, stage 2) (Supplementary table S14). Liver tissue was chosen for this analysis due to its high melanin content in adult LPB. The total RNA extracted from sheep liver tissue showed an OD260/OD280 ratio between 1.8 and 2.1 and a total RNA concentration between 0.8 and 3 µg/µL (Supplementary table S15), indicating good purity and concentration of the extracted RNA. Gel electrophoresis showed three distinct bands corresponding to 28S, 18S, and 5S rRNA, without DNA contamination or RNA degradation bands (Supplementary figure S3), indicating that the extracted RNA had good integrity. Approximately 1.35 billion raw reads were generated for the analysis (Supplementary table S16).

At stage 1, there were 13,959 and 13,402 expressed genes in the liver tissues of LPB and LPN, respectively. A total of 619 DEGs were identified through comparisons between LPB and LPN (Fig. 3A, Supplementary table S17). Notably, six of these DEGs were enriched in the melanogenesis pathway, with five of them being up-regulated (PRKACB, KIT, EDNRB, CAM, and CAMK) and one being down-regulated (Wnt) (Fig. 3B, Supplementary table S18). Additionally, GO analysis revealed significant enrichment in the molecular function of lipid binding (Supplementary table S19).

At stage 2, there were 13,917 and 13,518 expressed genes in the liver tissues of LPB and LPN, respectively, and 224 DEGs were identified. Among these, 127 DEGs were up-regulated (including 41 exclusively expressed in LPB), and 97 genes were down-regulated in LPB (Fig. 3C, Supplementary table S20). These DEGs were enriched in 18 significantly over-represented KEGG pathways (Fig. 3D, Supplementary table S21), but no pigmentation-related pathways were identified. Over-represented GO clusters were related to the regulation of cellular processes, oxidoreductase activity, and the extracellular region (Supplementary table S22).

Finally, we conducted a further analysis of the interaction between breed and age factors. A total of 35 DEGs were identified, including 7 significantly upregulated genes and 28 significantly downregulated genes (p < 0.05) (Fig. 3E, Supplementary table S23). Due to the small number, these DEGs were not significantly enriched in any KEGG pathways. We then performed GO enrichment analysis, which revealed 153 enriched pathways (p < 0.05). These pathways were primarily related to immune responses, including innate immune response, defense response to other organisms, response to biotic stimulus, immune response, negative regulation of viral process, and biological processes involved in interspecies interactions between organisms, and no pathways related to melanin were enriched (Supplementary table S24).

Fig. 3

The volcano plots and KEGG pathways of DEGs between LPB and LPN at stages 1 and 2. A: Volcano plot of DEGs identified by comparing the expression levels in liver tissues of LPB and LPN at stage 1. B: The significant 21 KEGG pathways of DEGs at stage 1. C: Volcano plot of DEGs identified by comparing the expression levels in liver tissues of LPB and LPN at stage 2. D: The significant 18 KEGG pathways of DEGs at stage 2. E: Volcano plot of DEGs identified by comparing the expression levels in liver tissues of breed vs. age. F: The significant 24 GO pathways of DEGs for breed vs. age

Screening of candidate genes related to melanin pathway

To further assess the expression abundance of PSGs identified by the FST and LSBL statistics, we investigated the genes shared between PSGs and the DEGs from the two pigmentation developmental stages. One gene, KIT, was shared between PSGs and DEGs. KIT is a well-known key gene involved in melanocyte development and melanogenesis. Interestingly, KIT exhibited significantly higher FST and LSBL values than adjacent genomic regions. This suggests a strong selective sweep acting on this gene, indicating its importance in hyperpigmentation. However, we did not find significant differences in allele frequencies between the LPB and LPN populations (Supplementary figure S4).

To identify candidate genes related to melanin synthesis in LPB, we screened all positively selected genes among four populations (LPB, LPN, non-black-boned Ovis aries, and wild sheep). Two candidate genes (ERBB4 and ROR1) that may be related to melanin synthesis were identified. Both ERBB4 and ROR1 exhibited significantly higher FST, LSBL, and XPEHH values (Fig. 4A), indicating strong selection signals. Among all the positively selected genes screened, only ERBB4 and ROR1 showed significant allele frequency differentiation among populations. Allele frequency analysis revealed that the frequency of mutant alleles was highest in LPB, followed by LPN, and was significantly lower in the other populations (non-black-boned Ovis aries and wild sheep). Notably, the frequencies of mutant alleles in LPB and LPN were very similar, suggesting a close genetic relationship (Fig. 4B). Even more noteworthy is that these two genes, ERBB4 and ROR1, represent novel findings not previously documented within the list of 688 known pigmentation-associated genes [5]. To account for this genetic similarity, the sliding-window gene flow analysis was performed, and the results showed that the ERBB4 and ROR1 genes were within the top 5% of window D values, indicating relatively large gene flow between the LPB and LPN populations (Dataset S1). This genetic closeness may explain why ERBB4 and ROR1 were not identified in the transcriptome sequencing results of these two populations.

Fig. 4

The relationship between ERBB4 and ROR1 genes and hyperpigmentation in sheep. A: FST, LSBL, and XPEHH values around the ERBB4 and ROR1 genes. The parts within the two black dashed lines correspond to gene positions. B: Genotype frequency heatmap of ERBB4 and ROR1 genes. C: Gene relative expression levels of ERBB4 and ROR1 genes in the liver, spleen, and kidneys of Lanping black-boned sheep (LPB) and Diqing sheep. Red represents LPB, blue represents Diqing sheep

To eliminate the impact of genetic relatedness on transcriptional results, we further examined two genetically distant populations, LPB and Diqing sheep, using qRT-PCR to detect ERBB4 and ROR1 gene relative expression in various tissues. The results demonstrated significant differences in the relative expression levels of ERBB4 in the spleen and kidney between LPB and Diqing sheep (p < 0.0001), with ROR1 showing similar results in the liver, spleen, and kidney (Fig. 4C), indicating that these genes may play a role in hyperpigmentation in LPB.

Discussion

This study conducted genomic and transcriptomic analyses to explore the genetic basis of hyperpigmentation in LPB. Through genetic structure analysis, we identified both genetic similarities and differences between LPB and LPN, confirming that LPB is a distinct breed derived from LPN. Additionally, the genomic analysis identified several functional genes associated with pigmentation in LPB.

In this study, we performed RNA-Seq analysis to examine the DEGs in liver tissues of LPB and LPN at two distinct developmental stages to elucidate the genetic basis underlying pigmentation development. At the first stage (2 months old, when pigmentation is not yet visible), five genes (PRKACB, KIT, EDNRB, CAM, and CAMK) were upregulated, suggesting their potential roles in the early stages of pigmentation. The downregulation of the Wnt gene may indicate its inhibitory role in melanogenesis. The significant enrichment in lipid binding functions further implies that these genes could influence melanin biosynthesis through the regulation of lipid metabolism, offering new insights into the mechanisms underlying early pigmentation in LPB. In contrast, at the second stage (2 years old, when pigmentation becomes evident), no genes were significantly enriched in pigmentation-related KEGG pathways. This suggests that by this stage, the genetic regulation of pigmentation might have already occurred, or could be governed by factors specific to other tissues. The overrepresented GO terms related to oxidoreductase activity and the regulation of cellular processes imply that the gene expression changes at this stage might be more related to maintaining cellular homeostasis and responding to environmental stressors than to directly contributing to melanin production. In summary, our findings highlight the role of gene expression changes, particularly during early developmental stages, in the pigmentation process of LPB.

By combining the selective signatures with the DEGs, this study identified the KIT gene as potentially involved in hyperpigmentation in LPB. The biosynthesis of melanin is governed by a multifaceted interplay of intracellular signaling cascades, among which the most commonly implicated are the MC1R/α-MSH [33], cAMP/PKA [34], MAPK/ERK [35], PI3K/Akt [36], and Wnt/β-catenin [37] signaling pathways. The KIT gene we detected is a member of the MAPK signaling pathway. In pigmentation, KIT mainly regulates the migration of melanocyte precursors along specific paths to the dermis, epidermis, inner ear, and choroid of the eye [38]. Previous studies have reported that white coat color in foxes is associated with a deletion of KIT exon 12 [39], that white spots in Arabian camels are associated with KIT mutations [40], and that translocation of KIT leads to varying degrees of white spotting in cattle [41]. Although the role of KIT in pigmentation is well established in other species, no significant differences in allele frequencies were observed between the LPB and LPN populations for this gene. This finding suggests that additional factors may contribute to hyperpigmentation in LPB.

The ERBB4 and ROR1 genes were further identified based on their strong selection signals and significant differences in allele frequencies among four populations: LPB, LPN, non-black-boned Ovis aries, and wild sheep. The ERBB4 gene encodes human epidermal growth factor receptor 4 (HER-4) and plays a crucial role in cell signaling through autophosphorylation, which, in turn, regulates cell growth and division. Studies have reported that the expression rate of ERBB4 protein in melanoma is 66% [42], and the somatic mutation rate of ERBB4 in metastatic malignant melanoma is 18.99% [43], suggesting that ERBB4 may play an important role in melanoma. Receptor tyrosine kinase like orphan receptor 1 (ROR1) is a member of the ROR family and plays a role in developmental processes, including the growth and development of bones and neurons, cell motility, and cell polarity [44, 45]. ROR1 is highly expressed in melanoma cells and regulates the phosphorylation of Dvl-2, an upstream component of the Wnt pathway, thereby activating both classical and non-classical Wnt signaling pathways. ROR1 can positively regulate the expression of Wnt5a, while indirectly activating STAT3 to inhibit the transcription of melanocyte antigens (MART1, GP100, and promoters PAX3 and MITF), leading to decreased pigment proliferation and antigenicity of melanocytes [46]. Furthermore, in melanoma, ROR1 undergoes sustained phosphorylation, thus bypassing the Wnt signaling pathway initiated by Wnt5a to promote the occurrence and development of melanoma [47]. During the progression of melanoma, ROR1 can also promote tumor cell invasion and metastasis by increasing the expression of ROR2 [48, 49].

In particular, we found similar allele frequency distributions in the LPB and LPN populations. Both breeds live in the same region (Lanping County, Nujiang Prefecture, Yunnan Province, China), and their body shape and appearance are very similar, except for the black traits of LPB (such as black eyelids and gums). According to the survey, hybridization occurs between the two populations, and our calculated gene flow results confirm this, which explains the similarity in allele frequency distributions between them. To mitigate the impact of gene flow, we selected two distant populations (LPB and Diqing sheep), which are distributed in Nujiang Prefecture and Diqing Prefecture of Yunnan Province, respectively. The qRT-PCR results showed that the relative expression levels of ERBB4 and ROR1 in the liver, spleen, and kidney were significantly higher in the LPB population than in the Diqing sheep population (except for ERBB4 in the liver). Together with the strong selection signals and the allele frequency differences among populations, these results suggest that ERBB4 and ROR1 could be key regulators of melanin synthesis and contribute to the black traits of LPB.

Conclusion

In conclusion, Lanping black-boned sheep (LPB) represent a valuable model for investigating the genetic mechanisms underlying hyperpigmentation in mammals. Through high-throughput whole-genome sequencing, we generated a vast dataset of single nucleotide polymorphisms (SNPs) within the LPB genome. The analysis of these autosomal SNPs provides evidence that LPB have evolved as a distinct breed from their non-black-boned counterparts (LPN). Furthermore, we pinpointed two genes, ERBB4 and ROR1, which are potentially associated with melanin synthesis and hyperpigmentation. These findings contribute to a more comprehensive understanding of non-cutaneous melanocyte development and the genetic basis of hyperpigmentation in LPB. The genetic insights obtained from this study not only shed light on the intriguing phenomenon of hyperpigmentation in LPB but also hold broader implications for our understanding of the genetic underpinnings of pigmentation in mammals.