Background & Summary

Anguillid eels are commercially important fish in East Asia, with approximately 270,000 metric tons of eels being cultivated worldwide1,2,3. Despite persistent efforts over many years, full-life-cycle culture from egg to adult remains elusive in eels, in contrast to numerous other fish species for which such cultivation has become feasible4. Consequently, the eel farming industry relies heavily on the collection of wild glass eels that migrate towards estuaries or inland freshwater habitats5. However, several factors, including impediments to migration, pollution, climate change, habitat loss, and overexploitation of juvenile glass eels, have significantly reduced eel populations6,7. Notably, the Japanese eel is categorized as “Endangered,” while the European eel is classified as “Critically Endangered” on the International Union for Conservation of Nature Red List (https://www.iucnredlist.org/search?query=anguilla&searchType=species). As a result, alternative anguillid eel species, such as Anguilla marmorata and Anguilla bicolor, have garnered considerable interest8,9. Moreover, a recent study has shown that A. bicolor pacifica exhibits a faster growth rate than A. marmorata, indicating its potential suitability for cage culture10. Owing to its comparable taste and texture, A. bicolor pacifica is recognized as the second-preferred choice after A. japonica, indicating its economic importance in terms of market demand11.

Anguilla bicolor (short-finned eel) is widely distributed throughout the Indo-Pacific region, ranging from East Africa to Papua New Guinea, including the Philippines and Indonesia12. Because of its allopatric distribution and slight morphological variations, A. bicolor has been divided into two subspecies: A. bicolor bicolor (inhabiting the Indian Ocean) and A. bicolor pacifica (inhabiting the western Pacific Ocean)13.

Anguillid eels, known for their catadromous life patterns, adapt to varying environments throughout their life cycle, starting in marine environments as larvae, transitioning to brackish or inland shore waters as juveniles, and settling in fresh waters as adults14,15. Their notable resilience to an extensive range of salt concentrations provides opportunities to investigate how they coordinate osmotic pressure during migration. Understanding the osmoregulatory mechanisms will provide valuable insights into the ability of anguillid eels to achieve homeostasis. Currently, the genomic sequences of six eel species are available. Chromosome-level genome assemblies have been established for three eels: A. japonica16, A. anguilla17, and A. rostrata (GCA_018555375.3). The genomes of A. obscura, A. marmorata, and A. megastoma were assembled at the scaffold level18. However, genomic information on A. bicolor pacifica is lacking.

In summary, the genomic resources presented in this study are valuable for studying the molecular mechanisms that drive evolutionary adaptations in migratory euryhaline fish. Additionally, the chromosome-scale genome of short-finned eels will facilitate comparative genomic studies, which will shed light on the adaptive strategies employed by catadromous fish that enable them to survive and thrive across a diverse range of saline environments.

Methods

Ethics statement

The experimental protocols were approved by the Institutional Animal Care and Use Committee (IACUC) of Chonnam National University (CNU IACUC-YS-2023-9).

DNA sample collection, library construction, and DNA sequencing

Short-finned eels (Anguilla bicolor pacifica) collected from an aquafarm in Jeonnam, South Korea, were transported to the laboratory and kept in a 250 L aerated tank at a water temperature of 24 °C (Supplementary Table S1). Genomic DNA was extracted from the muscle tissue of a male eel with a standard body length of 20 cm (Supplementary Figure S1), and two platforms were used for DNA sequencing. For short-read sequencing, a short-fragment library with an insert size of 550 bp was generated using the TruSeq DNA Nano 550 bp kit, and the paired-end library was sequenced on an Illumina NovaSeq 6000 platform. For long-read sequencing, 2 to 5 μg of DNA was loaded into a single lane of a BluePippin 0.75% gel, and electrophoresis was used to collect libraries of 9–13 kb and > 15 kb. The size-selected libraries were sequenced by circular consensus sequencing (CCS) on a PacBio Revio platform. In total, 45.3 Gb (50 × coverage) of Illumina reads and 83.24 Gb (92 × coverage) of HiFi reads were generated (Table 1).
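The coverage values quoted above follow from dividing the number of sequenced bases by the genome size. The minimal sketch below reproduces that arithmetic; it assumes that the k-mer-based estimate of 899.9 Mb was used as the genome size, which the text does not state explicitly, and the function name is illustrative.

```python
def sequencing_depth(total_bases: float, genome_size: float) -> float:
    """Return mean depth of coverage: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Assumption: the k-mer-based estimate (899.9 Mb) serves as the genome size.
GENOME_SIZE_BP = 899.9e6

illumina_depth = sequencing_depth(45.3e9, GENOME_SIZE_BP)   # Illumina yield
hifi_depth = sequencing_depth(83.24e9, GENOME_SIZE_BP)      # HiFi yield

print(f"Illumina: {illumina_depth:.1f}x, HiFi: {hifi_depth:.1f}x")
```

Under this assumption, both values fall within one unit of the reported 50 × and 92 × coverage.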

Table 1 Statistical results for the sequencing data of the Anguilla bicolor pacifica.

Genome size estimation, genome assembly, and quality assessment

Illumina reads were filtered using Trim_galore (ver. 0.6.7)19 to remove adaptor sequences, low-quality bases (Phred score < 20), unknown bases (Ns), and reads shorter than 120 bp. The trimmed reads were then used to count 21-mers using Jellyfish (ver. 2.3.0)20. Subsequently, based on the 21-mer histogram, the genome size of A. bicolor pacifica was estimated to be 899.9 Mb, with a heterozygosity rate of 1.25%, by GenomeScope (ver. 2.0)21 (Supplementary Figure S2).
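The principle behind k-mer-based genome size estimation can be illustrated with a simplified sketch. It counts canonical k-mers, builds a multiplicity histogram, and divides the total number of k-mers by the depth of the main peak. This is not the mixture model fitted by GenomeScope and it ignores heterozygous and repeat peaks; all function names and the `min_depth` error cutoff are assumptions made for illustration.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k=21):
    """Count canonical k-mers across reads, skipping k-mers that contain N."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[canonical(kmer)] += 1
    return counts


def kmer_histogram(counts):
    """Map multiplicity -> number of distinct k-mers with that multiplicity."""
    return Counter(counts.values())


def estimate_genome_size(hist, min_depth=2):
    """Naive estimate: total k-mers at or above min_depth divided by the peak depth.

    Assumption: k-mers seen fewer than min_depth times are sequencing errors.
    """
    usable = {depth: n for depth, n in hist.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth
```

For a genome with appreciable heterozygosity, the tallest peak may correspond to heterozygous k-mers, which is one reason a model-based tool is preferable in practice.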

HiFi reads were used for draft assembly to produce highly contiguous draft contigs using Hifiasm (ver. 0.16.1)22. This process resulted in the generation of 405 contigs with a total length of 1.09 Gb and contig N50 of 16.9 Mb (Table 2).
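For reference, the N50 statistic reported here is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of this calculation is given below; the function name is illustrative, and assembly tools may handle edge cases differently.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example with hypothetical contig lengths (not values from this assembly).
print(n50([2, 3, 4, 5, 6]))
```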

Table 2 Statistical results of Anguilla bicolor pacifica genome assembly.

To improve contig contiguity, final scaffolds were generated using RagTag (ver. 2.1.0)23. In the scaffolding process, the chromosomes of the A. japonica genome were used as a reference16. During this process, gaps, indicated by “N” characters, were inserted between adjacent query sequences. These gaps represent regions within the sequences that remained unidentified. This step included reordering, orienting, and connecting the sequences using these gaps. Consequently, 405 contigs were integrated into 30 linear scaffolds. This final assembly comprised 19 pseudochromosome-level scaffolds (99.76%) and 11 unplaced scaffolds (0.24%) with an N50 value of 61.07 Mb and a total length of 1.09 Gb (Fig. 1a, Tables 2, 3).
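The joining step described above can be sketched as follows. The example takes contigs that have already been ordered and oriented (in practice, this information comes from alignments to the reference, which are not shown) and concatenates them with runs of “N”. The function names and the gap length of 100 are assumptions for illustration and are not taken from the RagTag source code.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def join_contigs(placements, contigs, gap_size=100):
    """Build one scaffold from ordered, oriented contigs.

    placements: ordered list of (contig_name, strand), strand being '+' or '-'.
    contigs: dict mapping contig_name -> sequence.
    gap_size: number of N characters inserted between adjacent contigs
              (illustrative value, an assumption).
    """
    parts = []
    for name, strand in placements:
        seq = contigs[name]
        if strand == "-":
            seq = reverse_complement(seq)
        elif strand != "+":
            raise ValueError(f"unknown strand for {name}: {strand!r}")
        parts.append(seq)
    return ("N" * gap_size).join(parts)


# Hypothetical toy contigs, joined with a 2-bp gap for readability.
toy = {"c1": "AAC", "c2": "GGT"}
print(join_contigs([("c2", "-"), ("c1", "+")], toy, gap_size=2))
```

Because the gaps have a fixed, arbitrary length in this sketch, they mark contig boundaries rather than measured distances.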

Fig. 1

(a) Genomic landscape of the assembled pseudochromosomes of Anguilla bicolor pacifica, with tick marks every 10 Mb, showing the gene density (red) and GC content (green) calculated using a 100 kb sliding window approach. Gaps between contigs are marked with blue lines. (b) Syntenic dot plots of the whole genome of A. bicolor pacifica against A. japonica. The x- and y-axes represent the 19 pseudochromosomes of A. bicolor pacifica and the chromosomes of A. japonica, respectively.

Table 3 Statistical results of the 19 pseudochromosomes of Anguilla bicolor pacifica.

To evaluate the completeness of the assembled short-finned eel genome, Benchmarking Universal Single-Copy Orthologs (BUSCO) (ver. 5.4.3)24 was used to search for the 3,640 orthologous genes present in Actinopterygii_odb10. The GC contents and genome sizes of the seven anguillid species were found to be comparable (Table 4). Additionally, Bowtie2 (ver. 2.4.5)25 was used to align the Illumina short reads generated from DNA to the assembly with the following parameters: --no-unal --very-sensitive-local. This resulted in an alignment rate of 99.41%. Finally, genome quality was examined using QUAST (ver. 5.2.0)26.

Table 4 Comparative analysis of the genomes of six anguillids and Anguilla bicolor pacifica.

Transcriptome sequencing and assembly

Four different tissue types, namely, the eye, heart, liver, and muscle, were collected from an individual eel. All collected samples were immediately preserved in RNAlater and stored at −80 °C until RNA extraction. Total RNA was extracted from the four samples using a TruSeq Stranded mRNA sample preparation kit following the manufacturer’s protocol. Complementary DNA libraries were constructed and sequenced using the Illumina NovaSeq 6000 platform to generate 5.24–6.91 Gb of paired-end reads per tissue. A total of 21.83 Gb of clean reads was obtained using Trim_galore and assembled using Trinity (ver. 2.15.1)27 with default options (see Table 1).

Genome structure annotation

To identify and screen repetitive sequences within the genome of the short-finned eel, we integrated homology- and de novo-based prediction approaches using RepeatModeler (ver. 2.0.1) and RepeatMasker (ver. 4.1.2)28,29. A custom A. bicolor pacifica repeat library was built using RepeatModeler with the National Center for Biotechnology Information (NCBI) search engine RMBlast (ver. 2.9.0). This custom repeat library and two repeat libraries from Actinopterygii and Anguilla in Dfam30 were used by RepeatMasker. Repetitive sequences accounted for 28.39% of the A. bicolor pacifica genome (Table 5).

Table 5 Statistics of repetitive sequences in the genome of Anguilla bicolor pacifica.

The BRAKER pipeline (ver. 3.0.6)31 was used to predict gene models in the genome of A. bicolor pacifica. This process began with soft-masking of the repeats identified in the genome using RepeatModeler and RepeatMasker. GeneMark-ETP (ver. 1)32 was used to generate hints from the RNA-Seq and protein data. For the protein data, we combined the Metazoa ortholog data from OrthoDB 1133 and six actinopterygian species (A. anguilla, GCF_013347855.1; Danio rerio, GCF_000002035.6; Pleuronectes platessa, GCF_947347685.1; Poecilia reticulata, GCF_000633615.1; Scleropages formosus, GCF_900964775.1; and Takifugu rubripes, GCF_901000725.2). To train the gene sets, ab initio gene prediction was performed using Augustus software (ver. 3.4.0)34, incorporating hints provided by GeneMark-ETP. Finally, the results were integrated using TSEBRA35. Gene models were annotated by combining evidence from homology, de novo, and transcriptome data, yielding 23,095 non-redundant protein-coding genes. The BUSCO analysis identified 3,448 (94.7%) of the 3,640 actinopterygian orthologous genes (Table 6).
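Soft-masking, the first step of the annotation process described above, converts repeat regions to lowercase rather than replacing them with “N”, so that gene predictors can still read the underlying bases. A minimal sketch is shown below; the interval convention (0-based, half-open) and the function names are assumptions for illustration and do not reproduce the RepeatMasker output format.

```python
def soft_mask(sequence: str, repeat_intervals):
    """Lowercase bases inside 0-based, half-open (start, end) repeat intervals."""
    masked = list(sequence.upper())
    for start, end in repeat_intervals:
        if not 0 <= start <= end <= len(masked):
            raise ValueError(f"interval out of range: ({start}, {end})")
        for i in range(start, end):
            masked[i] = masked[i].lower()
    return "".join(masked)


def masked_fraction(sequence: str) -> float:
    """Return the fraction of bases that are soft-masked (lowercase)."""
    if not sequence:
        return 0.0
    return sum(1 for base in sequence if base.islower()) / len(sequence)


# Hypothetical toy sequence with one repeat interval.
toy_masked = soft_mask("ACGTACGTAC", [(2, 5)])
print(toy_masked, masked_fraction(toy_masked))
```

Applied genome-wide, the masked fraction computed this way corresponds to the proportion of repetitive sequence reported in a repeat annotation.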

Table 6 Protein coding genes statistics and functional annotation results of the Anguilla bicolor pacifica genome.

Genome annotation

The functions of the integrated gene models were annotated using the SWISS-PROT protein database36 and the NCBI non-redundant database (https://www.ncbi.nlm.nih.gov/protein). DIAMOND (ver. 2.1.9.163)37 blastx was used with the following parameters: --dbsize 530000000 --max-target-seqs 1 --outfmt 6 --evalue 1e-5. Furthermore, we used eggNOG-mapper (ver. 2.1.8)38 to annotate their functions against the Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), euKaryotic Orthologous Group (KOG), and protein families (Pfam) databases. The “--evalue 1e-5 -m diamond” parameters in eggNOG-mapper were applied. In summary, 98.66% of gene models were functionally annotated using publicly accessible databases (Table 6).
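Tabular output of the kind requested with --outfmt 6 can be summarized with a short script. The sketch below assumes the standard 12-column BLAST tabular layout, keeps the highest-bitscore hit per query below an E-value cutoff, and reports the fraction of genes with a hit. The function names and the example rows are hypothetical and are not results from this study.

```python
import csv
import io

# Standard 12 columns of BLAST tabular output (assumed default layout).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-5):
    """Return {query_id: hit_dict} keeping the highest-bitscore hit per query."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        hit = dict(zip(FIELDS, row))
        if float(hit["evalue"]) > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or float(hit["bitscore"]) > float(current["bitscore"]):
            best[hit["qseqid"]] = hit
    return best


def annotated_fraction(best, total_genes: int) -> float:
    """Return the fraction of gene models with at least one accepted hit."""
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return len(best) / total_genes


# Hypothetical example rows (tab-separated), for illustration only.
example = io.StringIO(
    "g1\tprotA\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-50\t200\n"
    "g1\tprotB\t95.0\t100\t5\t0\t1\t300\t1\t100\t1e-60\t250\n"
    "g2\tprotC\t30.0\t50\t35\t0\t1\t150\t1\t50\t0.01\t40\n"
)
hits = best_hits(example)
print(sorted(hits), annotated_fraction(hits, total_genes=2))
```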

Genome-wide collinearity analysis

Two anguillid genomes were used for comparative analyses. Macrosynteny pairs between short-finned and Japanese eels were obtained using MCscan with default options (https://github.com/tanghaibao/jcvi/wiki/MCscan, Python version)39. Macrosynteny blocks were visualized using the Python scripts provided by MCscan. The 19 pseudochromosomes of A. bicolor pacifica showed a highly conserved collinear relationship with the chromosomes of A. japonica (Fig. 1b).
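One simple way to summarize such a comparison is to tally, for each pseudochromosome, which chromosome of the other species receives most of its syntenic gene pairs. The sketch below does this for a list of anchor pairs; it is not the MCscan algorithm and does not parse MCscan files, and all names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def chromosome_correspondence(anchors, gene_to_chrom_a, gene_to_chrom_b):
    """Summarize anchor gene pairs by chromosome.

    anchors: iterable of (gene_a, gene_b) syntenic gene pairs.
    gene_to_chrom_a / gene_to_chrom_b: dicts mapping gene ID -> chromosome.
    Returns {chrom_a: (best_matching_chrom_b, fraction_of_anchors)}.
    """
    tallies = defaultdict(Counter)
    for gene_a, gene_b in anchors:
        chrom_a = gene_to_chrom_a.get(gene_a)
        chrom_b = gene_to_chrom_b.get(gene_b)
        if chrom_a is None or chrom_b is None:
            continue  # skip genes without a chromosome assignment
        tallies[chrom_a][chrom_b] += 1
    result = {}
    for chrom_a, counter in tallies.items():
        chrom_b, n = counter.most_common(1)[0]
        result[chrom_a] = (chrom_b, n / sum(counter.values()))
    return result


# Hypothetical toy data: three anchors from one pseudochromosome.
toy_anchors = [("a1", "b1"), ("a2", "b2"), ("a3", "b9")]
toy_a = {"a1": "A1", "a2": "A1", "a3": "A1"}
toy_b = {"b1": "B1", "b2": "B1", "b9": "B2"}
print(chromosome_correspondence(toy_anchors, toy_a, toy_b))
```

A fraction close to 1 for every pseudochromosome would be consistent with a one-to-one chromosomal correspondence between the two genomes.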

Data Records

The raw sequencing data for this study are deposited in NCBI under BioProject ID: PRJNA1073276. Illumina, transcriptome, and PacBio sequencing data are available in the Sequence Read Archive40 under the IDs SRR27869073–SRR27869078. The assembled genome has been deposited in the GenBank database41 under the accession number JBDGNX020000000. Additionally, the assembled genome and annotations can be downloaded from Figshare42 under https://doi.org/10.6084/m9.figshare.25139891. All data sets used in this study are available at: http://eyunlab.cau.ac.kr/shortfinned_eel.

Technical Validation

Evaluation of genome assembly and annotation

Five methods were applied to evaluate the completeness, accuracy, and contiguity of the A. bicolor pacifica genome assembly: N50 statistics, BUSCO analysis, mapping of short DNA reads to the genome, comparison of synteny blocks between the genomes of A. bicolor pacifica and A. japonica, and comparison of the total assembled genome size with the size estimated from the Jellyfish k-mer counts. The total size of the assembled genome was similar to the estimated size. All assessments indicated that the genome assembly was contiguous and of high quality.