A chromosome-scale reference genome assembly of the great sand eel, Hyperoplus lanceolatus
Despite increasing sequencing efforts, numerous fish families still lack a reference genome, which complicates genetic research. One such understudied family is the sand lances (Ammodytidae, literally: “sand burrower”), a globally distributed clade of over 30 fish species that tend to avoid tidal currents by burrowing into the sand. Here, we present the first annotated chromosome-level genome assembly of the great sand eel (Hyperoplus lanceolatus). The genome assembly was generated using Oxford Nanopore Technologies long reads, with Illumina short reads for polishing. The final assembly has a total length of 808.5 Mbp, of which 97.1% was anchored into 24 chromosome-scale scaffolds using proximity-ligation scaffolding. It is highly contiguous, with a scaffold N50 of 33.7 Mbp and a contig N50 of 31.3 Mbp, and has a BUSCO completeness score of 96.9%. The presented genome assembly is a valuable resource for future studies of sand lances, a family of great ecological and commercial importance, and may also contribute to studies aiming to resolve the suprafamiliar taxonomy of bony fishes.
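The contiguity figures quoted above (scaffold and contig N50) follow a standard definition: sort the sequences from longest to shortest and accumulate lengths until at least half of the total assembly length is covered. The length of the sequence that crosses that threshold is the N50, and the number of sequences needed to reach it is the L50. The sketch below illustrates that definition only. The function name `n50_l50` and the toy lengths are assumptions made for illustration and are not taken from the authors' pipeline.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths.

    N50: length of the sequence at which the running total, taken from
         longest to shortest, first reaches half of the total length.
    L50: number of sequences needed to reach that point.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Hypothetical toy assembly (lengths in Mbp), not real data.
toy_lengths = [40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50} Mbp, L50 = {l50}")
```

Applied to the scaffold lengths of a real assembly (for example, parsed from a FASTA index), the same function would give the scaffold N50. Applied to the contig lengths, it would give the contig N50.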
- Research Article
6
- 10.46471/gigabyte.105
- Jan 11, 2024
- GigaByte (Hong Kong, China)
The snake pipefish, Entelurus aequoreus (Linnaeus, 1758), is a northern Atlantic fish inhabiting open seagrass environments that recently expanded its distribution range. Here, we present a highly contiguous, near chromosome-scale genome of E. aequoreus. The final assembly spans 1.6 Gbp in 7,391 scaffolds, with a scaffold N50 of 62.3 Mbp and L50 of 12. The 28 largest scaffolds (>21 Mbp) span 89.7% of the assembly length. A BUSCO completeness score of 94.1% and a mapping rate above 98% suggest a high assembly completeness. Repetitive elements cover 74.93% of the genome, one of the highest proportions identified in vertebrates. Our demographic modeling identified a peak in population size during the last interglacial period, suggesting the species might benefit from warmer water conditions. Our updated snake pipefish assembly is essential for future analyses of the morphological and molecular changes unique to the Syngnathidae.
- Preprint Article
- 10.1101/2023.12.12.571260
- Dec 13, 2023
Abstract: The snake pipefish, Entelurus aequoreus (Linnaeus, 1758), is a slender, up to 60 cm long, northern Atlantic fish that dwells in open seagrass habitats and has recently expanded its distribution range. The snake pipefish is part of the family Syngnathidae (seahorses and pipefish) that has undergone several characteristic morphological changes, such as loss of pelvic fins and elongated snout. Here, we present a highly contiguous, near chromosome-scale genome of the snake pipefish assembled as part of a university master’s course. The final assembly has a length of 1.6 Gbp in 7,391 scaffolds, a scaffold and contig N50 of 62.3 Mbp and 45.0 Mbp and L50 of 12 and 14, respectively. The largest 28 scaffolds (>21 Mbp) span 89.7% of the assembly length. A BUSCO completeness score of 94.1% and a mapping rate above 98% suggest a high assembly completeness. Repetitive elements cover 74.93% of the genome, one of the highest proportions so far identified in vertebrate genomes. Demographic modeling using the PSMC framework indicates a peak in effective population size (50–100 kya) during the last interglacial period and suggests that the species might largely benefit from warmer water conditions, as seen today. Our updated snake pipefish assembly forms an important foundation for further analysis of the morphological and molecular changes unique to the family Syngnathidae.
- Research Article
- 10.1186/s12915-025-02384-8
- Sep 15, 2025
- BMC Biology
Background: The European hamster (Cricetus cricetus) was once a pest on European farmland, but its numbers have declined dramatically in recent decades, making it a critically endangered species throughout Europe and beyond. While it is strictly protected by EU law and several conservation, breeding and release programs have been initiated, little is known about the level of genetic erosion and inbreeding on a European scale. Results: Here, we present a chromosome-level de novo genome of a female hamster and a first population genomic analysis from the western range of the species’ distribution, using Illumina short reads (10× coverage) from 34 individuals. The genome is 2.89 Gbp long, with 11 chromosome-level scaffolds, around 600 unplaced scaffolds, and a scaffold N50 of 267 Mbp. The genome is above the average length of a mammalian genome and longer than that of other studied hamster species. Four distinct hamster populations with no admixture can be identified, indicating highly isolated populations with limited connectivity. Heterozygosity (Ho) is generally low (< 0.05%, comparable to polar bears) with some exceptions of populations with Ho near zero and a few with Ho as high as 0.2%. Conclusions: Most dramatically, the genomes of individuals used as founders for conservation breeding programs show exceptionally long runs of homozygosity, calling their long-term suitability into question. This study confirms earlier concerns about the dramatically decreasing genetic diversity of the European hamster and provides a basis for future conservation efforts, which require consideration of population genetic factors. Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-025-02384-8.
- Research Article
2
- 10.3354/aei00478
- Apr 11, 2024
- Aquaculture Environment Interactions
Low trophic aquaculture, including shellfish and seaweed farming, offers a potentially sustainable food source and may provide additional environmental benefits, including the creation of new feeding, breeding and nursery areas for fish of commercial and ecological importance. However, quantitative assessments of fish assemblages associated with aquaculture sites are lacking. We used pelagic baited remote underwater videos (BRUVs) and hook and line catches to survey summer fish assemblages at 2 integrated blue mussel Mytilus edulis and kelp (predominantly Saccharina latissima) farms in southwest UK. We recorded at least 11 finfish species across the surveys, including several of commercial importance, with farmed mussels and/or kelps supporting significantly higher levels of abundance and richness than reference areas outside farm infrastructure. Farmed kelp provided temporary habitat due to seasonal harvesting schedules, whereas farmed mussels provided greater habitat stability due to overlapping interannual growth cycles. Stomach content analysis of fish caught at the farms revealed that some low trophic level species had high proportions of amphipods in their stomachs, which also dominated epibiont assemblages at the farms. Higher trophic level fish stomachs contained several lower trophic level fish species, suggesting that farms provide new foraging grounds and support secondary and tertiary production. Although not identified to species level, juvenile fish were abundant at both farms, suggesting potential provisioning of nursery or breeding grounds; however, this needs further verification. Overall, this study provides evidence that shellfish and seaweed aquaculture can support and enhance populations of commercially and ecologically important fish species through habitat provisioning.
- Research Article
5
- 10.1094/mpmi-01-22-0008-a
- May 1, 2022
- Molecular Plant-Microbe Interactions®
Complete Genome Sequences of Four Strains of Erwinia tracheiphila: A Resource for Studying a Bacterial Plant Pathogen with a Highly Complex Genome.
- Research Article
28
- 10.1093/gigascience/giz038
- May 1, 2019
- GigaScience
Background: The Indian peafowl (Pavo cristatus) is native to South Asia and is the national bird of India. Here we present a draft genome sequence of the male blue peacock using Illumina and Oxford Nanopore technology (ONT). Results: ONT sequencing gave ∼2.3-fold sequencing coverage, whereas Illumina generated 150–base pair paired-end sequence data at 284.6-fold coverage from 5 libraries. Subsequently, we generated a 0.915-gigabase pair de novo assembly of the peacock genome with a scaffold N50 of 0.23 megabase pairs (Mb). We predict that the peacock genome contains 23,153 protein-coding genes and 75.3 Mb (7.33%) of repetitive sequences. Conclusions: We report a high-quality assembly of the peacock genome using a hybrid approach of sequences generated by both Illumina and ONT. The long-read chemistry generated by ONT was useful for addressing challenges related to de novo assembly, particularly at regions containing repetitive sequences spanning longer than the read length, and which could not be resolved with only short-read–based assembly. Contig assembly of Illumina short reads gave an N50 of 1,639 bases, whereas with ONT, the N50 increased by >9-fold to 14,749 bases. The initial contig assembly based on Illumina sequencing reads alone gave 685,241 contigs. Further scaffolding on assembled contigs using both Illumina and ONT sequencing reads resulted in a final assembly of 15,025 super-scaffolds, with an N50 of ∼0.23 Mb. Ninety-five percent of proteins predicted by homology matched with those in a public repository, verifying the completeness of our assembly. Like other phylogenetic studies of avian conserved genes, we found P. cristatus to be most closely related to Gallus gallus, followed by Meleagris gallopavo and Anas platyrhynchos. Compared with the recently published peacock genome assembly, the current, superior, hybrid assembly has greater sequencing depth, fewer non-ATGC sequences, and fewer scaffolds.
- Research Article
4
- 10.1094/pdis-04-22-0843-a
- Jul 26, 2022
- Plant Disease
Genome Sequence Resource of Albifimbria verrucaria Causing the Leaf Spot Disease of the Spinach Plant Spinacia oleracea.
- Research Article
20
- 10.1093/gigascience/giy142
- Nov 29, 2018
- GigaScience
Background: The barn swallow (Hirundo rustica) is a migratory bird that has been the focus of a large number of ecological, behavioral, and genetic studies. To facilitate further population genetics and genomic studies, we present a reference genome assembly for the European subspecies (H. r. rustica). Findings: As part of the Genome10K effort on generating high-quality vertebrate genomes (Vertebrate Genomes Project), we have assembled a highly contiguous genome assembly using single molecule real-time (SMRT) DNA sequencing and several Bionano optical map technologies. We compared and integrated optical maps derived from both the Nick, Label, Repair, and Stain technology and from the Direct Label and Stain (DLS) technology. As proposed by Bionano, DLS more than doubled the scaffold N50 with respect to the nickase. The dual enzyme hybrid scaffold led to a further marginal increase in scaffold N50 and an overall increase of confidence in the scaffolds. After removal of haplotigs, the final assembly is approximately 1.21 Gbp in size, with a scaffold N50 value of more than 25.95 Mbp. Conclusions: This high-quality genome assembly represents a valuable resource for future studies of population genetics and genomics in the barn swallow and for studies concerning the evolution of avian genomes. It also represents one of the very first genomes assembled by combining SMRT long-read sequencing with the new Bionano DLS technology for scaffolding. The quality of this assembly demonstrates the potential of this methodology to substantially increase the contiguity of genome assemblies.
- Research Article
5
- 10.1002/pld3.388
- Apr 1, 2022
- Plant direct
Cape Primroses (Streptocarpus, Gesneriaceae) are an ideal study system for investigating the genetics underlying species diversity in angiosperms. Streptocarpus rexii has served as a model species for plant developmental research for over five decades due to its unusual extended meristem activity present in the leaves. In this study, we sequenced and assembled the complete nuclear, chloroplast, and mitochondrial genomes of S. rexii using Oxford Nanopore Technologies long read sequencing. Two flow cells of PromethION sequencing resulted in 32 billion reads and were sufficient to generate a draft assembly including the chloroplast, mitochondrial and nuclear genomes, spanning 776 Mbp. The final nuclear genome assembly contained 5,855 contigs, spanning 766 Mbp of the 929‐Mbp haploid genome with an N50 of 3.7 Mbp and an L50 of 57 contigs. Over 70% of the draft genome was identified as repeats. A genome repeat library of Gesneriaceae was generated and used for genome annotation, with a total of 45,045 genes annotated in the S. rexii genome. Ks plots of the paranomes suggested a recent whole genome duplication event, shared between S. rexii and Primulina huaijiensis. A new chloroplast and mitochondrial genome assembly method, based on contig coverage and identification, was developed, and successfully used to assemble both organellar genomes of S. rexii. This method was developed into a pipeline and proved widely applicable. The nuclear genome of S. rexii and other datasets generated and reported here will be invaluable resources for further research to aid in the identification of genes involved in morphological variation underpinning plant diversification.
- Research Article
20
- 10.1101/gr.279334.124
- Nov 1, 2024
- Genome Research
The combination of ultra-long (UL) Oxford Nanopore Technologies (ONT) sequencing reads with long, accurate Pacific Biosciences (PacBio) High Fidelity (HiFi) reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, “telomere-to-telomere” genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT “Duplex” sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used “Pore-C” chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the UL reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and provides a multirun single-instrument solution for the reconstruction of complete genomes.
- Research Article
1
- 10.1093/gbe/evae253
- Feb 3, 2025
- Genome Biology and Evolution
Reference genomes are key resources in biodiversity conservation. Yet, sequencing efforts are not evenly distributed across the tree of life raising concerns over our ability to enlighten conservation with genomic data. Good-quality reference genomes remain scarce in octocorals while these species are highly relevant targets for conservation. Here, we present the first annotated reference genome in the red coral, Corallium rubrum (Linnaeus, 1758), a habitat-forming octocoral from the Mediterranean and neighboring Atlantic, impacted by overharvesting and anthropogenic warming-induced mass mortality events. Combining long reads from Oxford Nanopore Technologies (ONT), Illumina paired-end reads for improving the base accuracy of the ONT-based genome assembly, and Arima Hi-C contact data to place the sequences into chromosomes, we assembled a genome of 532 Mb (20 chromosomes, 309 scaffolds) with contig and scaffold N50 of 1.6 and 18.5 Mb, respectively. Fifty percent of the sequence (L50) was contained in seven superscaffolds. The consensus quality value of the final assembly was 42, and the single and duplicated gene completeness reported by BUSCO was 86.4% and 1%, respectively (metazoa_odb10 database). We annotated 26,348 protein-coding genes and 34,548 noncoding transcripts. This annotated chromosome-level genome assembly, one of the first in octocorals and the first in Scleralcyonacea order, is currently used in a project based on whole-genome resequencing dedicated to the conservation and management of C. rubrum.
- Research Article
112
- 10.1016/j.molp.2020.04.009
- Apr 27, 2020
- Molecular Plant
The Chromosome-Level Reference Genome of Tea Tree Unveils Recent Bursts of Non-autonomous LTR Retrotransposons in Driving Genome Size Evolution
- Research Article
1
- 10.1094/mpmi-07-21-0165-a
- Jul 14, 2022
- Molecular plant-microbe interactions : MPMI
Genome and Transcriptome Sequence Resources and Effector Repertoire of Pythium myriotylum Drechsler.
- Research Article
2
- 10.1080/17451000.2019.1662447
- Sep 17, 2019
- Marine Biology Research
Abstract: The sand lances (also known as sand-eels) are small fish belonging to the family Ammodytidae, which consists of 31 species in 7 genera. Despite their world-wide distribution, key role in ecosystems and significant commercial importance, sand lances have rarely been the subject of cytogenetic studies. The chromosomes of the small sand-eel (Ammodytes tobianus) and the great sand-eel (Hyperoplus lanceolatus) were analysed for this study. Karyotypes of both species were composed of 48 acrocentric chromosomes (FN = 48). Chromosomes of the great sand-eel were stained equally with DAPI. However, small sand-eels exhibited one pair of chromosomes with DAPI-negative blocks of chromatin located on the q-arm in the vicinity of the centromeric regions. Major and minor rDNA sites were observed on separate single chromosome pairs in both species. These results provide new data regarding the genetics of Ammodytidae, showing an ancestral teleostean type of karyotype, based on the number of chromosomes (48), their morphology (FN = 48) and the distribution of rDNA sequences.
- Research Article
18
- 10.3390/ijms24010649
- Dec 30, 2022
- International Journal of Molecular Sciences
Lepidopteran species are mostly pests, causing serious annual economic losses. High-quality genome sequencing and assembly uncover the genetic foundation of pest occurrence and provide guidance for pest control measures. Long-read sequencing technology and assembly algorithm advances have improved the ability to timeously produce high-quality genomes. Lepidoptera includes a wide variety of insects with high genetic diversity and heterozygosity. Therefore, the selection of an appropriate sequencing and assembly strategy to obtain high-quality genomic information is urgently needed. This research used silkworm as a model to test genome sequencing and assembly through high-coverage datasets by de novo assemblies. We report the first nearly complete telomere-to-telomere reference genome of silkworm Bombyx mori (P50T strain) produced by Pacific Biosciences (PacBio) HiFi sequencing, and highly contiguous and complete genome assemblies of two other silkworm strains by Oxford Nanopore Technologies (ONT) or PacBio continuous long-reads (CLR) that were unrepresented in the public database. Assembly quality was evaluated by use of BUSCO, Inspector, and EagleC. It is necessary to choose an appropriate assembler for draft genome construction, especially for low-depth datasets. For PacBio CLR and ONT sequencing, NextDenovo is superior. For PacBio HiFi sequencing, hifiasm is better. Quality assessment is essential for genome assembly and can provide better and more accurate results. For chromosome-level high-quality genome construction, we recommend using 3D-DNA with EagleC evaluation. Our study references how to obtain and evaluate high-quality genome assemblies, and is a resource for biological control, comparative genomics, and evolutionary studies of Lepidopteran pests and related species.
- Research Article
- 10.1186/s12863-025-01371-w
- Oct 29, 2025
- BMC Genomic Data
Objectives: Angelica biserrata (commonly known as “Duhuo”), a traditional Chinese medicinal herb of the genus Angelica within the Apiaceae family, is clinically valued for its therapeutic effects in dispelling wind-dampness (a TCM syndrome manifesting as inflammatory arthritis, chronic headaches, and migratory pain) and alleviating arthralgia. Its pharmacological properties are primarily attributed to coumarins. To elucidate the molecular mechanisms underlying coumarin biosynthesis and facilitate the breeding of high-coumarin cultivars, we present the first draft genome assembly and annotation of A. biserrata. The first genome assembly of A. biserrata will provide novel insights into coumarin biosynthesis and advance evolutionary biological studies. Data description: The genome of A. biserrata was sequenced using PacBio HiFi technology, generating 8.83 million high-fidelity reads with an average length of 14.2 kb (125.34 Gb, sequencing coverage 41×). The reads were assembled to give a draft genome of 4.52 Gb with an N50 contig length of 35.72 Mb. Chromosome-scale scaffolding was then performed using 300.87 Gb Hi-C data, resulting in a final genome assembly of 3.89 Gb with improved continuity (contig N50 = 34.42 Mb, scaffold N50 = 325.77 Mb). Genome completeness was 96.59% as assessed with Benchmarking Universal Single-Copy Orthologs (BUSCO), based on the embryophyta database of OrthoDB 10. In addition, 3,811.62 Mb of sequence was anchored to 11 chromosomes, accounting for 97.86% of the assembly.
- Research Article
10
- 10.1093/dnares/dsac043
- Dec 1, 2022
- DNA Research
A high-quality genome assembly is imperative to explore the evolutionary basis of characteristic attributes that define chemotype and provide essential resources for a molecular breeding strategy for enhanced production of medicinal metabolites. Here, using single-molecule high-fidelity (HiFi) sequencing reads, we report a chromosome-scale genome assembly for Chinese licorice (Glycyrrhiza uralensis), a widely used herbal and natural medicine. The entire genome assembly was achieved in eight chromosomes, with contig and scaffold N50 of 36.02 and 60.2 Mb, respectively. With only 17 assembly gaps and half of the chromosomes having no or one assembly gap, the presented genome assembly is among the best plant genomes to date. Our results showed an advantage of using highly accurate long-read HiFi sequencing data for assembling a highly heterozygous genome, including its complex repeat content. Additionally, our analysis revealed that G. uralensis experienced a recent whole-genome duplication approximately 59.02 million years ago, after a gamma (γ) whole-genome triplication event, which contributed to its present chemotype features. The metabolic gene cluster analysis identified 355 gene clusters, which included the entire biosynthesis pathway of glycyrrhizin. The genome assembly and its annotations provide an essential resource for licorice improvement through molecular breeding and the discovery of valuable genes for engineering bioactive components and understanding the evolution of specialized metabolite biosynthesis.
- Research Article
5
- 10.1111/pbi.14075
- Jun 15, 2023
- Plant Biotechnology Journal
The ricebean genome provides insight into Vigna genome evolution and facilitates genetic enhancement.
- Research Article
1
- 10.1093/gbe/evae105
- May 2, 2024
- Genome biology and evolution
In interactions between plants and herbivorous insects, the traits enabling phytophagous insects to overcome chemical defenses of their host plants have evolved multiple times. A prominent example of such adaptive key innovations in herbivorous insects is nitrile specifier proteins (NSPs) that enabled Pierinae butterflies to colonize Brassicales host plants that have a glucosinolate-myrosinase defense system. Although the evolutionary aspects of NSP-encoding genes have been studied in some Pierinae taxa (especially among Pieris butterflies), the ancestral evolutionary state of NSPs is unclear due to the limited genomic information available for species within Pierinae. Here, we generate a high-quality genome assembly and annotation of Leptosia nina, a member of a small tribe, Leptosiaini. L. nina uses as its main host Capparaceae plants, one of the ancestral hosts within Pierinae. By using ∼90-fold coverage of Oxford Nanopore long reads and Illumina short reads for subsequent polishing and error correction, we constructed a final genome assembly that consisted of 286 contigs with a total of 225.8 Mb and an N50 of 10.7 Mb. Genome annotation with transcriptome hints predicted 16,574 genes and covered 98.3% of BUSCO genes. A typical NSP gene is composed of three tandem domains found in Pierinae butterflies; unexpectedly, we found a new NSP-like gene in Pierinae composed of only two tandem domains. This newly found NSP-like gene in L. nina provides important insights into the evolutionary dynamics of domain and gene duplication events relating to host-plant adaptation in Pierinae butterflies.