A chromosome-scale reference genome assembly of the great sand eel, Hyperoplus lanceolatus

Abstract

Despite increasing sequencing efforts, numerous fish families still lack a reference genome, which complicates genetic research. One such understudied family is the sand lances (Ammodytidae, literally “sand burrower”), a globally distributed clade of more than 30 fish species that tend to avoid tidal currents by burrowing into the sand. Here, we present the first annotated chromosome-level genome assembly of the great sand eel (Hyperoplus lanceolatus). The assembly was generated from Oxford Nanopore Technologies long reads and polished with Illumina short reads. The final assembly has a total length of 808.5 Mbp, of which 97.1% was anchored into 24 chromosome-scale scaffolds using proximity-ligation scaffolding. It is highly contiguous, with scaffold and contig N50 values of 33.7 and 31.3 Mbp, respectively, and has a BUSCO completeness score of 96.9%. Because sand lances are of great ecological and commercial importance, this assembly is a valuable resource for future studies of the family, and it may also contribute to studies aiming to resolve the suprafamiliar taxonomy of bony fishes.
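
For readers unfamiliar with the contiguity statistics quoted above, the following is a minimal sketch of how an N50 value and the fraction of an assembly held in its largest scaffolds are commonly computed. It is not the authors' pipeline. The function names and the toy lengths are hypothetical, and the sketch assumes the chromosome-scale scaffolds are simply the longest sequences in the assembly.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def anchored_fraction(lengths, n_chromosomes):
    """Fraction of the total length held by the n_chromosomes longest sequences
    (assumption: these are the chromosome-scale scaffolds)."""
    ordered = sorted(lengths, reverse=True)
    return sum(ordered[:n_chromosomes]) / sum(ordered)


# Hypothetical toy scaffold lengths in Mbp, not the real assembly.
toy = [40, 35, 30, 20, 10, 5]
toy_n50 = n50(toy)
toy_anchored = anchored_fraction(toy, 3)
```

Applied to the real scaffold lengths, the same calculation with `n_chromosomes=24` would be expected to give a value close to the 97.1% reported in the abstract, provided the assumption about the longest scaffolds holds.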

Journal: Journal of Heredity