Abstract

Long non-coding, tandem-repetitive regions in mitochondrial (mt) genomes of many metazoans have been notoriously difficult to characterise accurately using conventional sequencing methods. Here, we show how the use of a third-generation (long-read) sequencing and informatic approach can overcome this problem. We employed Oxford Nanopore technology to sequence genomic DNAs from a pool of adult worms of the carcinogenic parasite, Schistosoma haematobium, and used an informatic workflow to define the complete mt non-coding region(s). Using long-read data of high coverage, we defined six dominant mt genomes of 33.4 kb to 22.6 kb. Although no variation was detected in the order or lengths of the protein-coding genes, there was marked length (18.5 kb to 7.6 kb) and structural variation in the non-coding region, raising questions about the evolution and function of what might be a control region that regulates mt transcription and/or replication. The discovery here of the largest tandem-repetitive, non-coding region (18.5 kb) in a metazoan organism also raises a question about the completeness of some of the mt genomes of animals reported to date, and stimulates further explorations using a Nanopore-informatic workflow.

Introduction

Published: 11 February 2021

Mitochondrial (mt) genomes display marked diversity in size and sequence among eukaryotic lineages, ranging from 6 kb in Plasmodium falciparum (malaria parasite) to >11 Mb in Silene conica (catchfly plant) [1,2]. In contrast to those of fungi and numerous protists, the mt genomes of most metazoans appear, on published evidence, to be remarkably compact, with seemingly limited variation in size [3,4,5,6,7]. Non-coding regions are usually reported to be short, apart from a ‘control region’, which often contains tandem-repetitive elements, usually comprising no more than 1.5 kb of the mt genome [8]. Long stretches of repetitive DNA are notoriously difficult to sequence using conventional Sanger and second-generation (short-read) sequencing methods [12]. Repetitive elements that extend beyond the usual read-length capacity of these platforms (~1 kb for Sanger sequencing; 100–300 bp for second-generation methods) cannot be ...

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
