Genetic diversity, phylogeography, population structure, and demographic history of wild Catla catla at a transboundary scale across South Asia revealed by mitochondrial COI sequences
This study presents the first multi-country assessment of mitochondrial cytochrome c oxidase I (COI) sequences to evaluate the genetic diversity, phylogeographic relationships, population structure, and demographic history of wild Catla catla in South Asia. Analysis of 133 COI sequences collected from Bangladesh, India, and Pakistan identified 18 haplotypes with moderate haplotype diversity (Hd = 0.599), low nucleotide diversity (π = 0.017), and few mutational steps among most haplotypes. The results revealed low genetic differentiation among all wild Catla samples, influenced by introgression from hatchery-reared fry and population bottlenecks. Phylogenetic analyses identified two distinct haplogroups for Pakistani populations, supporting the existence of divergent mitochondrial lineages. AMOVA showed that most genetic variation occurred within populations (74.46%) rather than among the seven river basin populations (25.54%). The high pairwise genetic distance (FST = 0.255), together with the presence of numerous population-specific haplotypes and low gene flow (Nm = 0.729), indicated significant population structure among these river populations. A positive Mantel test (r = 0.12) confirmed a significant increase in genetic divergence with increasing geographic distance. The neutrality test and the mismatch distribution gave contrasting pictures of demographic history. A significantly negative Fu’s Fs (Fs = −24.431) pointed to recent population expansion, whereas a significant Harpending’s raggedness index (r = 0.009) and a multimodal mismatch distribution suggested long-term demographic substructure. These findings provide essential COI-based baseline genetic information for conserving the genetic integrity of wild Catla catla and guiding sustainable transboundary fisheries management in South Asia.
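The reported gene flow estimate can be sanity-checked against the AMOVA result. The sketch below is illustrative only: it assumes (the abstract does not say so) that Nm was derived from Wright's island-model approximation Nm ≈ (1 − FST) / (4 FST), and that FST equals the among-population variance fraction of 25.54%. The function name `nm_from_fst` is hypothetical.

```python
def nm_from_fst(fst: float) -> float:
    """Approximate migrants per generation under Wright's island model.

    Assumption: Nm ~= (1 - FST) / (4 * FST). Some mtDNA studies use 2 rather
    than 4 in the denominator for a haploid, maternally inherited marker;
    that variant is not used here.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


# Among-population variance fraction reported by the AMOVA (25.54%).
fst = 0.2554
print(round(nm_from_fst(fst), 3))  # prints 0.729
```

Under these assumptions the value rounds to 0.729, which agrees with the Nm given in the abstract. Values below 1 are commonly read as gene flow too limited to counteract drift.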
- Peer Review Report
- 10.7554/elife.82762.sa1
- Nov 23, 2022
A score-based read selection strategy enables the assembly of novel full-length ribosomal RNA sequences for mosquitoes, which improves the physical and computational removal of interfering ribosomal RNA reads in RNA-seq and provides another molecular marker for taxonomic and phylogenetic inquiries.
- Peer Review Report
- 10.7554/elife.82762.sa2
- Dec 23, 2022
Abstract
Total RNA sequencing (RNA-seq) is an important tool in the study of mosquitoes and the RNA viruses they vector as it allows assessment of both host and viral RNA in specimens. However, there are two main constraints. First, as with many other species, abundant mosquito ribosomal RNA (rRNA) serves as the predominant template from which sequences are generated, meaning that the desired host and viral templates are sequenced far less. Second, mosquito specimens captured in the field must be correctly identified, in some cases to the sub-species level. Here, we generate mosquito rRNA datasets which will substantially mitigate both of these problems. We describe a strategy to assemble novel rRNA sequences from mosquito specimens and produce an unprecedented dataset of 234 full-length 28S and 18S rRNA sequences of 33 medically important species from countries with known histories of mosquito-borne virus circulation (Cambodia, the Central African Republic, Madagascar, and French Guiana). These sequences will allow both physical and computational removal of rRNA from specimens during RNA-seq protocols. We also assess the utility of rRNA sequences for molecular taxonomy and compare phylogenies constructed using rRNA sequences versus those created using the gold standard for molecular species identification of specimens, the mitochondrial cytochrome c oxidase I (COI) gene. We find that rRNA- and COI-derived phylogenetic trees are incongruent and that 28S and concatenated 28S+18S rRNA phylogenies reflect evolutionary relationships that are more aligned with contemporary mosquito systematics.
This significant expansion to the current rRNA reference library for mosquitoes will improve mosquito RNA-seq metagenomics by permitting the optimization of species-specific rRNA depletion protocols for a broader range of species and streamlining species identification by rRNA sequence and phylogenetics.
Editor's evaluation
Mosquitoes are important vectors for viruses and other pathogens worldwide. However, significant genomic resources are scarce for the study of these species. In this work, the authors create a significant genomic resource that will enable the study of mosquitoes and the pathogens that they carry. https://doi.org/10.7554/eLife.82762.sa0
Introduction
Mosquitoes top the list of vectors for arthropod-borne diseases, being implicated in the transmission of many human pathogens responsible for arboviral diseases, malaria, and lymphatic filariasis (WHO, 2017). Mosquito-borne viruses circulate in sylvatic (between wild animals) or urban (between humans) transmission cycles driven by different mosquito species with their own distinct host preferences. Although urban mosquito species are chiefly responsible for amplifying epidemics in dense human populations, sylvatic mosquitoes maintain the transmission of these viruses among forest-dwelling animal reservoir hosts and are involved in spillover events when humans enter their ecological niches (Valentine et al., 2019). Given that mosquito-borne virus emergence is preceded by such spillover events, continuous surveillance and virus discovery in sylvatic mosquitoes is integral to designing effective public health measures to pre-empt or respond to mosquito-borne viral epidemics. Metagenomics on field specimens is a powerful method in our toolkit to understand mosquito-borne disease ecology through the One Health lens (Webster et al., 2016).
With next-generation sequencing becoming more accessible, such studies have provided unprecedented insights into the interfaces among mosquitoes, their environment, and their animal and human hosts. As mosquito-associated viruses are mostly RNA viruses, RNA sequencing (RNA-seq) is especially informative for surveillance and virus discovery. However, working with lesser studied mosquito species poses several problems. First, metagenomics studies based on RNA-seq are bedevilled by overabundant ribosomal RNAs (rRNAs). These non-coding RNA molecules comprise at least 80% of the total cellular RNA population (Gale and Crampton, 1989). Due to their length and their abundance, they are a sink for precious next-generation sequencing reads, decreasing the sensitivity of pathogen detection unless depleted during library preparation. Yet the most common rRNA depletion protocols require prior knowledge of rRNA sequences of the species of interest as they involve hybridizing antisense oligos to the rRNA molecules prior to removal by ribonucleases (Fauver et al., 2019; Phelps et al., 2021) or by bead capture (Kukutla et al., 2013). Presently, reference sequences for rRNAs are limited to only a handful of species from three genera: Aedes, Culex, and Anopheles (Ruzzante et al., 2019). The lack of reliable rRNA depletion methods could deter mosquito metagenomics studies from expanding their sampling diversity, resulting in a gap in our knowledge of mosquito vector ecology. The inclusion of lesser studied yet medically relevant sylvatic species is therefore imperative. Second, species identification based on morphology is notoriously complicated for members of certain species subgroups. This is especially the case among Culex subgroups. Sister species are often sympatric and show at least some competence for a number of viruses, such as Japanese encephalitis virus, St Louis encephalitis virus, and Usutu virus (Nchoutpouen et al., 2019).
Although they share many morphological traits, each of these species have distinct ecologies and host preferences, thus the challenge of correctly identifying vector species can affect epidemiological risk estimation for these diseases (Farajollahi et al., 2011). DNA molecular markers are often employed to a limited degree of success to distinguish between sister species (Batovska et al., 2017; Zittra et al., 2016). To address the lack of full-length rRNA sequences in public databases, we sought to determine the 28S and 18S rRNA sequences of a diverse set of Old and New World sylvatic mosquito species from four countries representing three continents: Cambodia, the Central African Republic, Madagascar, and French Guiana. These countries, due to their proximity to the equator, contain high mosquito biodiversity (Foley et al., 2007) and have had long histories of mosquito-borne virus circulation (Desdouits et al., 2015; Halstead, 2019; Héraud et al., 2022; Jacobi and Serie, 1972; Ratsitorahina et al., 2008; Saluzzo et al., 2017; Zeller et al., 2016). Increased and continued surveillance of local mosquito species could lead to valuable insights on mosquito virus biogeography. Using a unique score-based read filtration strategy to remove interfering non-mosquito rRNA reads for accurate de novo assembly, we produced a dataset of 234 novel full-length 28S and 18S rRNA sequences from 33 mosquito species, 30 of which have never been recorded before. We also explored the functionality of 28S and 18S rRNA sequences as molecular markers by comparing their performance to that of the mitochondrial cytochrome c oxidase subunit I (COI) gene for molecular taxonomic and phylogenetic investigations. The COI gene is the most widely used DNA marker for molecular species identification and forms the basis of the Barcode of Life Data System (BOLD) (Hebert et al., 2003; Ratnasingham and Hebert, 2007). 
Presently, full-length rRNA sequences are much less represented compared to other molecular markers. However, given the availability of relevant reference sequences, 28S and concatenated 28S+18S rRNA sequences can be the better approach for molecular taxonomy and phylogenetic studies. We hope that our sequence dataset, with its species diversity and eco-geographical breadth, and the assembly strategy we describe would further facilitate the use of rRNA as markers. In addition, this dataset enables the design of species-specific oligos for cost-effective rRNA depletion for a broader range of mosquito species and streamlined molecular species identification during RNA-seq.
Results
Poor rRNA depletion using a non-specific depletion method
During library preparations of mosquito samples for RNA-seq, rRNA is routinely depleted either with commercial kits optimised for human or mouse samples (Belda et al., 2019; Bishop-Lilly et al., 2010; Chandler et al., 2015; Kumar et al., 2012; Weedall et al., 2015; Zakrzewski et al., 2018) or through 80–100 base pair antisense probe hybridisation followed by ribonuclease digestion (Fauver et al., 2019; Phelps et al., 2021). In cases where the complete reference rRNA sequence of the target species is not known, oligos would be designed based on the rRNA sequence of the closest related species (25, this study). These methods should deplete reads from the conserved regions of rRNA sequences. However, reads from the variable regions remain at abundances high enough to compromise RNA-seq output. In our hands, probes designed for the Ae. aegypti rRNA sequence followed by RNase H digestion according to the protocol published by Morlan et al., 2012, produced poor depletion in Aedes albopictus, and in Culicine and Anopheline species (Figure 1), in which between 46% and 94% of reads post-depletion were ribosomal.
Additionally, the lack of full-length reference rRNA sequences compromises the in silico clean-up of remaining rRNA reads from sequencing data, as reads belonging to variable regions would not be removed. To solve this and to enable RNA-seq metagenomics on a broader range of mosquito species, we performed RNA-seq to generate reference rRNA sequences for 33 mosquito species representing 10 genera from Cambodia, the Central African Republic, Madagascar, and French Guiana. Most of these species are associated with vector activity for various pathogens in their respective ecologies (Table 1). In parallel, we sequenced the mitochondrial COI gene to perform molecular species identification of our samples and to comparatively evaluate the use of rRNA as a molecular marker (Figure 2).
Figure 1. Percentage of rRNA reads in mosquito total RNA sequencing (RNA-seq) data after depletion using probes antisense to Aedes aegypti sequences. Pools of five individual mosquitoes from genera Aedes (Ae), Culex (Cx), Mansonia (Ma), and Anopheles (An) were ribodepleted by probe hybridisation followed by RNase H digestion according to the protocol by Morlan et al., 2012. Y-axis depicts percentages of remaining rRNA reads calculated as the number of rRNA reads over total reads per sample pool. Depletion efficiency decreases with taxonomic distance from Ae. aegypti, underlining the need for reference sequences for species of interest.
Table 1. Mosquito species represented in this study and their vector status.
| Mosquito taxonomy‡ | Origin† | Collection site (ecosystem type) | Vector for* | Reference |
| --- | --- | --- | --- | --- |
| Aedes (Fredwardsius) vittatus | CF | Rural (village) | ZIKV, CHIKV, YFV | Diallo et al., 2020 |
| Aedes (Ochlerotatus) scapularis | GF | Rural (village) | YFV | Vasconcelos et al., 2001 |
| Aedes (Ochlerotatus) serratus | GF | Rural (village) | YFV, OROV | Cardoso et al., 2010; Romero-Alvarez and Escobar, 2018 |
| Aedes (Stegomyia) aegypti | CF | Urban | DENV, ZIKV, CHIKV, YFV | Kraemer et al., 2019 |
| Aedes (Stegomyia) albopictus | CF, KH | Rural (village, nature reserve) | DENV, ZIKV, CHIKV, YFV, JEV | Auerswald et al., 2021; Kraemer et al., 2019 |
| Aedes (Stegomyia) simpsoni | CF | Rural (village) | YFV | Mukwaya et al., 2000 |
| Anopheles (Anopheles) baezai | KH | Rural (nature reserve) | Unreported | – |
| Anopheles (Anopheles) coustani | MG, CF | Rural (village) | RVFV, malaria | Mwangangi et al., 2013; Nepomichene et al., 2018; Ratovonjato et al., 2011 |
| Anopheles (Cellia) funestus | MG, CF | Rural (village) | ONNV, malaria | Lutomiah et al., 2013; Tabue et al., 2017 |
| Anopheles (Cellia) gambiae | MG, CF | Rural (village) | ONNV, malaria | Brault et al., 2004 |
| Anopheles (Cellia) squamosus | MG | Rural (village) | RVFV, malaria | Ratovonjato et al., 2011; Stevenson et al., 2016 |
| Coquillettidia (Rhynchotaenia) venezuelensis | GF | Rural (village) | OROV | Travassos da Rosa et al., 2017 |
| Culex (Culex) antennatus | MG | Rural (village) | RVFV | Nepomichene et al., 2018; Ratovonjato et al., 2011 |
| Culex (Culex) duttoni | CF | Rural (village) | Unreported | – |
| Culex (Culex) neavei | MG | Rural (village) | USUV | Nikolay et al., 2011 |
| Culex (Culex) orientalis | KH | Rural (nature reserve) | JEV | Kim et al., 2015 |
| Culex (Culex) perexiguus | MG | Rural (village) | WNV, USUV | Vezenegho et al., 2022 |
| Culex (Culex) pseudovishnui | KH | Rural (nature reserve) | JEV | Auerswald et al., 2021 |
| Culex (Culex) quinquefasciatus | MG, CF, KH | Rural (village, nature reserve) | ZIKV, JEV, WNV, DENV, SLEV, RVFV, Wuchereria bancrofti | Bhattacharya and Basu, 2016; Maquart et al., 2021; Ndiaye et al., 2016; Serra et al., 2016 |
| Culex (Culex) tritaeniorhynchus | MG, KH | Rural (village, nature reserve) | JEV, WNV, RVFV | Auerswald et al., 2021; Hayes et al., 1980; Jupp et al., 2002 |
| Culex (Melanoconion) spissipes | GF | Rural (village) | VEEV | Weaver et al., 2004 |
| Culex (Melanoconion) portesi | GF | Rural (village) | VEEV, TONV | Talaga et al., 2021; Weaver et al., 2004 |
| Culex (Melanoconion) pedroi | GF | Rural (village) | EEEV, VEEV, MADV | Talaga et al., 2021; Turell et al., 2008 |
| Culex (Oculeomyia) bitaeniorhynchus | MG, KH | Rural (village, nature reserve) | JEV | Auerswald et al., 2021 |
| Culex (Oculeomyia) poicilipes | MG | Rural (village) | RVFV | Ndiaye et al., 2016 |
| Eretmapodites intermedius | CF | Rural (village) | Unreported | – |
| Limatus durhamii | GF | Rural (village) | ZIKV | Barrio-Nuevo et al., 2020 |
| Mansonia (Mansonia) titillans | GF | Rural (village) | VEEV, SLEV | Hoyos-López et al., 2015; Turell, 1999 |
| Mansonia (Mansonioides) indiana | KH | Rural (nature reserve) | JEV | Arunachalam et al., 2004 |
| Mansonia (Mansonioides) uniformis | MG, CF, KH | Rural (village, nature reserve) | RVFV, Wuchereria bancrofti | Lutomiah et al., 2013; Ughasi et al., 2012 |
| Mimomyia (Etorleptiomyia) mediolineata | MG | Rural (village) | Unreported | – |
| Psorophora (Janthinosoma) ferox | GF | Rural (village) | ROCV | Mitchell et al., 1986 |
| Uranotaenia (Uranotaenia) geometrica | GF | Rural (village) | Unreported | – |

* Dengue virus, DENV; Zika virus, ZIKV; chikungunya virus, CHIKV; Yellow Fever virus, YFV; Oropouche virus, OROV; Japanese encephalitis virus, JEV; Rift Valley Fever virus, RVFV; O’Nyong Nyong virus, ONNV; Usutu virus, USUV; West Nile virus, WNV; St Louis encephalitis virus, SLEV; Venezuelan equine encephalitis virus, VEEV; Tonate virus, TONV; Eastern equine encephalitis virus, EEEV; Madariaga virus, MADV; Rocio virus, ROCV.
† Origin countries are listed as their ISO alpha-2 codes: Central African Republic, CF; Cambodia, KH; Madagascar, MG; French Guiana, GF.
‡ Subgenus indicated in brackets.

Figure 2. Novel mosquito rRNA sequences were obtained using a unique reads filtering method. (A) Schematic of sequencing and bioinformatics analyses performed in this study to obtain full-length 18S and 28S rRNA sequences as well as cytochrome c oxidase I (COI) DNA sequences.
Nucleic acids were isolated from mosquito specimens for next-generation (for rRNA) or Sanger (for COI) sequencing. Two in-house libraries were created from the SILVA rRNA gene database: Insecta and Non-Insecta, which comprise 8,585 and 558,185 sequences, respectively. Following BLASTn analyses against these two libraries, each RNA-sequencing (RNA-seq) read is assigned a ratio of BLASTn scores describing its relative nucleotide similarity to insect rRNA sequences. Based on these ratios of scores, RNA-seq reads can then be filtered to remove non-mosquito reads prior to assembly with SPAdes to give full-length 18S and 28S rRNA sequences. Image created with https://biorender.com/. (B) Based on their ratio of scores, reads can be segregated into four categories, as shown on this ratio of scores versus number of reads plot for the representative specimen ‘CF S27’: (i) reads with hits only in the Insecta library (shaded in green), (ii) reads with a higher score against the Insecta library (shaded in blue), (iii) reads with a higher score against the Non-Insecta library (shaded in yellow), and (iv) reads with no hits in the Insecta library (shaded in red). We applied a conservative threshold at 0.8, indicated by the black horizontal line, where only reads above this threshold are used in the assembly with SPAdes. For this given specimen, 175,671 reads (96.3% of total reads) passed the ≥0.8 cut-off, 325 reads (0.18% of total reads) had ratios of scores <0.8, while 6,423 reads (3.52%) did not have hits against the Insecta library.
rRNA reads filtering and sequence assembly
Assembling Illumina reads to reconstruct rRNA sequences from total mosquito RNA is not a straightforward task. Apart from host rRNA, total RNA samples also contain rRNA from other organisms associated with the host (microbiota, external parasites, or ingested diet).
As rRNA sequences share high homology in conserved regions, Illumina reads (150 bp) from non-host rRNA can interfere with the contig assembly of host 28S and 18S rRNA. Our score-based filtration strategy, described in detail in the Materials and methods section, allowed us to bioinformatically remove interfering rRNA reads and achieve successful de novo assembly of 28S and 18S rRNA sequences for all our specimens. Briefly, for each Illumina read, we computed a ratio of BLAST scores against an Insecta library over scores against a Non-Insecta library (Figure 2A). Based on their ratio of scores, reads could be segregated into four categories (Figure 2B): (i) reads mapping only to the Insecta library, (ii) reads mapping better to the Insecta relative to Non-Insecta library, (iii) reads mapping better to the Non-Insecta relative to the Insecta library, and (iv) reads mapping only to the Non-Insecta library. By applying a conservative threshold at 0.8 to account for the non-exhaustiveness of the SILVA database, we removed reads that likely do not originate from mosquito rRNA. Notably, 15 of our specimens were engorged with vertebrate blood, a rich source of non-mosquito rRNA (Appendix 1—table 1). The successful assembly of complete 28S and 18S rRNA sequences for these specimens demonstrates that this strategy performs as expected even with high amounts of non-host rRNA reads. This is particularly important in studies on field-captured mosquitoes as females are often sampled already having imbibed a blood meal or captured using the human landing catch technique. We encountered challenges for three specimens morphologically identified as Mansonia africana (Specimen ID S33–S35) (Appendix 1—table 1). COI amplification by PCR did not produce any product, hence COI sequencing could not be used to confirm species identity. 
In addition, the genome assembler SPAdes (Bankevich et al., 2012) was only able to assemble partial length rRNA contigs, despite the high number of reads with high scores against the Insecta library. Among other Mansonia specimens, these partial length contigs shared the highest similarity with contigs obtained from sample ‘Ma uniformis CF S51’. We then performed a guided assembly using the 28S and 18S sequences of this specimen as references, which successfully produced full-length contigs. In two of these specimens (Specimen ID S34 and S35), our assembly initially produced two sets of 28S and 18S rRNA sequences, one of which was similar to mosquito rRNA with low coverage and another with 10-fold higher coverage and 95% nucleotide sequence similarity to a water mite of genus Horreolanus known to parasitize mosquitoes. Our success in obtaining rRNA sequences for mosquito and water mite shows that our strategy can be applied to metabarcoding studies where the input material comprises multiple insect species, provided that appropriate reference sequences of the target species or of a close relative are available. Altogether, we were able to assemble 122 28S and 114 18S full-length rRNA sequences for 33 mosquito species representing 10 genera sampled from four countries across three continents. This dataset contains, to our knowledge, the first records for 30 mosquito species and for seven genera: Coquillettidia, Mansonia, Limatus, Mimomyia, Uranotaenia, Psorophora, and Eretmapodites. Individual GenBank accession numbers for these sequences and specimen information are listed in Appendix 1—table 1. 
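The score-based read filtration described in this section can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each read has already been given a best BLASTn score against each library (`None` meaning no hit), that reads hitting only the Insecta library are kept, and that reads with no Insecta hit are discarded. The function name, the read IDs, and the scores are hypothetical; only the 0.8 threshold comes from the text.

```python
from typing import Dict, List, Optional, Tuple

Scores = Tuple[Optional[float], Optional[float]]  # (Insecta, Non-Insecta)


def keep_read(insecta: Optional[float],
              non_insecta: Optional[float],
              threshold: float = 0.8) -> bool:
    """Decide whether a read is passed on to assembly.

    Assumed rules (the Methods section is not reproduced here):
      - no Insecta hit       -> discard
      - Insecta hit only     -> keep
      - hits in both         -> keep if Insecta / Non-Insecta >= threshold
    """
    if insecta is None:
        return False
    if non_insecta is None or non_insecta <= 0:
        return True
    return insecta / non_insecta >= threshold


def filter_reads(scores: Dict[str, Scores], threshold: float = 0.8) -> List[str]:
    """Return the IDs of reads that pass the ratio-of-scores filter."""
    return [read_id for read_id, (ins, non) in scores.items()
            if keep_read(ins, non, threshold)]


# Hypothetical best scores per read: (Insecta library, Non-Insecta library).
example = {
    "read_1": (250.0, None),    # Insecta only
    "read_2": (240.0, 200.0),   # higher against Insecta
    "read_3": (180.0, 200.0),   # higher against Non-Insecta, ratio 0.9
    "read_4": (90.0, 200.0),    # ratio 0.45, likely non-mosquito
    "read_5": (None, 220.0),    # no Insecta hit
}
print(filter_reads(example))  # ['read_1', 'read_2', 'read_3']
```

A threshold below 1 keeps reads such as `read_3`, which score slightly better against a non-insect reference. That is consistent with the stated motivation that the reference database is not exhaustive for mosquitoes.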
Comparative phylogeny of novel rRNA sequences relative to existing records To verify the assembly accuracy of our rRNA sequences, we constructed a comprehensive phylogenetic tree from the full-length 28S rRNA sequences from our study and relevant rRNA sequences from GenBank (Figure We applied a for GenBank sequences with at least 95% coverage of our sequence to as many species or genera as Although we found records for the species in our the resulting tree that our 28S sequences according to their respective species and by to at with the of and 28S rRNA sequences a with related sequences from Anopheles Anopheles and Anopheles high homology for or other members of (Figure in 28S rRNA sequences a to sister species Culex (Figure in Figure with 2 all Download asset Open asset 28S sequences from this study with or from existing GenBank phylogenetic tree based on full-length 28S sequences bp) from this study and from GenBank was using the method and constructed to in et al., 2018) using an Horreolanus species found among our samples as an at each from from GenBank are with and their accession numbers are For sequences from this each specimen information on and specimen ID specimens produced to two 28S this is indicated by the numbers 1 or 2 at the of the specimen genera are indicated by Culex in Anopheles in Aedes in Mansonia in in in in in in in and in at is Figure data 1 sequence of 28S rRNA sequences from this study and from GenBank Download 28S rRNA phylogenetic (Figure with GenBank Figure this study to that of 18S rRNA sequences (Figure 2). Although all rRNA trees show the of into in and other the phylogenetic relationships between the 28S and 18S rRNA trees and are The 18S rRNA tree also several taxonomic (i) the lack of by species the Culex (ii) the lack of between 18S rRNA sequences of and (iii) the of CF a Culex and (iv) the lack of a (Figure 2). 
However, 28S and 18S rRNA sequences are by in and should not be when concatenated 28S+18S rRNA sequences were from the specimens (Figure the phylogenetic tree resulting from these sequences more the 28S tree (Figure with to the of the the with in tree in 28S rRNA in concatenated 28S+18S rRNA For were higher in the concatenated tree compared to the 28S the 28S+18S rRNA tree an from genera yellow), Aedes blue), and driven by the inclusion of 18S rRNA sequences. also the found in the 18S rRNA tree and to the close between Culex and Mansonia relative to the 28S tree (Figure the Culex and Mansonia genera are no in the concatenated 28S+18S rRNA tree (Figure Culex is with to of genus Mansonia (Figure and which we to be Mansonia a distinct in 28S or 18S rRNA thus representing a of Figure with 2 all Download asset Open asset 28S and 18S rRNA sequences phylogenetic relationships that are with with higher 28S sequences This phylogenetic tree based on concatenated 28S+18S rRNA sequences bp) from this study was using the method and constructed to using et al., 2018) using an Horreolanus species found among our samples as an at each from specimen information on indicated in and specimen ID specimens produced to two 28S+18S rRNA this is indicated by the numbers 1 or 2 at the of the specimen genera are indicated by Culex in Anopheles in Aedes in Mansonia in in in in in in and in at is Figure data 1 sequence of 122 28S rRNA sequences, two sequences from Horreolanus Download Figure data 2 sequence of 114 18S rRNA sequences, two sequences from Horreolanus Download The concatenated 28S+18S rRNA tree (Figure is known the of our specimens, (i) the of from (ii) the of genus Anopheles into two Anopheles and (iii) the of genus Aedes into and (iv) the of the the Culex genus and 2016). 
rRNA as a molecular marker for taxonomy and phylogeny We sequenced a of the COI gene to confirm morphological species identification of our specimens and to compare the functionality of rRNA and COI sequences as molecular markers for taxonomic and phylogenetic investigations. COI sequences were able to determine the species in most specimens for the COI sequences from our of specimen shared high nucleotide similarity with several other Anopheles species such as the most and closest In the case of Ae. three specimens had been morphologically identified as Ae. their COI sequences similarity to that of Ae. As GenBank no records of Ae. COI at the of this we aligned the Ae. COI sequences against two sister species of Ae. Ae. and Ae. We found they shared only and respectively. Given this significant we these specimens to be Ae. were especially among Culex specimens belonging to the or where the sequence with of the top two hits by a For between and of the and between and of the Among our three specimens of two to to a species that is different from related to We that these specimens could be based on morphological similarity were not able to verify this by molecular as no COI reference sequence is for this species. These specimens are hence as ‘Ma
- Peer Review Report
- 10.7554/elife.82762.sa0
- Nov 23, 2022
- Research Article
- 10.1007/s13258-020-00993-x
- Sep 16, 2020
- Genes & genomics
The spiny eel (Sinobdella sinensis) is a small subtropical fish endemic to China, Vietnam, and Laos. It has disappeared from many rivers and lakes due to anthropogenic stressors. The aim of this study was to investigate the genetic diversity and population structure of S. sinensis and to provide pertinent information on its evolutionary history and conservation. Mitochondrial DNA (mtDNA) cytochrome c oxidase I (COI) sequences of 144 individuals from five lakes in the Jiangsu Province of Eastern China were sequenced. A total of 17 haplotypes were defined by 20 variable nucleotide sites. Remarkably low haplotype and nucleotide diversity was observed in all sampled populations. AMOVA revealed that 96.44% of the genetic variation occurred within the populations. Significant genetic differentiation was detected among populations (P < 0.05), but no large-scale regional differences were detected. Analysis of neutral evolution and mismatch distribution suggests population expansion. The wild resource of S. sinensis in Eastern China has sharply declined, and low genetic diversity and shallow population structure based on COI sequences were confirmed. Fishing management and resource conservation measures for this species should be implemented urgently.
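For readers unfamiliar with the statistic, haplotype diversity is commonly computed with Nei's estimator, Hd = n/(n − 1) · (1 − Σ pᵢ²), where pᵢ is the frequency of haplotype i among n sequences. The sketch below assumes that estimator and treats identical aligned sequences as one haplotype. The function name and the toy sequences are hypothetical and are not data from the study.

```python
from collections import Counter
from typing import Iterable


def haplotype_diversity(sequences: Iterable[str]) -> float:
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    seqs = list(sequences)
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)  # identical strings collapse into one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy alignment (hypothetical): one common haplotype plus two singletons.
toy = ["ACGT"] * 8 + ["ACGA", "TCGT"]
print(len(set(toy)), round(haplotype_diversity(toy), 3))
```

A population dominated by a single haplotype, as in the toy data, yields a low Hd even when several rare haplotypes are present.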
- Research Article
- 10.1007/s11259-025-10717-9
- Apr 1, 2025
- Veterinary research communications
Fasciola gigantica is an important trematode that affects the health of animals and humans in tropical and subtropical countries, including Malawi. Information on the genetic diversity and population structure of F. gigantica is important for understanding the parasite's transmission patterns and for monitoring the development of resistance to commonly used anthelmintic agents. This study aimed to analyze the genetic diversity and population structure of Fasciola species collected from cattle at slaughter slabs and abattoirs in selected districts of Malawi. A total of 27 adult liver flukes were collected from cattle at slaughter slabs and abattoirs in the northern region (n = 12), central region (n = 5), and southern region (n = 10). The mitochondrial cytochrome c oxidase I (COI) gene and nicotinamide adenine dinucleotide dehydrogenase 1 (ND1) gene were amplified and the amplicons were sequenced for all samples. The sequences obtained were used to investigate genetic diversity through median-joining networks and phylogenetic analysis. Tajima's D test and Fu's Fs statistics were used to determine the population structure. Based on the analyzed COI and ND1 sequences, all samples were identified as F. gigantica. Single nucleotide polymorphisms (SNPs) were identified at 18 and 17 positions for COI and ND1 genes, resulting in 10 and 5 haplotypes, respectively. The haplotype diversities were 0.867 and 0.479 for COI and ND1 gene sequences, respectively. The population genetic structure indices showed a population that has undergone a recent expansion. This study provides baseline epidemiological data on the genetic diversity and population structure of F. gigantica in Malawi, which is important for its control.
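As a reference for the neutrality test named above, the sketch below computes Tajima's D from three summary statistics using the standard formulas from Tajima (1989): sample size n, number of segregating sites S, and mean number of pairwise differences k. It is illustrative only. The function name is hypothetical, and the example inputs are the textbook values n = 10, S = 16, k ≈ 3.889, not values from this study.

```python
import math


def tajimas_d(n: int, segregating_sites: int, mean_pairwise_diff: float) -> float:
    """Tajima's D from sample size, segregating sites, and mean pairwise differences."""
    if n < 4:
        raise ValueError("the variance term is zero for n < 4")
    if segregating_sites < 1:
        raise ValueError("at least one segregating site is required")
    s = segregating_sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (mean_pairwise_diff - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


print(round(tajimas_d(10, 16, 3.888889), 3))  # prints -1.446
```

A negative D reflects an excess of rare variants, which is one of the patterns usually interpreted as a signal of recent population expansion.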
- Research Article
- 10.16250/j.32.1374.2024069
- Jun 18, 2024
- Zhongguo xue xi chong bing fang zhi za zhi = Chinese journal of schistosomiasis control
To investigate the origin of Biomphalaria straminea in China, so as to provide insights into assessment of schistosomiasis mansoni transmission risk and B. straminea control. Guanlan River, Dasha River, Shenzhen Reservoir, upper and lower reaches of Kuiyong River, and Xinzhen River in Shenzhen, China, were selected as sampling sites. Ten Biomphalaria samples were collected from each site, and genomic DNA was extracted from Biomphalaria samples. DNA samples were obtained from 15 B. straminea sampled from 5 sampling sites in Minas Gerais State, Pará State, Federal District, Pernambuco State, and Sao Paulo State in Brazil, South America. Cytochrome c oxidase I (COI) and mitochondrial 16S ribosomal RNA (16S rRNA) genes were amplified using the above DNA templates, and the amplified products were sequenced. Additional COI and 16S rRNA gene sequences were downloaded from GenBank, and their sampling sites were recorded. All COI and 16S rRNA gene sequences were aligned and evolutionary trees of B. straminea were created based on COI and 16S rRNA gene sequences to identify the genetic similarity and evolutionary relationship between B. straminea samples from China and South America. A total of 60 COI gene sequences with a length of 529 bp and 3 haplotypes were obtained from B. straminea sampled from China. There were 165 COI gene sequences of B. straminea retrieved from GenBank, and following alignment with the above 60 gene sequences, a total of 33 haplotypes were obtained. Phylogenetic analysis showed that the three haplotypes of B. straminea from China were clustered into one clade, among which the haplotype China11 and three B. straminea samples from Brazil retrieved from GenBank belonged to the same haplotype. Geographical evolution analysis showed that the B. straminea samples from three sampling sites along the eastern coast of Brazil had the same haplotype as China11, and B. straminea samples from the other two sampling sites were closely genetically related to China11.
A total of 60 16S rDNA gene sequences with approximately 322 bp in length were amplified from B. straminea in China, with 2 haplotypes identified. A total of 70 16S rDNA gene sequences of B. straminea were retrieved from GenBank. Phylogenetic analysis showed that Biomphalaria snails collected from China were clustered into a clade, and the haplotype China64 and the haplotype 229BS from Brazil belonged to the same haplotype. The 49 16S rDNA gene sequences of B. straminea from 25 sampling sites in southern Brazil, which were retrieved from GenBank, were included in the present analysis, and the B. straminea from 3 sampling sites shared the same haplotype as China64 in China. Geographical evolution analysis based on COI and 16S rRNA gene sequences showed that B. straminea sampled from eastern coastal areas of Brazil shared the same haplotypes in two gene fragment sequences with Biomphalaria snails collected from China. The Biomphalaria snails in China are characterized as B. straminea and have low genetic diversity. They have a high genetic similarity with B. straminea sampled from eastern coastal areas of Brazil, suggesting that they may have originated from the eastern coastal areas of Brazil.
- Research Article
11
- 10.7883/yoken.jjid.2017.537
- Jun 29, 2018
- Japanese Journal of Infectious Diseases
Anopheles (Anopheles) lindesayi Giles consists of 5 subspecies. In Japan, only one subspecies, An. l. japonicus Yamada, has been reported. Its geographical populations are morphologically diverse; however, they are regarded as a single subspecies. In this study, we re-evaluated the taxonomic status of An. l. japonicus in Japan, and that of another subspecies, An. l. pleccau, distributed in Taiwan, by comparative morphological and molecular analyses based on the gene sequences of mitochondrial DNA cytochrome c oxidase I (COI) and ribosomal DNA internal transcribed spacer 2 (ITS2). Nucleotide sequence divergence was calculated using the Kimura two-parameter (K2P) distance model. Phylogenetic trees based on COI and ITS2 sequences showed 3 distinct clades: Eastern Japan, Western Japan, and the Ryukyus. The sequences of the Ryukyu specimens were located within the same clade as those of the Taiwanese specimens. Regarding the COI sequences, the 3 geographical groups in Japan were genetically distinct. The following morphological characteristics distinguished the groups: larval seta 1-S, pupal setae 5 through segments IV-VII, and pupal setae 6 on segments IV-VII. Based on these results, it was revealed that An. l. japonicus included 3 genetically and morphologically distinct groups: 2 groups of An. l. japonicus and a group in the Ryukyus, which was a synonym of An. l. pleccau.
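The K2P distance named in the abstract above can be illustrated with a short sketch. This is not the authors' pipeline; it is a minimal pairwise implementation of the standard Kimura two-parameter formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. The function name `k2p_distance` and the rule of skipping gapped or ambiguous sites are assumptions made for illustration.

```python
import math

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned sequences.

    Hypothetical helper for illustration: sites where either sequence
    has a gap or an ambiguous base are skipped (an assumption, not a
    rule taken from the abstract).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitions (P)
    q = transversions / compared  # proportion of transversions (Q)
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument of the logarithm not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

In practice such distances are computed over all sequence pairs in an alignment and summarized within and between groups; the sketch covers only a single pair.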
- Research Article
- 10.1080/17451000.2026.2614357
- Mar 16, 2026
- Marine Biology Research
This study examined the population genetic structure, connectivity, and demographic history of the mangrove snail, Terebralia palustris, along the coast of Tanzania through mitochondrial COI sequence analysis. Samples were collected from Tanga, Dar es Salaam, Mtwara, Mafia, Pemba, and Unguja. Genomic DNA was extracted from muscle tissue using the CTAB method, and the COI gene was amplified by PCR with the LCO1490 and HCO2198 primer pair. Subsequent analyses of the COI sequences revealed moderate haplotype diversity and low nucleotide diversity. Although Tajima’s D and Fu’s Fs were negative but non-significant (p > 0.05), a unimodal mismatch distribution and a star-like haplotype network provided stronger evidence for recent demographic expansion in the population of T. palustris. Moderate genetic differentiation among populations was revealed by the FST value, indicating substantial gene flow and the likelihood of a single panmictic population. The absence of significant genetic structure supports treating the T. palustris fishery along the Tanzanian coast as a single unit stock. Maintaining habitat quality remains important because healthy, connected habitats facilitate gene flow among sites, helping sustain genetic variation and enhancing the species’ long-term resilience.
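Haplotype diversity, reported in the abstract above, is commonly estimated with Nei's formula Hd = n/(n - 1) × (1 - Σ p_i²), where n is the sample size and p_i is the frequency of haplotype i. The following is a minimal sketch under that assumption; the abstract does not state which estimator or software was used, and the function name `haplotype_diversity` is hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype (gene) diversity for a list of haplotype labels.

    Each element is the haplotype assigned to one sampled individual,
    e.g. ["Hap_1", "Hap_1", "Hap_2"]. Hypothetical helper for
    illustration only.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")

    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    # The n / (n - 1) factor corrects for small sample size.
    return n / (n - 1) * (1 - sum_sq)
```

A sample in which every individual carries the same haplotype gives 0, and a sample in which every individual carries a different haplotype gives 1.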
- Research Article
1
- 10.1186/s13071-025-07134-x
- Nov 24, 2025
- Parasites & Vectors
Background: Aedes aegypti, the principal vector of dengue and other arboviruses, is widely distributed in Pakistan, yet its population genetics and endosymbiont status remain poorly characterized. This study aimed to investigate the genetic structure, haplotype diversity, and phylogeographic patterns of Ae. aegypti in dengue-endemic regions of Pakistan, and to screen for natural Wolbachia infections to provide baseline data for surveillance and vector control. Methods: Ovitrap collections were conducted in 2021 across the provinces of Punjab (Bakkar) and Khyber Pakhtunkhwa (Charsadda, DI Khan, Kohat, and two sites within Peshawar: Hayat Abad and Tarnab). Following the morphological identification of adult Ae. aegypti, we extracted genomic DNA from confirmed specimens to amplify and sequence a 658-bp fragment of the mitochondrial cytochrome c oxidase I (COI) gene. Phylogenetic analyses, haplotype network construction, and population differentiation statistics were performed. Additionally, 300 field-caught adult mosquitoes were screened for Wolbachia using validated conventional and quantitative PCR assays targeting the Wolbachia surface protein (wsp) gene. Results: Phylogenetic analysis of 166 COI sequences (92 from Pakistan) revealed a monophyletic Ae. aegypti clade with 99.65–100% sequence identity, with Pakistani isolates clustering with those from Saudi Arabia, Iran, and India. In total, 13 global haplotypes were identified, with Hap_3 dominating (53%) and shared across regions. Within Pakistan, eight haplotypes were detected, including region-specific variants, yielding high overall diversity (Hd 0.69; π = 0.007). District-level analysis showed that DI Khan and Bakkar had the highest haplotype diversity (Hd 0.73 and 0.71) but low nucleotide diversity (π = 0.005–0.006), whereas Kohat exhibited no haplotype diversity. 
Population structure was higher in Pakistan (FST 0.26; Nm 0.7) than globally (FST 0.17; Nm 1.19), consistent with low gene flow among Pakistani populations. No natural Wolbachia infections were detected in Ae. aegypti. Conclusions: Aedes aegypti in Pakistan belong to a globally monophyletic lineage and show moderate mitochondrial diversity with higher population structure than the global population. The lack of detected Wolbachia infections suggests that natural strains are either absent or occur at very low prevalence. These findings provide a baseline for surveillance and support integrating Wolbachia-based biocontrol alongside conventional interventions in Pakistan.
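The FST and Nm values quoted above are linked. A common approximation, Wright's island model, gives Nm ≈ (1 - FST) / (4 FST). The abstract does not say which formula or software produced its Nm values, so the sketch below is an assumption; it does, however, roughly reproduce the reported figures (FST 0.26 gives Nm of about 0.71, and FST 0.17 gives about 1.22, close to the reported 1.19 given rounding of FST). The function name `nm_from_fst` is hypothetical.

```python
def nm_from_fst(fst: float) -> float:
    """Estimate the number of migrants per generation (Nm) from FST.

    Uses Wright's island-model approximation Nm = (1 - FST) / (4 * FST).
    Assumption: this is the conventional form with the factor of 4;
    some studies of mitochondrial markers use a factor of 2 instead.
    """
    if not 0 < fst <= 1:
        raise ValueError("FST must be in the interval (0, 1]")
    return (1 - fst) / (4 * fst)


if __name__ == "__main__":
    for label, fst in [("Pakistan", 0.26), ("global", 0.17)]:
        print(f"{label}: FST = {fst:.2f}, Nm ~ {nm_from_fst(fst):.2f}")
```

Values of Nm below 1 are conventionally read as gene flow too low to counteract drift, which matches the abstract's interpretation of the Pakistani populations.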
- Research Article
1
- 10.4103/apjtm.apjtm_790_23
- Apr 29, 2024
- Asian Pacific Journal of Tropical Medicine
Objective: To address the phylogenetic and phylogeographic relationships between different lineages of the Anopheles (An.) subpictus species complex across most of the Asian continent by making maximum use of internal transcribed spacer 2 (ITS2) and cytochrome c oxidase I (COI) sequences deposited in GenBank. Methods: Seventy-five ITS2, 210 COI and 26 concatenated sequences available in the NCBI database were used. Phylogenetic analysis was performed using Bayesian likelihood trees, whereas median-joining haplotype networks and time-scale divergence trees were generated for phylogeographic analysis. Genetic diversity indices and genetic differentiation were also calculated. Results: Two genetically divergent molecular forms of the An. subpictus species complex corresponding to sibling species A and B are established. Species A evolved around 37-82 million years ago in Sri Lanka, India, and the Netherlands, and species B evolved around 22-79 million years ago in Sri Lanka, India, and Myanmar. Vietnam, Thailand, and Cambodia have two molecular forms: one is phylogenetically similar to species B. Other forms differ from species A and B and evolved recently in the above-mentioned countries, Indonesia, and the Philippines. Genetic subdivision among Sri Lanka, India, and the Netherlands is almost absent. Substantial genetic differentiation was found for some populations due to isolation by large geographical distances. Genetic diversity indices reveal the presence of a long-established stable mosquito population, at mutation-drift equilibrium, regardless of population fluctuations. Conclusions: The An. subpictus species complex consists of more than two genetically divergent molecular forms. Species A is highly divergent from the rest. Sri Lanka and India contain only species A and B.
- Research Article
144
- 10.1186/1742-9994-6-29
- Jan 1, 2009
- Frontiers in Zoology
Background: The Palearctic region supports relatively few avian species, yet recent molecular studies have revealed that cryptic lineages likely still persist unrecognized. A broad survey of cytochrome c oxidase I (COI) sequences, or DNA barcodes, can aid on this front by providing molecular diagnostics for species assignment. Barcodes have already been extensively surveyed in the Nearctic, which provides an interesting comparison to this region; faunal interchange between these regions has been very dynamic. We explored COI sequence divergence within and between species of Palearctic birds, including samples from Russia, Kazakhstan, and Mongolia. As of yet, there is no consensus on the best method to analyze barcode data. We used this opportunity to compare and contrast three different methods routinely employed in barcoding studies: clustering-based, distance-based, and character-based methods. Results: We produced COI sequences from 1,674 specimens representing 398 Palearctic species. These were merged with published COI sequences from North American congeners, creating a final dataset of 2,523 sequences for 599 species. Ninety-six percent of the species analyzed could be accurately identified using one or a combination of the methods employed. Most species could be rapidly assigned using the cluster-based or distance-based approach alone. For a few select groups of species, the character-based method offered an additional level of resolution. Of the five groups of indistinguishable species, most were pairs, save for a larger group comprising the herring gull complex. Up to 44 species exhibited deep intraspecific divergences, many of which corresponded to previously described phylogeographic patterns and endemism hotspots. Conclusion: COI sequence divergence within eastern Palearctic birds is largely consistent with that observed in birds from other temperate regions. 
Sequence variation is primarily congruent with taxonomic boundaries; deviations from this trend reveal overlooked biological patterns, and in some cases, overlooked species. More research is needed to further refine the taxonomic status of some Palearctic birds, but large genetic surveys such as this may facilitate this effort. DNA barcodes are a practical means for rapid species assignment, although efficient analytical methods will likely require a two-tiered approach to differentiate closely related pairs of species.
- Research Article
43
- 10.1007/s10750-014-2161-5
- Jan 7, 2015
- Hydrobiologia
The potential use of Cytochrome c Oxidase I (COI)-DNA barcode sequences for the molecular identification of lanternfish larvae from the Sicilian Channel was investigated at two levels: at an interspecific level by confirming species identification based on morphological characters; and at an intraspecific level to test for the presence of geographical variation of COI-DNA sequences. A reference library of COI sequences was constructed starting from unambiguously identified specimens. Neighbor-joining analysis based on K2P genetic distances formed non-overlapping clusters for all species (Myctophum punctatum, Ceratoscopelus maderensis, Hygophum benoiti, Electrona risso and Lobianchia dofleini) with a 100% bootstrap support for each. Additional COI sequences of lanternfishes from Atlantic Ocean and Balearic Sea samples deposited in BOLD system database were included in the dataset. The present analysis allowed the identification of unknown fish larvae and indicated that there is a relative congruence between morphological and molecular identification approaches. Our preliminary data in Myctophidae species confirm that COI gene can be used as an efficient species-specific marker that is also useful for assessing the geographical provenance of larvae. This information will likely be applicable to the investigation of the population structure in these poorly studied species.
- Research Article
9
- 10.1016/j.heliyon.2022.e08846
- Jan 28, 2022
- Heliyon
This study aims to taxonomically identify and characterise the phylogenetic relationships of spiny lobsters based on mitochondrial cytochrome c oxidase I (COI) and 16S rRNA genes from Bangladesh waters. A total of 19 barcode sequences (10 partial COI sequences and 9 partial 16S rRNA) were successfully generated from 12 collected spiny lobster samples representing four species belonging to the family Palinuridae. The average genetic distances within and between species were 0.834 ± 0.427 and 17.810 ± 0.830, respectively, in COI and 0.107 ± 0.255 and 8.401 ± 2.547, respectively, in 16S rRNA genes. The successful amplification rate of 16S rRNA was higher than that of the COI marker. In the maximum likelihood (ML) tree, the sequences of the same species were clustered together under a single clade for both COI and 16S rRNA, which supports the efficacy of both marker genes in differentiating lobster species.
- Research Article
2
- 10.2139/ssrn.3906488
- Jan 1, 2021
- SSRN Electronic Journal
This study aims at the taxonomic identification, molecular characterization, and phylogenetic analysis of spiny lobsters based on mitochondrial cytochrome c oxidase I (COI) and 16S rRNA genes from Bangladesh waters. A total of 19 barcode sequences (10 partial COI sequences and 9 partial 16S rRNA) were successfully generated from 12 collected spiny lobster samples representing four species belonging to the family Palinuridae. The average genetic distances within and between species were 0.834±0.427 and 17.81±0.830, respectively, in COI and 0.107±0.255 and 8.401±2.547, respectively, in 16S rRNA genes. The successful amplification rate of 16S rRNA was higher than that of the COI marker. In the maximum likelihood (ML) tree, the sequences of the same species clustered together under a single clade for both COI and 16S rRNA, which supports the efficacy of both marker genes in differentiating lobster species.
- Research Article
36
- 10.1111/zoj.12044
- Jul 26, 2013
- Zoological Journal of the Linnean Society
Exploring phylogenetic informativeness and nuclear copies of mitochondrial DNA (numts) in three commonly used mitochondrial genes: mitochondrial phylogeny of peppermint, cleaner, and semi-terrestrial shrimps (Caridea: Lysmata, Exhippolysmata, and Merguia)