Abstract

Parasites often have complex developmental cycles that account for their presence in a variety of difficult-to-analyze matrices, including feces, water, soil, and food. Detection of parasites in these matrices still involves laborious methods. Untargeted sequencing of nucleic acids extracted from those matrices in metagenomic projects may represent an attractive alternative for unbiased detection of these pathogens. Here, we show how publicly available metagenomic datasets can be mined to detect parasite-specific sequences and to generate data useful for environmental surveillance. We use the protozoan parasite Cryptosporidium parvum as a test organism, and show that detection is influenced by the reference sequence chosen. Indeed, the use of the whole genome yields high sensitivity but low specificity, whereas specificity is improved through the use of signature sequences. In conclusion, querying metagenomic datasets for parasites is feasible and relevant, but requires optimization and validation. Nevertheless, this approach provides access to the large and rapidly increasing number of datasets from metagenomic and meta-transcriptomic studies, making it possible to uncover hitherto unexploited signals of parasites in our environments.

Highlights

  • Parasites are eukaryotic pathogens, broadly divided into single cell and multicellular organisms, which cause infection and disease in vertebrate hosts

  • In the case of the 18S rDNA sequence from Cryptosporidium spp., over 1,500 reads from different environments aligned with the reference sequence, but only four reads were confirmed to be specific for C. parvum; these reads were from a calf metagenome study (MG-RAST project numbers 4537110.3, 4536848.3, 4536849.3, and 4537108.3)

  • To expand the evidence of parasite DNA sequences in metagenomic samples obtained by using the 18S rDNA as reference, the MG-RAST projects that were positive for Cryptosporidium, both those from wastewater/sludge and those from host-associated environments, were queried again with the same settings as described above, but this time using the whole genome sequence of the C. parvum Iowa II strain as reference
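
The gap between reads that align to a reference and reads confirmed as species-specific can be illustrated with a toy two-stage screen. This is a minimal sketch and not the study's pipeline: the function name `screen_reads`, the probe strings, and the reads are all made up for illustration, and exact substring matching stands in for a real sequence aligner.

```python
def screen_reads(reads, conserved_probe, signature_probe):
    """Two-stage screen (hypothetical helper, not the study's method).

    Stage 1 keeps reads containing a broadly conserved stretch
    (sensitive, but shared across related organisms).
    Stage 2 keeps only those that also carry a species-specific
    signature (more specific).
    """
    candidates = [r for r in reads if conserved_probe in r]
    confirmed = [r for r in candidates if signature_probe in r]
    return candidates, confirmed


# Toy sequences, invented for this example only.
reads = [
    "ACGTTTGACCGGA",  # conserved stretch and signature
    "ACGTTTGACCTTA",  # conserved stretch only
    "GGGGCCCCAAAA",   # neither
]

candidates, confirmed = screen_reads(reads, "ACGTTTGA", "CCGGA")
print(len(candidates), "candidate reads;", len(confirmed), "confirmed")
```

Applied to the 18S rDNA figures reported above, four confirmed reads out of more than 1,500 aligned reads means that fewer than 0.3% of the aligned reads were specific for C. parvum.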


Summary

Introduction

Parasites are eukaryotic pathogens, broadly divided into single-celled (protozoa) and multicellular organisms (nematodes, cestodes, and trematodes), which cause infection and disease in vertebrate hosts. We started by using the small subunit ribosomal DNA (18S rDNA) sequence to query metagenome data available in the MG-RAST database, and we chose as an example a parasite expected to occur in one environment only, namely Entamoeba gingivalis in the human oral cavity.

