Abstract

Diatom biomonitoring and ecological studies can greatly benefit from DNA metabarcoding compared to conventional microscopical analysis by potentially providing more reliable and accurate data in a cost- and time-efficient way. A conventional strategy for the bioinformatic treatment of sequencing data involves the clustering of quality filtered sequences into Operational Taxonomic Units (OTUs) based on a global sequence similarity, and their assignment to taxonomy using a reference library. Then, the obtained species lists of the successfully assigned taxa are used for subsequent analyses or quality index calculation. However, the high diversity of bioinformatic methods and parameters make inter-studies comparison difficult, especially because OTUs are specific to a given study. Clustering sequences into OTUs aims to reduce the biasing effect of sequencing artefacts and to reach an approximate species level delimitation at the price of potentially grouping together sequences with different ecology. A similar bias occurs when sequences that differ from each other by their ecological preference are assigned to the same taxa. The incompleteness of reference libraries can further introduce a bias by not taking into account unassigned sequences, thus losing the ecological information they possess. In order to overcome these biases, our studies tested new approaches on de novo developed diatom indices based on periphytic samples collected from streams in France and Hungary. Index development was performed with the leave-one-out cross validation (LOOCV) technique by building a model on a training dataset containing n-1 samples and testing it on the remaining test sample. Test values were correlated with a reference environmental gradient. The model was based on the calculation of optimum and tolerance of taxonomic units along the reference gradient and a modified Zelinka-Marvan diatom index equation. Taxonomic units tested in the studies were morphospecies, OTUs (95% similarity threshold), Individual Sequence Units (ISUs, via minimal bioinformatic quality filtering) and Exact Sequence Variants (ESVs, via DADA2 denoising algorithm). The “clustering-free” approach (ISU- and ESV-based indices) performed better than the OTU-based one, providing a fine taxonomic resolution where the ecological difference on genetically close sequence variants could be detected. Thus, these indices are more adapted to a standardized and comparable routine bioassessment. The “taxonomy-free” approach revealed the ecological preferences for those molecular taxonomic units (ISUs/ESVs) that otherwise either (i) would have been assigned to the same taxa due to genetic similarity, or (ii) would not have been recognized because of their absence from the reference libraries. However, we also found that taxonomic information cannot be neglected in ecological studies when the presence of organisms under particular environmental conditions is to be explained or interpreted e.g. via the traits they possess. New types of clustering methods are welcome in the future of biomonitoring where the delimitation of taxonomic units should be refined based on a higher emphasis on their ecology rather than on morphological or genetical criteria.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.