Abstract

Bacteria are ubiquitous and live in complex microbial communities. Because community members differ in physiological properties and niche preferences, microbial communities respond in specific ways to environmental drivers, potentially resulting in distinct microbial fingerprints for a given environmental state. As a proof of principle, our goal was to assess the opportunities and limitations of machine learning for detecting microbial fingerprints that indicate the presence of the munition compound 2,4,6-trinitrotoluene (TNT) in southwestern Baltic Sea sediments. More than 40 environmental variables, including grain size distribution, elemental composition, and concentrations of munition compounds (mostly at pmol⋅g−1 levels), from 150 sediment samples collected at the nearshore munition dumpsite Kolberger Heide near the German city of Kiel were combined with 16S rRNA gene amplicon sequencing libraries. Predictions were made using Random Forests (RF); their robustness was validated using Artificial Neural Networks (ANN). To facilitate machine learning with microbiome data, we developed the R package phyloseq2ML. Using exclusively the 25 most classification-relevant bacterial genera, which potentially represent a TNT-indicative fingerprint, TNT presence was predicted correctly with up to 81.5% balanced accuracy. False positive classifications indicated that this approach may also identify samples in which the original TNT contamination was no longer detectable. The fact that TNT presence was not among the main drivers of microbial community composition demonstrates the sensitivity of the approach. Moreover, environmental variables yielded poorer prediction rates than microbial fingerprints. Our results suggest that microbial communities can predict even minor influencing factors in complex environments, demonstrating the potential of this approach for discovering contamination events over an integrated period of time.
Having been demonstrated for one distinct environment, this approach should be assessed in future studies for its suitability for environmental monitoring in general.
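
The abstract reports performance as balanced accuracy rather than plain accuracy. The following is a minimal sketch of that metric, assuming the standard binary definition (the mean of sensitivity and specificity); the source does not state how the authors computed it. The function name `balanced_accuracy` and the example labels are hypothetical and serve illustration only; they are not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Assumption: 1 means "TNT detected" and 0 means "TNT not detected".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must occur in y_true")
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2


# Hypothetical, made-up labels: 2 positive and 8 negative samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# Plain accuracy counts all samples equally, so the majority class dominates.
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(plain_accuracy)                     # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.75
```

In this toy case one of the two positive samples is missed. Plain accuracy still looks high because negatives dominate, whereas balanced accuracy weights both classes equally, which is why it is a common choice when one class is much rarer than the other.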

Highlights

  • Microbes are the most diverse, abundant, and ubiquitous life forms on Earth

  • The resulting variety of physiologies enables communities to respond in specific ways to environmental drivers, functioning as indicators of surrounding conditions. This principle has been demonstrated for very different habitats: it was possible to match individual human skin microbiomes with those on the occupant’s household surfaces (Wilkins et al, 2017), to associate subway microbiomes with the major cities in which they were located (Ryan, 2019), or to distinguish microbial communities in the brackish Baltic Sea along the salinity gradient (Herlemann et al, 2011) and in its anoxic regions (Thureborn et al, 2016)

  • In a previous study we demonstrated the identification of glyphosate-impacted free-living community compositions by Artificial Neural Networks (ANN) and Random Forests (RF) after an 82.45 nmol mL−1 glyphosate pulse in a lab microcosm experiment (Janßen et al, 2019b)

Introduction

Microbes are the most diverse, abundant, and ubiquitous life forms on Earth. They live in complex microbial communities, which can react rapidly to environmental changes, a result of consistent evolutionary pressures applied by fluctuating conditions (Lindh and Pinhassi, 2018). The resulting variety of physiologies enables communities to respond in specific ways to environmental drivers, functioning as indicators of surrounding conditions. This principle has been demonstrated for very different habitats: it was possible to match individual human skin microbiomes with those on the occupant’s household surfaces (Wilkins et al, 2017), to associate subway microbiomes with the major cities in which they were located (Ryan, 2019), or to distinguish microbial communities in the brackish Baltic Sea along the salinity gradient (Herlemann et al, 2011) and in its anoxic regions (Thureborn et al, 2016). Next-generation sequencing allows such large numbers of samples to be processed to extract this information, but the information may be accompanied by a large portion of data that is irrelevant to the particular indication.
