Abstract

Background: Microbes are closely associated with human health and disease, especially in densely populated cities. Understanding the microbial ecosystem of an urban environment is essential for cities to monitor the transmission of infectious diseases and detect potentially urgent threats. To this end, DNA samples have been collected and analyzed at subway stations in major cities. However, city-scale sampling at fine-grained geospatial resolution is expensive and laborious. In this paper, we introduce MetaMLAnn, a neural network based approach that infers microbial communities at unsampled locations from information reflecting different factors, including subway line networks, sampling material types, and microbial composition patterns.

Results: We evaluate the effectiveness of MetaMLAnn on the public metagenomics dataset collected from multiple locations in the New York and Boston subway systems. The experimental results suggest that MetaMLAnn consistently performs better than five other conventional classifiers across different taxonomic ranks. At the genus level, MetaMLAnn achieves F1 scores of 0.63 and 0.72 on the New York and Boston datasets, respectively.

Conclusions: By exploiting heterogeneous features, MetaMLAnn captures the hidden interactions between microbial compositions and the urban environment, which enables precise predictions of microbial communities at unmeasured locations.

Highlights

  • Microbes are closely associated with human health and disease, especially in densely populated cities

  • In our recent work [15], we propose MetaMLAnn (Metagenomic Multi-Label Artificial Neural Network), a supervised, neural network based model that predicts the microbial community for city-scale metagenomics

  • On the New York dataset, MetaMLAnn and MetaMLAnn+ perform best in terms of F1 score and ranking loss, although MetaMLAnn ranks second on precision and recall relative to the baselines
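The F1 score and ranking loss mentioned above are multi-label metrics: each sampling location can host several taxa at once, so a prediction is a set of labels rather than a single class. The following is a minimal sketch of how such metrics can be computed, not the authors' evaluation code. The micro-averaging of F1, the 0.5 decision threshold, the treatment of ties as ranking errors, and the toy data (two locations, four hypothetical genera) are all assumptions made for illustration.

```python
from typing import List, Sequence


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1: pool true/false positives over all locations and labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def ranking_loss(y_true: Sequence[Sequence[int]], scores: Sequence[Sequence[float]]) -> float:
    """Average fraction of (present, absent) label pairs that are ordered wrongly.

    Assumption: a tie between a present and an absent label counts as an error.
    Locations with no present labels or no absent labels are skipped.
    """
    per_location: List[float] = []
    for true_row, score_row in zip(y_true, scores):
        present = [s for t, s in zip(true_row, score_row) if t == 1]
        absent = [s for t, s in zip(true_row, score_row) if t == 0]
        if not present or not absent:
            continue
        wrong = sum(1 for p in present for a in absent if a >= p)
        per_location.append(wrong / (len(present) * len(absent)))
    return sum(per_location) / len(per_location) if per_location else 0.0


# Toy data (hypothetical): rows are sampling locations, columns are genera.
y_true = [[1, 0, 1, 0],
          [0, 1, 1, 0]]
scores = [[0.9, 0.2, 0.4, 0.6],
          [0.1, 0.8, 0.7, 0.3]]

# Assumed decision rule: a genus is predicted present if its score is >= 0.5.
y_pred = [[1 if s >= 0.5 else 0 for s in row] for row in scores]

f1 = micro_f1(y_true, y_pred)
rl = ranking_loss(y_true, scores)
print(f"micro-F1 = {f1:.3f}, ranking loss = {rl:.3f}")
```

F1 depends on the thresholded predictions, while ranking loss depends only on the ordering of the raw scores, so a model can lead on one metric and trail on the other. Higher F1 is better; lower ranking loss is better.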
