Abstract
Background: Microbes are closely associated with human health and disease, especially in densely populated cities. Understanding the microbial ecosystem of an urban environment is essential for cities to monitor the transmission of infectious diseases and detect potentially urgent threats. To this end, DNA samples have been collected and analyzed at subway stations in major cities. However, city-scale sampling at fine-grained geospatial resolution is expensive and laborious. In this paper, we introduce MetaMLAnn, a neural-network-based approach that infers microbial communities at unsampled locations from information reflecting different factors, including subway line networks, sampling material types, and microbial composition patterns.

Results: We evaluate the effectiveness of MetaMLAnn on public metagenomics datasets collected from multiple locations in the New York and Boston subway systems. The experimental results suggest that MetaMLAnn consistently performs better than five other conventional classifiers across different taxonomic ranks. At the genus level, MetaMLAnn achieves F1 scores of 0.63 and 0.72 on the New York and Boston datasets, respectively.

Conclusions: By exploiting heterogeneous features, MetaMLAnn captures the hidden interactions between microbial compositions and the urban environment, which enables precise predictions of microbial communities at unmeasured locations.
Highlights
Microbes are closely associated with human health and disease, especially in densely populated cities
In our recent work [15], we propose MetaMLAnn (Metagenomic Multi-Label Artificial neural network), a neural-network-based supervised learning model that predicts the microbial community for city-scale metagenomics
On the New York dataset, MetaMLAnn and MetaMLAnn+ perform best in terms of F1 score and ranking loss, although the precision and recall of MetaMLAnn rank second among the compared methods
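For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how an F1 score and a ranking loss can be computed for multi-label predictions, where each location is labeled with the set of taxa present. It is not the authors' evaluation code. The micro-averaging of F1, the 0.5 decision threshold, the handling of tied scores, and the names `micro_f1` and `ranking_loss` are all assumptions made for illustration; the source does not state which averaging scheme was used.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (sample, label) pairs (assumed averaging)."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def ranking_loss(y_true, scores):
    """Average fraction of (present, absent) label pairs that are misordered.

    A pair counts as misordered when the absent label scores strictly
    higher than the present one (ties are not penalized here, which is
    an assumption). Samples with no present or no absent labels are skipped.
    """
    total = 0.0
    counted = 0
    for true_row, score_row in zip(y_true, scores):
        pos = [s for t, s in zip(true_row, score_row) if t]
        neg = [s for t, s in zip(true_row, score_row) if not t]
        if not pos or not neg:
            continue
        bad = sum(1 for sp in pos for sn in neg if sn > sp)
        total += bad / (len(pos) * len(neg))
        counted += 1
    return total / counted if counted else 0.0


# Toy example: 2 locations, 3 hypothetical taxa (1 = taxon present).
y_true = [[1, 0, 1], [0, 1, 0]]
scores = [[0.9, 0.6, 0.4], [0.2, 0.7, 0.1]]
# Threshold the scores at 0.5 (assumed) to get hard predictions.
y_pred = [[1 if s >= 0.5 else 0 for s in row] for row in scores]

print(micro_f1(y_true, y_pred))
print(ranking_loss(y_true, scores))
```

Higher F1 is better, while lower ranking loss is better, which is why a method can lead on both yet still place second on precision or recall taken individually.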
Summary
Microbes are closely associated with human health and disease, especially in densely populated cities. Understanding the microbial ecosystem of an urban environment is essential for cities to monitor the transmission of infectious diseases and detect potentially urgent threats. To this end, DNA samples have been collected and analyzed at subway stations in major cities. City-scale sampling at fine-grained geospatial resolution is expensive and laborious. We introduce MetaMLAnn, a neural-network-based approach that infers microbial communities at unsampled locations from information reflecting different factors, including subway line networks, sampling material types, and microbial composition patterns.