Abstract

Quantifying the richness of a sample, measured either as the number of observed species or approximated by estimation, is a common first step in microbiome studies. Richness is known to depend strongly on sequencing depth, which itself varies widely between samples. Rarefaction curves serve as a tool to investigate this dependency, and it is often argued that after rarefying data (sub-sampling to an equal sequencing depth), richness estimates no longer depend on sequencing depth. However, richness estimates derived from data obtained by high-throughput sequencing and processed by current bioinformatics pipelines may be subject to various sources of variation related to sequencing depth, which may confound inference based on richness estimates and cannot be removed by sub-sampling. We investigated how pipeline settings in DADA2 and deblur affect estimates of richness and showed that the use of rarefaction and sub-sampling is inappropriate when default pipeline settings are applied. Furthermore, we showed how independent sample-wise processing established spurious correlations between sequencing depth and richness estimates in data produced by DADA2, and how this problem can be solved by a pooled processing approach.
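To make the term "rarefying" concrete, the following is a minimal Python sketch of sub-sampling two samples to an equal sequencing depth and counting observed richness. It is an illustration only, not the authors' code: the function names `rarefy` and `observed_richness`, the ASV labels, and the toy counts are all hypothetical. It covers the sub-sampling step alone and does not model the depth-dependent pipeline effects in DADA2 or deblur that the abstract argues sub-sampling cannot correct.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Draw `depth` reads without replacement from a {taxon: count} mapping."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    # Expand the counts into one label per read, then sub-sample the reads.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    return Counter(rng.sample(reads, depth))


def observed_richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


# Hypothetical toy samples with very different sequencing depths.
samples = {
    "shallow": {"ASV1": 50, "ASV2": 30, "ASV3": 20},
    "deep": {"ASV1": 500, "ASV2": 300, "ASV3": 150, "ASV4": 40, "ASV5": 10},
}

# Rarefy every sample to the depth of the shallowest one.
depth = min(sum(c.values()) for c in samples.values())
rarefied = {name: rarefy(c, depth) for name, c in samples.items()}

for name, counts in rarefied.items():
    print(name, "reads:", sum(counts.values()),
          "richness:", observed_richness(counts))
```

After this step every sample has the same number of reads, which is the property that is usually taken to justify comparing richness between samples. The abstract's point is that this equalization happens after denoising, so any dependence on depth introduced earlier in the pipeline remains in the rarefied counts.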
