12 Bioinformatic analysis of a large-scale equine microbiome study

M Bowman, R Jacobs, M Gordon, M Lahoti, P Samdani

doi:10.1016/j.jevs.2021.103475

Abstract

As sequencing becomes more cost-effective, large-scale microbiome studies will become commonplace, so a methodology for analyzing large volumes of microbiome data would benefit the equine science community. The MQ project (Microbiome Quotient®; Purina Animal Nutrition, Gray Summit, MO) is a large-scale, nationwide analysis of the equine microbiome. The aim of this project is to characterize the healthy, or normal, microbiome of the horse and to relate that enterotype to population demographics in order to build predictive algorithms. Horse owners are sent custom microbiome kits on request and are instructed to collect and send in fecal swabs after completing a survey that captures relevant metadata. To date, there are over 2,500 participants, and 1,358 samples have been sequenced, amounting to 54.7 GB of data over 6 sequencing runs. Due to the scale of the project, a protocol was developed to process these data efficiently. Samples were sequenced on an Illumina MiSeq (Illumina, San Diego, CA). Data were analyzed using QIIME2 (ver. 2020.11) on a dedicated cloud server. Each run was imported separately as a QIIME2 artifact. Read quality was evaluated using the interactive quality plot produced by the q2-demux plugin. Each run was then denoised and filtered separately with DADA2, using the same truncation parameters for every run, chosen from the average base position at which quality dropped below 30. Feature tables and representative sequence files for each run were then merged. The amplicon sequence variants of the combined table were then aligned and used to construct a phylogenetic tree using the q2-alignment (mafft) and q2-phylogeny (fasttree2) plugins, respectively. Taxonomic classification was performed with the q2-feature-classifier plugin using a classifier based on the SILVA genomic database. Relative abundance values were calculated from the taxonomic bar plot produced by the taxa visualizer plugin.
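The shared truncation step described above can be illustrated with a minimal Python sketch. It assumes one reading of the abstract: for each run, find the last base position before per-position quality first drops below 30, then average those positions across runs to obtain a single truncation length. The function names, the input layout (one list of per-position quality scores per run), and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean


def last_good_position(quality_by_position, threshold=30):
    """Return how many leading bases have quality >= threshold.

    quality_by_position is a hypothetical summary (for example, the median
    quality at each base position read off a quality plot).
    """
    for position, quality in enumerate(quality_by_position, start=1):
        if quality < threshold:
            return position - 1  # keep only the bases before the drop
    return len(quality_by_position)


def shared_truncation_length(runs, threshold=30):
    """Average the per-run drop-off positions into one truncation length.

    runs maps a run name to its per-position quality summary. The result is
    rounded down so that no run keeps more bases than the average suggests.
    """
    cutoffs = [last_good_position(q, threshold) for q in runs.values()]
    return int(mean(cutoffs))


# Illustrative values only, not data from the study.
example_runs = {
    "run1": [38, 37, 35, 31, 29, 25],
    "run2": [38, 36, 28, 27],
}
truncation_length = shared_truncation_length(example_runs)
```

In practice the resulting value would be passed as the truncation parameter to each run's DADA2 denoising step, so that sequence variants from different runs have the same length and can be merged.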
Within-sample diversity (α-diversity; Shannon's Index) and between-sample diversity (β-diversity; weighted and unweighted UniFrac) were calculated using the q2-diversity plugin at a sampling depth of 10,000 features. Statistical significance of the diversity metrics was also assessed in QIIME2: Shannon's Index and UniFrac measurements were analyzed with the Kruskal-Wallis test and PERMANOVA, respectively. Pairwise comparisons were adjusted with the Benjamini-Hochberg procedure. Relative abundance was assessed for each metadata category using one-way ANOVA in R (ver. 3.6.0). This protocol was effective at producing high-quality microbiome data and demonstrates a method applicable to the bioinformatic analysis of large-scale equine microbiome studies.
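For readers unfamiliar with the multiple-comparison adjustment mentioned above, the following is a minimal pure-Python sketch of the standard Benjamini-Hochberg step-up procedure. It is an illustration only: the study applied the adjustment through QIIME2, and the function name and example p-values here are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and a
    running minimum taken from the largest rank downward keeps the adjusted
    values monotone and capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Illustrative p-values only, not results from the study.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

A comparison is then called significant when its adjusted value falls below the chosen false discovery rate, for example 0.05.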
