Abstract

Introduction: Many biological factors can influence the human microbiome; future population-based studies therefore need realistic sample size estimates to ensure adequate power.

Methods: We evaluated the temporal stability of fecal microbial diversity, species-level composition, and genes and functional pathways based on shallow shotgun metagenome sequencing from a population-based study in Costa Rica with two fecal samples collected six months apart. We quantified the biological variability over six months with intraclass correlation coefficients (ICC). We then estimated the number of cases, assuming a 1:1 or 1:3 matched case-control study, required to detect an association at significance levels of 0.05 and 0.001 with 80% power, based on the number of fecal specimens collected per participant.

Results: For most alpha and beta diversity metrics, the temporal stability of the fecal microbiome in samples collected six months apart was low to moderate, with ICCs of 0.6 or less. We observed heterogeneity in temporal stability for the proportions of species, genes and pathways, with ICCs varying between 0.0 and 0.9. For most microbiome measures from a single fecal collection, assuming an equal number of cases and controls at a significance level of 0.05 (for alpha and beta diversity) or 0.001 (for species, genes and pathways), detecting an odds ratio of 1.5 per standard deviation of the microbiome metric would require hundreds to thousands of cases. Specifically, detecting an association between alpha or beta diversity metrics and an outcome would require between 1,000 and 5,000 cases. In addition, for low-prevalence species (5%-10% prevalence, median ICC = 0.09), 15,102 cases would be required; in contrast, for high-prevalence species (>75% prevalence, median ICC = 0.41), 3,527 cases would be necessary; the same applies to genes and pathways. For an odds ratio of 1.5, assuming a 1:3 matched case-control study based on one fecal specimen per subject, 10,068 cases would be required for low-prevalence species and 2,351 cases for high-prevalence species. Using the same study setting with multiple specimens per subject over time, the required sample size would be lower. Indeed, detecting an odds ratio of 1.5 for low-prevalence species would require 15,102 cases with one specimen, 8,267 cases with two specimens, and 5,989 cases with three specimens.

Conclusion: Our calculations suggest that detecting modest disease associations would require a substantial number of cases. Collecting repeated prediagnostic samples, or matching each case to a greater number of controls, could decrease the number of cases required to detect these associations.

Citation Format: Semi Zouiouich, Smriti Karwa, Yunhu Wan, Emily Vogtmann, Carolina Porras, Christian C. Abnet, Jianxin Shi, Rashmi Sinha. Sample size estimations based on human microbiome temporal stability over six months: A shallow shotgun metagenome sequencing analysis [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 2174.
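The way the reported case counts change with the control ratio and with the number of specimens can be illustrated with two textbook scaling rules. The sketch below is not the authors' calculation, which the abstract does not describe in detail. It assumes (a) that moving from 1:1 to 1:m matching rescales the required cases by (m + 1) / (2m), and (b) that required cases are inversely proportional to the reliability of the per-subject mean of k specimens, given by the Spearman-Brown formula. The function names are hypothetical.

```python
def cases_for_control_ratio(cases_1to1: float, controls_per_case: int) -> float:
    """Rescale a 1:1 case count to a 1:m design at equal power.

    Assumption: the variance of the case-control contrast is proportional
    to (1/n_cases + 1/n_controls), so cases scale by (m + 1) / (2 * m).
    """
    m = controls_per_case
    if m < 1:
        raise ValueError("controls_per_case must be at least 1")
    return cases_1to1 * (m + 1) / (2 * m)


def cases_with_repeats(cases_single: float, icc: float, n_specimens: int) -> float:
    """Rescale a single-specimen case count to k averaged specimens.

    Assumption: required cases are inversely proportional to reliability.
    By the Spearman-Brown formula, averaging k specimens raises reliability
    from icc to k * icc / (1 + (k - 1) * icc), a gain of k / (1 + (k - 1) * icc).
    """
    k = n_specimens
    if k < 1:
        raise ValueError("n_specimens must be at least 1")
    reliability_gain = k / (1 + (k - 1) * icc)
    return cases_single / reliability_gain


# Inputs taken from the abstract: single-specimen, 1:1 case counts.
low_prev_cases, low_prev_icc = 15102, 0.09
high_prev_cases = 3527

print(round(cases_for_control_ratio(low_prev_cases, 3)))   # 1:3 design, low prevalence
print(round(cases_for_control_ratio(high_prev_cases, 3)))  # 1:3 design, high prevalence
for k in (1, 2, 3):
    print(k, round(cases_with_repeats(low_prev_cases, low_prev_icc, k)))
```

Under these assumptions the 1:3 rescaling returns 10,068 and 2,351 cases, matching the figures in the abstract. The repeated-specimen rescaling with the rounded ICC of 0.09 gives about 8,231 and 5,940 cases for two and three specimens, close to but not identical to the reported 8,267 and 5,989; the gap may simply reflect rounding of the published median ICC.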
