Abstract

Colorectal cancer is the third most common cancer worldwide and has poor survival, so novel therapeutic strategies are needed. Numerous studies have observed infiltrating bacteria within primary tumor tissues derived from patients and have implicated the relative abundance of these bacteria as a contributing factor in tumor progression. Infiltrating bacteria are believed to be among the major drivers of tumorigenesis, progression, and metastasis and are, hence, promising targets for new treatments. However, measuring their abundance directly remains challenging. One potential approach is to use the unmapped reads of host whole genome sequencing (hWGS) data, which previous studies have treated as contaminants and discarded. Here, we developed rigorous bioinformatics and statistical procedures to identify tumor-infiltrating bacteria associated with colorectal cancer from such whole genome sequencing data. Our approach used the reads from whole genome sequencing of colon adenocarcinoma tissues that did not map to the human reference genome, including paired-end read pairs in which both reads were unmapped and single unmapped reads whose mates were mapped. We assembled the unmapped read pairs, remapped all of those reads to a collection of human microbiome reference genomes, and then computed the relative abundance of microbes by maximum likelihood (ML) estimation. We analyzed and compared the relative abundance and diversity of infiltrating bacteria between primary tumor tissues and associated normal blood samples. Our results showed that primary tumor tissues contained far more diverse infiltrating bacteria than normal blood samples. The relative abundance of Bacteroides fragilis, Bacteroides dorei, and Fusobacterium nucleatum was significantly higher in primary colorectal tumors. These three bacteria were among the top ten microbes in the primary tumor tissues, yet were rarely found in normal blood samples. As validation, most of these bacteria have also been closely associated with colorectal cancer in previous studies that used alternative approaches. In summary, our approach provides a new analytic technique for investigating the infiltrating bacterial community within tumor tissues. Our novel cloud-based bioinformatics and statistical pipelines for analyzing the infiltrating bacteria in colorectal tumors using the unmapped reads of whole genome sequences can be freely accessed from GitHub at https://github.com/gutmicrobes/UMIB.git.
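The maximum likelihood step mentioned in the abstract can be illustrated with a small expectation-maximization (EM) loop over a mixture model. This is a minimal sketch, not the GRAMMy implementation or the authors' pipeline code: the function names, the input layout (one list of candidate genome indices per read), and the length normalization shown here are assumptions made for illustration.

```python
def em_mixing_proportions(read_hits, n_genomes, n_iter=200, tol=1e-9):
    """Estimate the fraction of reads originating from each genome.

    read_hits: one list per read, holding the indices of the reference
               genomes that read aligned to (hypothetical input layout).
    """
    pi = [1.0 / n_genomes] * n_genomes  # start from a uniform guess
    for _ in range(n_iter):
        counts = [0.0] * n_genomes
        # E-step: split each read among its candidate genomes
        # in proportion to the current estimates.
        for hits in read_hits:
            total = sum(pi[g] for g in hits)
            if total == 0:
                continue  # read has no usable hit
            for g in hits:
                counts[g] += pi[g] / total
        n_assigned = sum(counts)
        if n_assigned == 0:
            return pi  # nothing to estimate from
        # M-step: the new proportions are the normalized expected counts.
        new_pi = [c / n_assigned for c in counts]
        converged = max(abs(a - b) for a, b in zip(pi, new_pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi


def length_normalize(pi, genome_lengths):
    """Convert read fractions to relative abundances by dividing by genome length."""
    scaled = [p / length for p, length in zip(pi, genome_lengths)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Toy data: two reads unique to genome 0, one ambiguous read, one unique to genome 1.
fractions = em_mixing_proportions([[0], [0], [0, 1], [1]], n_genomes=2)
abundances = length_normalize(fractions, genome_lengths=[5_000_000, 2_500_000])
```

Dividing by genome length matters because a longer genome yields more reads at the same cell count; without it, read fractions would overstate organisms with large genomes.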

Highlights

  • Many microbes inhabit human tissues and bodily fluids, forming a close symbiotic relationship with the host

  • We developed a cloud-based bioinformatics pipeline to analyze unmapped reads from whole genome sequencing of human tumor tissues

  • The reads in the whole genome sequencing data not mapped to the human reference genome were extracted by SAMtools, followed by PANDAseq to assemble overlapping reads, Burrows-Wheeler Aligner (BWA) to remap them to the bacterial genome reference database, and Genome Relative Abundance using Mixture Model theory (GRAMMy) to estimate relative abundance
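The extraction step in the last highlight depends on the FLAG field of each SAM record. The sketch below shows how the two read categories used in this work (pairs with both reads unmapped, and unmapped reads whose mate is mapped) can be told apart from the FLAG bits defined in the SAM specification. The function names and category labels are illustrative assumptions, and the source does not state the exact SAMtools invocation; with SAMtools, comparable selections are commonly written as `samtools view -f 12` (read and mate unmapped) and `samtools view -f 4 -F 8` (read unmapped, mate mapped).

```python
# FLAG bits from the SAM specification
PAIRED = 0x1         # template has multiple segments (paired-end)
READ_UNMAPPED = 0x4  # this read is unmapped
MATE_UNMAPPED = 0x8  # the mate of this read is unmapped


def classify_read(flag: int) -> str:
    """Return an illustrative category label for a SAM FLAG value."""
    if not flag & READ_UNMAPPED:
        return "mapped"
    if flag & PAIRED and flag & MATE_UNMAPPED:
        return "unmapped_pair"          # both reads of the pair unmapped
    if flag & PAIRED:
        return "unmapped_mate_mapped"   # unmapped read, mapped mate
    return "unmapped_single"            # unmapped read from unpaired data


def count_categories(sam_lines):
    """Tally read categories from an iterable of SAM text lines."""
    counts = {}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        category = classify_read(flag)
        counts[category] = counts.get(category, 0) + 1
    return counts
```

Under these assumptions, only the `unmapped_pair` reads would go on to overlap assembly, while `unmapped_mate_mapped` reads would be remapped as single reads.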


Summary

Introduction

Many microbes inhabit human tissues and bodily fluids, forming a close symbiotic relationship with the host. The total number of microbes found in the human body (approximately 100 trillion) is 10 times the number of human cells, and the number of genes they encode is 100 times that of the human genome. These microbes play an important role in human health by regulating the digestive, immune, respiratory, and nervous systems, and their dysbiosis has been associated with various diseases (O’Hara and Shanahan, 2006), such as inflammatory bowel disease (Norman et al, 2015), Crohn’s disease (Li et al, 2012), viral hepatitis (Kostic et al, 2012), and colorectal cancer (Littlejohn et al, 2016). Prolonged exposure to bacterial toxins may induce DNA damage in host cells, initiating tumorigenesis (Zhu, 2013). Bacteria and their products can also facilitate viral infection in host cells, thereby inducing cancer (Lax and Thomas, 2002; Almand et al, 2017).

