Abstract
Motivation: Probabilistic identification of bacterial essential genes using transposon-directed insertion-site sequencing (TraDIS) data based on Tn5 libraries has received relatively little attention in the literature; most methods are designed for mariner transposon insertions. Analysis of Tn5 transposon-based genomic data is challenging due to the high insertion density and genomic resolution. We present a novel probabilistic Bayesian approach for classifying bacterial essential genes using transposon insertion density derived from transposon insertion sequencing data. We implement a Markov chain Monte Carlo sampling procedure to estimate the posterior probability that any given gene is essential, and a Bayesian decision theory approach to selecting essential genes. We assess the effectiveness of our approach via analysis of both simulated data and three previously published Escherichia coli, Salmonella Typhimurium and Staphylococcus aureus datasets. These three bacteria have relatively well-characterized essential genes, which allows us to test our classification procedure using receiver operating characteristic curves and the area under the curve. We compare the classification performance with that of Bio-Tradis, a standard tool for bacterial gene classification.
Results: Our method is able to classify genes in the three datasets with areas under the curve between 0.967 and 0.983. Our simulated datasets show that the number of insertions and the extent to which insertions are tolerated in the distal regions of essential genes are both important in determining classification accuracy. Importantly, our method gives the user the option of classifying essential genes based on user-supplied costs of false discovery and false non-discovery.
Availability and implementation: An R package that implements the method presented in this paper is available for download from https://github.com/Kevin-walters/insdens.
Supplementary information: Supplementary data are available at Bioinformatics online.
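The cost-based classification mentioned in the abstract can be illustrated with a standard two-action Bayesian decision rule. The sketch below is not the insdens implementation and the paper's exact loss function may differ; it assumes a simple loss in which calling a non-essential gene essential costs cost_fd and missing an essential gene costs cost_fnd. Under that assumption a gene is called essential when its posterior probability exceeds cost_fd / (cost_fd + cost_fnd). The function names, gene labels and posterior values are hypothetical.

```python
from typing import Dict


def essential_threshold(cost_fd: float, cost_fnd: float) -> float:
    """Posterior-probability cut-off implied by the two costs.

    Assumed loss: calling a gene essential costs cost_fd if it is not,
    calling it non-essential costs cost_fnd if it is essential.
    Expected loss of "essential" is (1 - p) * cost_fd, and of
    "non-essential" is p * cost_fnd, so "essential" is the cheaper
    action when p > cost_fd / (cost_fd + cost_fnd).
    """
    if cost_fd <= 0 or cost_fnd <= 0:
        raise ValueError("costs must be positive")
    return cost_fd / (cost_fd + cost_fnd)


def classify(posteriors: Dict[str, float],
             cost_fd: float,
             cost_fnd: float) -> Dict[str, bool]:
    """Map each gene to True (called essential) or False."""
    threshold = essential_threshold(cost_fd, cost_fnd)
    return {gene: p > threshold for gene, p in posteriors.items()}


# Hypothetical posterior probabilities, e.g. as estimated by an MCMC run.
example_posteriors = {"gene_a": 0.9, "gene_b": 0.6, "gene_c": 0.1}

# A false discovery is treated as three times as costly as a false
# non-discovery, which raises the cut-off from 0.5 to 0.75.
calls = classify(example_posteriors, cost_fd=3.0, cost_fnd=1.0)
```

With equal costs the rule reduces to calling a gene essential when its posterior probability exceeds 0.5; raising the false-discovery cost makes the calls more conservative.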
Highlights
Bacterial essential genes are those required for growth and survival
Probabilistic identification of bacterial essential genes using transposon-directed insertion-site sequencing (TraDIS) data based on transposon 5 (Tn5) libraries has received relatively little attention in the literature; most methods are designed for mariner transposon insertions
We present a novel probabilistic Bayesian approach for classifying bacterial essential genes using transposon insertion density derived from transposon insertion sequencing data
Summary
Bacterial essential genes are those required for growth and survival (i.e. viability). Genes that are required under almost all growth conditions are termed generally or unconditionally essential. Such genes perform fundamental processes required in all organisms, such as DNA replication, as well as other functions required for the organism’s particular lifestyle (Chao et al., 2016). Genome-wide experimental approaches for identifying essential bacterial genes, or virulence genes needed for in vivo survival, have seen considerable progress, and the genes they identify are potential drug targets (Friedman and Hughes, 2003).