Abstract

Background: Mycobacterium tuberculosis is an infectious bacterium posing serious threats to human health. Because molecular biology experiments to detect protein interactions are difficult to perform, reconstructing a protein interaction map of M. tuberculosis by computational methods will provide crucial information for understanding the biological processes in this pathogenic microorganism, as well as a framework upon which new therapeutic approaches can be developed.

Results: In this paper, we constructed an integrated M. tuberculosis protein interaction network by machine learning and ortholog-based methods. First, we built a support vector machine (SVM) method to infer the protein interactions of M. tuberculosis H37Rv from gene sequence information. We tested our predictors in Escherichia coli and mapped the genetic codon features underlying its protein interactions to M. tuberculosis. Moreover, the documented interactions of 14 other species were mapped to the interactome of M. tuberculosis by the interolog method. The ensemble protein interactions were validated by various functional relationships, i.e., gene coexpression, evolutionary relationship and functional similarity, extracted from heterogeneous data sources. The accuracy and validation demonstrate the effectiveness and efficiency of our framework.

Conclusions: A protein interaction map of M. tuberculosis is inferred from genetic codons and interologs. The prediction accuracy and numerical validation demonstrate the effectiveness and efficiency of our method. Furthermore, our methods can be straightforwardly extended to infer the protein interactions of other bacterial species.

Highlights

  • Mycobacterium tuberculosis is an infectious bacterium posing serious threats to human health

  • An extensive protein-protein interaction (PPI) network of M. tuberculosis can lead to more comprehensive screens of cellular operations

  • E. coli is one of the best characterized organisms [3,4] and we chose it as a model system for building the protein interaction map of M. tuberculosis

Summary

Results

Predictor performance

E. coli is one of the best characterized organisms [3,4], and we chose it as a model system for building the protein interaction map of M. tuberculosis. These results provide evidence for the effectiveness and efficiency of predicting protein interactions from genetic codons with a machine learning method. Using the validations by gene coexpression, evolutionary relationship in COGs and functional similarity, we can evaluate the reliability of interactions and filter for those pairs that are consistently supported across the different levels of information. Filtering interactions by different confidence values results in networks of different size and reliability. This will provide a valuable resource for tuberculosis research, and the applications based on our constructed protein interaction map are topics for our future research. The gene sequence information of interacting protein pairs has been learned by the predictor, and that of these known interactions is mapped to the protein pairs of M. tuberculosis. It is an interesting topic to investigate the prediction difference of the two-level sequence information.
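
The effect of confidence filtering on network size can be illustrated with a minimal sketch. It is not the authors' implementation: the protein identifiers, the scores and the helper names `filter_network` and `network_size` are hypothetical, and the sketch assumes only that each predicted pair carries a single confidence value between 0 and 1.

```python
# Hypothetical scored interactions: (protein_a, protein_b) -> confidence in [0, 1].
# Identifiers and scores are invented for illustration, not taken from the paper.
scored_pairs = {
    ("P1", "P2"): 0.95,
    ("P1", "P3"): 0.70,
    ("P2", "P4"): 0.55,
    ("P3", "P4"): 0.30,
}


def filter_network(scores, threshold):
    """Keep only the pairs whose confidence is at least `threshold`."""
    return {pair: s for pair, s in scores.items() if s >= threshold}


def network_size(network):
    """Return (number of distinct proteins, number of interactions)."""
    proteins = {protein for pair in network for protein in pair}
    return len(proteins), len(network)


# A stricter threshold gives a smaller network made of higher-confidence pairs.
for threshold in (0.25, 0.5, 0.9):
    nodes, edges = network_size(filter_network(scored_pairs, threshold))
    print(f"threshold={threshold}: {nodes} proteins, {edges} interactions")
```

Raising the threshold trades coverage for reliability, which is the trade-off described above. How the confidence values themselves are derived is not specified in this summary.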
