Abstract
Background: Assignment of chemical compounds to biological pathways is a crucial step toward understanding the relationship between the chemical repertory of an organism and its biology. Protein sequence profiles are very successful in capturing the main structural and functional features of a protein family, and can be used to assign new members to it by matching their sequences against these profiles. In this work, we extend this idea to chemical compounds, constructing a profile-inspired model for a set of related metabolites (those in the same biological pathway), based on a fragment-based vectorial representation of their chemical structures.

Results: We use this representation to predict the biological pathway of a chemical compound with good overall accuracy (AUC 0.74–0.90 depending on the database tested), and analyze some factors that affect performance. The approach, which is compared with equivalent methods, can in addition detect the molecular fragments characteristic of a pathway.

Conclusions: The method is available as a graphical interactive web server at http://csbg.cnb.csic.es/iFragMent.
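To make the profile idea concrete, here is a minimal sketch of how a fragment-based pathway profile might be built and used to score a query compound. It is not the published method: the fragment labels, the pseudocount, the function names `build_profile` and `score_compound`, and the averaging score are all assumptions chosen for illustration only.

```python
from collections import Counter


def build_profile(members, pseudocount=1.0):
    """Return a per-fragment frequency among the pathway members.

    `members` is a list of fragment sets, one per metabolite.
    The pseudocount smoothing is an assumption, not taken from the paper.
    """
    counts = Counter(frag for frags in members for frag in set(frags))
    n = len(members)
    return {
        frag: (count + pseudocount) / (n + 2 * pseudocount)
        for frag, count in counts.items()
    }


def score_compound(fragments, profile, unseen=0.0):
    """Average the profile frequencies over the query's fragments.

    Fragments never seen in the pathway contribute `unseen` (assumed 0.0).
    """
    frags = set(fragments)
    if not frags:
        return 0.0
    return sum(profile.get(f, unseen) for f in frags) / len(frags)


# Hypothetical toy pathway: three metabolites described by made-up fragments.
pathway_members = [
    {"C=O", "O-H", "C-N"},
    {"C=O", "O-H"},
    {"C=O", "C-C"},
]
profile = build_profile(pathway_members)

similar_query = score_compound({"C=O", "O-H"}, profile)
dissimilar_query = score_compound({"S-S", "C-C"}, profile)
```

In this toy setting, a compound sharing frequent fragments with the pathway scores higher than one that does not, and the highest-frequency entries of `profile` play the role of fragments characteristic of the pathway.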
Highlights
Assignment of chemical compounds to biological pathways is a crucial step to understand the relationship between the chemical repertory of an organism and its biology
Our results show that the proposed method predicts biological pathways for chemical compounds with a global area under the receiver operating characteristic (ROC) curve (AUC) ranging from 0.74 to 0.90, depending on the database considered
We evaluated the performance of the method using a tenfold cross-validation approach
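The evaluation terms used in the highlights above can be illustrated with a short sketch: splitting items into ten folds and computing AUC as the fraction of positive/negative pairs ranked correctly. The helper names `kfold_indices` and `auc`, the random seed, and the toy labels and scores are assumptions for illustration; the source does not describe how the folds were drawn.

```python
import random


def kfold_indices(n_items, k=10, seed=0):
    """Shuffle item indices and deal them into k roughly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


# Each fold serves once as the held-out test set; the rest is training data.
folds = kfold_indices(25, k=10)

# Toy example: 1 = compound belongs to the pathway, 0 = it does not.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported 0.74 to 0.90 range.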
Summary
Assignment of chemical compounds to biological pathways is a crucial step toward understanding the relationship between the chemical repertory of an organism and its biology. Studying the roles of chemical compounds in a cellular context is fundamental for understanding living systems at the molecular level [1]. This can be achieved with experimental and computational approaches. Most biological pathways contain only a small number of metabolites, which could be a problem for machine learning approaches.