Discovering patterns in drug-protein interactions based on their fingerprints

Weimin Luo, Keith Cc Chan

doi:10.1186/1471-2105-13-s9-s4

Open Access

Journal: BMC Bioinformatics	Publication Date: Jun 1, 2012
Citations: 18	License type: CC BY 2.0

Affiliation: Hong Kong Polytechnic University

Abstract

Background: Discovering interesting patterns in drug-protein interaction data at the molecular level can reveal hidden relationships among drugs and proteins and can therefore be of paramount importance for applications such as drug design. To discover such patterns, we propose a computational approach to analyzing the molecular data of drugs and proteins that are known to interact with each other. Specifically, we propose a data mining technique called Drug-Protein Interaction Analysis (D-PIA) to determine whether there are commonalities in the fingerprints of the substructures of interacting drug and protein molecules and, if so, whether any patterns can be generalized from them.

Method: Given a database of drug-protein interactions, D-PIA proceeds in five steps. First, for each drug in the database, the fingerprints of its molecular substructures are obtained. Second, for each protein in the database, the fingerprints of its protein domains are obtained. Third, based on the known interactions between drugs and proteins, an interdependency measure is computed between the fingerprint of each drug substructure and that of each protein domain. Fourth, based on the interdependency measure, the drug substructures and protein domains that are significantly interdependent are identified. Fifth, whether a drug-protein pair with no previously known interaction does in fact interact is predicted from the significantly interdependent substructures that the drug and the protein contain.

Results: D-PIA has been tested with real drug-protein interaction data including enzymes, ion channels, and protein-coupled receptors. The experimental results show that patterns can indeed be discovered in the interdependency relationships between the drug substructures and protein domains of interacting drugs and proteins. Based on these relationships, a testing set of drug-protein data was used to see whether D-PIA can correctly predict the existence of interactions between drug-protein pairs. The AUC of the ROC plot reached as high as 75%, which indicates that the classifier is effective.

Conclusions: D-PIA performs its tasks using only the fingerprints of drug and protein molecules, without requiring any 3D information about their structures, and is therefore very fast to compute. D-PIA has been tested with real drug-protein interaction data, and the experimental results show that it can be very useful for predicting previously unknown drug-protein interactions as well as protein-ligand interactions. It can also be used to tackle problems such as ligand specificity, which is related both directly and indirectly to drug design and discovery.
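The five steps described in the Method paragraph can be illustrated with a small sketch. The abstract does not define the interdependency measure, so the code below substitutes an adjusted residual computed over the known interacting pairs; this stand-in, the 1.96 cut-off, the function names `significant_pairs` and `predict_score`, and the toy fingerprints are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

def significant_pairs(drug_fp, prot_fp, interactions, z_crit=1.96):
    """Return {(substructure, domain): z} for positively associated pairs.

    drug_fp / prot_fp map each drug / protein to a set of fingerprint
    features (steps 1 and 2). The adjusted residual used here is an
    assumed stand-in for the paper's interdependency measure (step 3).
    """
    pairs = list(interactions)
    n = len(pairs)
    sub_count, dom_count, joint = Counter(), Counter(), Counter()
    for drug, prot in pairs:
        subs, doms = drug_fp[drug], prot_fp[prot]
        sub_count.update(subs)
        dom_count.update(doms)
        joint.update((s, d) for s in subs for d in doms)

    result = {}
    for (s, d), observed in joint.items():
        ps, pd = sub_count[s] / n, dom_count[d] / n
        expected = n * ps * pd            # co-occurrences expected if independent
        variance = expected * (1 - ps) * (1 - pd)
        if variance <= 0:                 # feature present in every pair: no signal
            continue
        z = (observed - expected) / math.sqrt(variance)
        if z > z_crit:                    # step 4: keep significant pairs only
            result[(s, d)] = z
    return result

def predict_score(drug_subs, prot_doms, sig):
    """Step 5: sum the evidence from significant pairs present in both molecules."""
    return sum(z for (s, d), z in sig.items()
               if s in drug_subs and d in prot_doms)

# Toy data (hypothetical feature identifiers, not real fingerprints).
drug_fp = {"d1": {"s1", "s2"}, "d2": {"s1"}, "d3": {"s3"}, "d4": {"s2", "s3"}}
prot_fp = {"p1": {"a"}, "p2": {"a", "b"}, "p3": {"c"}, "p4": {"c"}}
known = [("d1", "p1"), ("d2", "p2"), ("d1", "p2"),
         ("d3", "p3"), ("d4", "p4"), ("d3", "p4")]

sig = significant_pairs(drug_fp, prot_fp, known)
score_match = predict_score({"s1"}, {"a"}, sig)     # shares a significant pair
score_mismatch = predict_score({"s1"}, {"c"}, sig)  # shares none
```

In this toy example only the pairs (`s1`, `a`) and (`s3`, `c`) pass the cut-off, so an unseen drug containing `s1` receives a positive score against a protein containing `a` and a score of zero against a protein containing only `c`. Ranking candidate pairs by such a score is one way the predictions could then be evaluated with a ROC curve.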

Highlights

Discovering interesting patterns in drug-protein interaction data at the molecular level can reveal hidden relationships among drugs and proteins and can be of paramount importance for applications such as drug design
To evaluate its effectiveness, Drug-Protein Interaction Analysis (D-PIA) has been tested with real drug-protein interaction data including enzymes, ion channels, and protein-coupled receptors

Summary

Introduction

Discovering interesting patterns in drug-protein interaction data at the molecular level can reveal hidden relationships among drugs and proteins and can be of paramount importance for applications such as drug design. A mistake in a single reaction in a pathway may stop an important protein from being produced or cause too much of it to be produced. To correct such mistakes, drug molecules can be developed that interact with target protein molecules to activate or inhibit some of their functions, thereby causing a protein to be produced in greater or lesser amounts. To facilitate drug design and discovery, it would be very useful to be able to predict whether a particular drug candidate may interact with a particular target protein based on their structures at the molecular or submolecular level. Finding such ligand candidates is difficult because protein-ligand docking requires knowledge of the 3D structures of the proteins, and obtaining such knowledge can be very difficult [2].

Results

Discussion

Conclusion