Network-based characterization of drug-protein interaction signatures with a space-efficient approach
Background: Characterizing drug-protein interaction networks with biological features has become a major challenge in pharmaceutical science, motivated by the need for a better understanding of polypharmacology. Results: We present a novel method for systematically analyzing the underlying features characteristic of drug-protein interaction networks, which we call “drug-protein interaction signatures”, by integrating large-scale heterogeneous data on drugs and proteins. We develop a new efficient algorithm for extracting informative drug-protein interaction signatures from these data, made possible by space-efficient representations for fingerprints of drug-protein pairs and sparsity-induced classifiers. Conclusions: Our method infers a set of drug-protein interaction signatures consisting of the associations between drug chemical substructures, adverse drug reactions, protein domains, biological pathways, and pathway modules. We argue that these signatures are biologically meaningful and useful for predicting unknown drug-protein interactions, and we expect them to contribute to rational drug design.
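One way to picture a space-efficient fingerprint for a drug-protein pair is to store only the indices of its non-zero features instead of a dense vector. The sketch below assumes, purely for illustration and not as the authors' confirmed construction, that a pair fingerprint is the outer (tensor) product of a binary drug fingerprint and a binary protein fingerprint. The function name, the flattening scheme, and the toy data are all hypothetical.

```python
from typing import Iterable, Set


def pair_fingerprint(drug_bits: Iterable[int],
                     protein_bits: Iterable[int],
                     n_protein_features: int) -> Set[int]:
    """Return the non-zero indices of the outer product of two binary vectors.

    Assumption: pair feature (i, j) is active exactly when drug feature i
    and protein feature j are both active. The pair (i, j) is flattened to
    the single index i * n_protein_features + j.
    """
    protein_bits = list(protein_bits)
    return {i * n_protein_features + j
            for i in drug_bits
            for j in protein_bits}


# Hypothetical toy data: indices of the features that are present.
drug = [0, 3]          # e.g. two chemical substructures
protein = [1, 2, 4]    # e.g. three protein domains
fp = pair_fingerprint(drug, protein, n_protein_features=5)

# Only len(drug) * len(protein) indices are stored, however long the
# dense pair vector would be.
print(sorted(fp))  # [1, 2, 4, 16, 17, 19]
```

A sparsity-inducing classifier trained on such index sets would then assign non-zero weights to only a small number of (drug feature, protein feature) combinations, which is what makes the selected features readable as signatures.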
- Research Article
57
- 10.1186/s40360-017-0153-6
- Jun 8, 2017
- BMC Pharmacology & Toxicology
Background: The expanded use of multiple drugs has increased the occurrence of adverse drug reactions (ADRs) induced by drug-drug interactions (DDIs). However, such reactions are typically not observed in clinical drug-development studies because most of them focus on single-drug therapies. ADR reporting systems collect information on adverse health effects caused by both single drugs and DDIs. A major challenge is to unambiguously identify the effects caused by DDIs and to attribute them to specific drug interactions. A computational method that provides prospective predictions of potential DDI-induced ADRs will help to identify and mitigate these adverse health effects. Method: We hypothesize that drug-protein interactions can be used as independent variables in predicting ADRs. We constructed drug pair-protein interaction profiles for ~800 drugs using drug-protein interaction information in the public domain. We then constructed statistical models to score drug pairs for their potential to induce ADRs based on drug pair-protein interaction profiles. Results: We used extensive clinical database information to construct categorical prediction models for drug pairs that are likely to induce ADRs via synergistic DDIs and showed that model performance deteriorated only slightly, with a moderate amount of false positives and false negatives in the training samples, as evaluated by our cross-validation analysis. The cross-validation calculations showed an average prediction accuracy of 89% across 1,096 ADR models that captured the deleterious effects of synergistic DDIs. Because the models rely on drug-protein interactions, we made predictions for pairwise combinations of 764 drugs that are currently on the market and for which drug-protein interaction information is available. These predictions are publicly accessible at http://avoid-db.bhsai.org.
We used the predictive models to analyze broader aspects of DDI-induced ADRs, showing that ~10% of all combinations have the potential to induce ADRs via DDIs. This allowed us to identify potential DDI-induced ADRs not yet clinically reported. The ability of the models to quantify adverse effects between drug classes also suggests that we may be able to select drug combinations that minimize the risk of ADRs. Conclusion: Almost all information on DDI-induced ADRs is generated after drug approval. This situation poses significant health risks for vulnerable patient populations with comorbidities. To help mitigate the risks, we developed a robust probabilistic approach to prospectively predict DDI-induced ADRs. Based on this approach, we developed prediction models for 1,096 ADRs and used them to predict the propensity of all pairwise combinations of nearly 800 drugs to be associated with these ADRs via DDIs. We made the predictions publicly available via internet access.
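The abstract does not say how the two drugs' protein interactions are combined into a drug pair-protein interaction profile. As a minimal sketch, the code below assumes the simplest option: a pair's profile is the union of the proteins that either drug interacts with, encoded as a binary vector. The function name and the drug and protein identifiers are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def pair_profiles(drug_targets: Dict[str, Set[str]]
                  ) -> Tuple[List[str], Dict[Tuple[str, str], List[int]]]:
    """Build a binary protein profile for every unordered drug pair.

    Assumption: a pair interacts with a protein if either drug does.
    """
    # Fixed column order: every protein seen for any drug, sorted by name.
    proteins = sorted(set().union(*drug_targets.values()))
    profiles = {}
    for a, b in combinations(sorted(drug_targets), 2):
        combined = drug_targets[a] | drug_targets[b]
        profiles[(a, b)] = [1 if p in combined else 0 for p in proteins]
    return proteins, profiles


# Hypothetical toy data.
targets = {
    "drug_A": {"P1", "P2"},
    "drug_B": {"P2", "P3"},
    "drug_C": {"P4"},
}
proteins, profiles = pair_profiles(targets)
print(proteins)                        # ['P1', 'P2', 'P3', 'P4']
print(profiles[("drug_A", "drug_B")])  # [1, 1, 1, 0]
```

Each profile row could then serve as the independent variables of one statistical model per ADR, in line with the hypothesis stated in the abstract.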
- Research Article
28
- 10.1002/j.1552-4604.1993.tb04663.x
- Apr 1, 1993
- The Journal of Clinical Pharmacology
Adverse drug reactions are common and troublesome complications of contemporary pharmacotherapy. Adverse drug reactions are frequently, and often incorrectly, referred to as "allergy". Although there are multiple mechanisms for adverse drug reactions, adverse drug reactions mediated by the immune system account for a disproportionate number of fatal and serious adverse reactions, and constitute a major clinical problem for patients and physicians. The immune system has evolved in multicellular organisms as a defence against infection. Interactions between drugs and the immune system occur as inadvertent consequences of the protective function of the immune system, with drug molecules or drug-carrier haptens being recognized as "non-self" by the immune system. The classical mechanisms for drug hypersensitivity described by Gell and Coombs (Types 1 to 4) include IgE-mediated, cytotoxic, immune complex-mediated and delayed mechanisms. These mechanisms provide elegant models for drug-immune interactions that can provide mechanistic explanations for events such as urticaria associated with penicillins. However, these mechanisms do not account for many of the immunologically mediated adverse reactions commonly encountered in clinical practice. Over the last two decades, there has been an increasing awareness of the importance of reactive drug metabolites and drug-protein interactions in the initiation of immunologic events mediating adverse drug reactions. Reactive drug metabolites may produce direct and profound effects on various functions of the immune system. Although some adverse reactions mediated by the immune system occur with equal frequency among adults and children, some of these reactions appear to be markedly more common among children than adults. (ABSTRACT TRUNCATED AT 250 WORDS)
- Research Article
19
- 10.3233/978-1-61499-658-3-387
- Jan 1, 2016
- Studies in health technology and informatics
Cancer is the number one cause of death in Australia with colorectal cancer being the second most common cancer type. The translation of cancer research into clinical practice is hindered by the lack of integration of heterogeneous and autonomous data from various data sources. Integration of heterogeneous data can offer researchers a comprehensive source for biospecimen identification, hypothesis formulation, hypothesis validation, cohort discovery and biomarker discovery. Alongside the increasing prominence of big data, various translational research tools such as tranSMART have emerged that can converge and analyse different types of data. In this study, we show the integration of different data types from a significant Australian colorectal cancer cohort. Additionally, colorectal cancer datasets from The Cancer Genome Atlas were also integrated for comparison. These integrated data are accessible via http://www.tcrn.unsw.edu.au/transmart. The use of translational research tools for data integration can provide a cost-effective and rapid approach to translational cancer research.
- Book Chapter
4
- 10.1007/978-3-030-43192-1_101
- Jan 1, 2020
Owing to the enormous growth of information technology, a huge amount of big data is produced daily, and heterogeneity is considered its main feature. Heterogeneous data integration remains a bottleneck, and integrating data to meet business information demands is a very difficult task. Hence, in this research work we present a novel Heterogeneous Data Integration and Analysis framework for solving the challenges associated with heterogeneous big data. Big data analysis is an information extraction technique generally used by organizations for business intelligence. However, data mining does not perform well on very large data sets because of high computational cost and insufficient memory. In this article, we propose a Convolutional Neural Network (CNN) architecture for heterogeneous big data analysis. Finally, experimental results indicate that the proposed method is the fastest data integration framework and a good analysis model for business.
- Abstract
- 10.1136/annrheumdis-2022-eular.2763
- May 23, 2022
- Annals of the Rheumatic Diseases
Background: Research regarding adverse drug reactions (ADRs) associated with the use of adalimumab in patients with inflammatory rheumatic diseases (IRDs) usually focuses on the nature and frequency of ADRs without considering...
- Abstract
1
- 10.1136/annrheumdis-2022-eular.2736
- May 23, 2022
- Annals of the Rheumatic Diseases
Background: Research regarding adverse drug reactions (ADRs) associated with the use of etanercept in patients with inflammatory rheumatic diseases (IRDs) usually focuses on the nature and frequency of ADRs without considering...
- Book Chapter
1
- 10.1007/978-94-017-1769-4_6
- Jan 1, 2002
One of the most challenging prospects facing the Data Warehouse (DW) project is the integration of heterogeneous data from different sources. The database community has spent years researching the integration of heterogeneous and distributed data. Heterogeneous Database Management Systems (HDBMS) have been presented as one of the solutions to this problem. In this paper, we propose an architecture for DW systems using an HDBMS as the middleware for data integration. This architecture uses a subset of the Common Warehouse Metamodel (CWM) specification in order to add more semantics to data integration in accordance with a standard proposal. A case study presenting the use of the proposed architecture is also shown.
- Conference Article
2
- 10.1109/bibm.2011.30
- Nov 1, 2011
Unveiling the rules that govern drug-protein interactions is of paramount importance in drug discovery. To discover such relationships, we propose a novel method called DPA. Given a set of drug-protein interactions, DPA performs its tasks in several steps: (i) for each drug involved, each of its substructures is converted into a fingerprint; (ii) for each protein involved, each of its protein domains is converted into a fingerprint; (iii) a dependency measure between each drug substructure and protein domain is then computed based on the known interactions between the drugs and proteins; and (iv) the dependency measures are then used to predict previously unknown drug-protein interactions. DPA has the advantage that it is able to perform its tasks effectively without requiring any 3D information about drug and protein structures. It makes use of molecular fingerprints, which are information-rich and fast to compute. DPA has been tested with known drug-protein interaction data including enzymes, ion channels and protein-coupled receptors. Experimental results show that it can be very useful for predicting new drug-protein interactions as well as protein-ligand interactions. It can also be used to tackle problems such as ligand specificity, thereby facilitating the drug discovery process.
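Steps (iii) and (iv) can be sketched as follows. The abstract does not specify DPA's dependency measure, so this example substitutes pointwise mutual information (the log ratio of observed to expected co-occurrence) as a stand-in, and scores a candidate pair by averaging that measure over its substructure-domain combinations. All names and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def dependency_scores(interactions, drug_feats, prot_feats):
    """Pointwise mutual information of each (substructure, domain) pair,
    estimated over the known drug-protein interactions."""
    pair_counts, sub_counts, dom_counts = Counter(), Counter(), Counter()
    for drug, prot in interactions:
        subs, doms = drug_feats[drug], prot_feats[prot]
        sub_counts.update(subs)
        dom_counts.update(doms)
        pair_counts.update(product(subs, doms))
    n = len(interactions)
    return {
        (s, d): math.log(c * n / (sub_counts[s] * dom_counts[d]))
        for (s, d), c in pair_counts.items()
    }


def score_pair(drug, prot, scores, drug_feats, prot_feats):
    """Average dependency over all substructure-domain pairs of a candidate.
    Combinations never seen together contribute 0 (an assumption)."""
    vals = [scores.get((s, d), 0.0)
            for s, d in product(drug_feats[drug], prot_feats[prot])]
    return sum(vals) / len(vals) if vals else 0.0


# Hypothetical toy data: feature sets per drug and per protein.
drug_feats = {"d1": {"s1", "s2"}, "d2": {"s1"}, "d3": {"s3"}}
prot_feats = {"p1": {"dom1"}, "p2": {"dom2"}, "p3": {"dom1"}}
known = [("d1", "p1"), ("d2", "p1"), ("d3", "p2")]

scores = dependency_scores(known, drug_feats, prot_feats)
likely = score_pair("d2", "p3", scores, drug_feats, prot_feats)
unlikely = score_pair("d3", "p3", scores, drug_feats, prot_feats)
print(likely > unlikely)  # True
```

Note that nothing here uses 3D structure: the prediction relies only on which substructures and domains co-occur among known interactions, which matches the advantage the abstract claims for DPA.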
- Research Article
120
- 10.1093/nar/gku337
- May 16, 2014
- Nucleic Acids Research
DINIES (drug–target interaction network inference engine based on supervised analysis) is a web server for predicting unknown drug–target interaction networks from various types of biological data (e.g. chemical structures, drug side effects, amino acid sequences and protein domains) in the framework of supervised network inference. The originality of DINIES lies in prediction with state-of-the-art machine learning methods, in the integration of heterogeneous biological data and in compatibility with the KEGG database. The DINIES server accepts any ‘profiles’ or precalculated similarity matrices (or ‘kernels’) of drugs and target proteins in tab-delimited file format. When a training data set is submitted to learn a predictive model, users can select either known interaction information in the KEGG DRUG database or their own interaction data. The user can also select an algorithm for supervised network inference, select various parameters in the method and specify weights for heterogeneous data integration. The server can provide integrative analyses with useful components in KEGG, such as biological pathways, functional hierarchy and human diseases. DINIES (http://www.genome.jp/tools/dinies/) is publicly available as one of the genome analysis tools in GenomeNet.
- Research Article
40
- 10.1186/s12967-019-1918-z
- May 22, 2019
- Journal of Translational Medicine
Background: Predicting adverse drug reactions (ADRs) has become very important owing to the huge global health burden and the failure of drugs. This indicates a need for predicting probable ADRs at preclinical stages, which can reduce drug failures and the time and cost of development, thus providing efficient and safer therapeutic options for patients. Though several approaches have been put forward for in silico ADR prediction, there is still room for improvement. Methods: In the present work, we used a machine learning based approach for predicting cardiovascular (CV) ADRs by integrating different features of drugs: biological (drug transporters, targets and enzymes), chemical (substructure fingerprints) and phenotypic (therapeutic indications and other identified ADRs), together with their two- and three-level combinations. To identify high-quality, important features, we used the minimum redundancy maximum relevance approach, while the synthetic minority over-sampling technique was used to balance the training sets. Results: This is a rigorous and comprehensive study which involved the generation of a total of 504 computational models for 36 CV ADRs using two state-of-the-art machine-learning algorithms: random forest and sequential minimal optimization. All the models had an accuracy of around 90%, and the models built on biological and chemical features were more informative than those generated using chemical features. Conclusions: The results demonstrated that the predictive models generated in the present study were highly accurate, and that the phenotypic information of the drugs played the most important role in ADR prediction. Furthermore, the results also showed that, using the proposed method, different drug properties can be combined to build computational predictive models which can effectively predict potential ADRs during early stages of drug development.
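The balancing step can be illustrated with a simplified SMOTE-style interpolation: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is a sketch of the general idea only; it picks base samples at random rather than iterating over every minority sample as the original algorithm does. The function name, parameters, and toy data are hypothetical, and the study's actual settings are not given in the abstract.

```python
import random
from typing import List, Sequence


def smote_like(minority: Sequence[Sequence[float]], n_new: int,
               k: int = 2, seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours (needs >= 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical minority-class feature vectors (corners of the unit square).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like(minority, n_new=6)
print(len(new_points))  # 6
```

Because every synthetic point lies between two real minority samples, the new points stay inside the region the minority class already occupies, which is why this kind of oversampling is preferred to simple duplication.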
- Research Article
29
- 10.15252/embr.201642616
- May 19, 2016
- EMBO reports
Consumer reporting of adverse drug reactions: Systems that allow patients to report side effects of the drugs they are taking have yielded valuable information for improving drug safety and health care.
- Research Article
182
- 10.2165/00002018-200022020-00007
- Jan 1, 2000
- Drug Safety
To implement a computer-based adverse drug reaction monitoring system and compare its results with those of stimulated spontaneous reporting, and to assess the excess lengths of stay and costs of patients with verified adverse drug reactions. A prospective cohort study was used to assess the efficacy of computer-based monitoring, and case-matching was used to assess excess length of stay and costs. This was a study of all patients admitted to a medical ward of a university hospital in Germany between June and December 1997. 379 patients were included, most of whom had infectious, gastrointestinal or liver diseases, or sleep apnoea syndrome. Patients admitted because of adverse drug reactions were excluded. All automatically generated laboratory signals and reports were evaluated by a team consisting of a clinical pharmacologist, a clinician and a pharmacist for their likelihood of being an adverse drug reaction. They were classified by severity and causality. For verified adverse drug reactions, control patients with similar primary diagnosis, age, gender and time of admission but without adverse drug reactions were matched to the cases in order to assess the excess length of hospitalisation caused by an adverse drug reaction. Adverse drug reactions were detected in 12% of patients by the computer-based monitoring system and stimulated spontaneous reporting together (46 adverse reactions in 45 patients) during 1718 treatment days. Computer-based monitoring identified adverse drug reactions in 34 cases, and stimulated spontaneous reporting in 17 cases. Only 5 adverse drug reactions were detected by both methods. The relative sensitivity of computer-based monitoring was 74% (relative specificity 75%), and that of stimulated spontaneous reporting was 37% (relative specificity 98%). All 3 serious adverse drug reactions were detected by computer-based monitoring, but only 2 out of the 3 were detected by stimulated spontaneous reporting. 
The percentage of automatically generated laboratory signals associated with an adverse drug reaction (positive predictive value) was 13%. The mean excess length of stay was 3.5 days per adverse drug reaction. 48% of adverse reactions were predictable and detected solely by computer-based monitoring. Therefore, the potential for savings on this ward from the introduction of computer-based monitoring can be calculated as EUR56 200/year ($US59 600/year) [1999 values]. Computer monitoring is an effective method for improving the detection of adverse drug reactions in inpatients. The excess length of stay and costs caused by adverse drug reactions are substantial and might be considerably reduced by earlier detection.
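The detection figures reported above can be checked with a few lines of arithmetic. The abstract does not define "relative sensitivity"; the sketch below assumes it means the share of all ADRs found by either method that a given method detected, an interpretation that reproduces the reported 74% and 37%. Variable names are illustrative.

```python
# Counts taken from the abstract above.
patients = 379
patients_with_adr = 45
total_adrs = 46
by_computer = 34
by_reporting = 17
by_both = 5


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Inclusion-exclusion: ADRs found by at least one of the two methods.
found_by_either = by_computer + by_reporting - by_both

incidence = pct(patients_with_adr, patients)     # patients with an ADR
sens_computer = pct(by_computer, total_adrs)     # computer-based monitoring
sens_reporting = pct(by_reporting, total_adrs)   # stimulated spontaneous reporting

print(found_by_either, incidence, sens_computer, sens_reporting)
# 46 12 74 37
```

The inclusion-exclusion total matches the 46 reported reactions, so the three counts in the abstract are mutually consistent.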
- Research Article
106
- 10.1021/ci400127q
- Oct 24, 2013
- Journal of Chemical Information and Modeling
The rapidly increasing amount of publicly available data in biology and chemistry enables researchers to revisit interaction problems by systematic integration and analysis of heterogeneous data. Herein, we developed a comprehensive Python package to emphasize the integration of chemoinformatics and bioinformatics into a molecular informatics platform for drug discovery. PyDPI (drug-protein interaction with Python) is a powerful Python toolkit for computing commonly used structural and physicochemical features of proteins and peptides from amino acid sequences, molecular descriptors of drug molecules from their topology, and protein-protein interaction and protein-ligand interaction descriptors. It computes 6 protein feature groups composed of 14 features that include 52 descriptor types and 9890 descriptors, and 9 drug feature groups composed of 13 descriptor types that include 615 descriptors. In addition, it provides seven types of molecular fingerprint systems for drug molecules, including topological fingerprints, electro-topological state (E-state) fingerprints, MACCS keys, FP4 keys, atom pair fingerprints, topological torsion fingerprints, and Morgan/circular fingerprints. By combining different types of descriptors from drugs and proteins in different ways, interaction descriptors representing protein-protein or drug-protein interactions can be conveniently generated. These computed descriptors can be widely used in various fields relevant to chemoinformatics, bioinformatics, and chemogenomics. PyDPI is freely available via https://sourceforge.net/projects/pydpicao/.
- Discussion
9
- 10.1016/j.eclinm.2019.11.009
- Nov 23, 2019
- EClinicalMedicine
Considering sex-specific adverse drug reactions should be a priority in pharmacovigilance and pharmacoepidemiological studies
- Conference Article
1
- 10.1109/bibm.2013.6732488
- Dec 1, 2013
It has been well recognized that adverse drug reactions (ADRs) are a significant cause of morbidity and mortality. There is a growing interest in investigating biological pathways involved in cellular response to drugs. Based on examining the co-occurrence of drugs in pathway activity and ADR profiles, in this paper, we propose a new method to explore the relationship between biological pathways and ADRs at a large scale. Using sparse canonical correlation analysis of 495 drugs with two profiles for 173 pathways and 1385 ADRs, a total of 80 correlated sets of pathways and ADRs were extracted. To evaluate the performance of our method, extracted correlated components were used to retrieve known ADR profiles from drug pathway profiles using a 5-fold cross validation. A relatively high prediction performance (AUC: 0.881) was achieved. This work provides a foundation for future investigation of ADRs in the context of biological pathways under different conditions.
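The AUC used to evaluate the retrieval of known ADR profiles can be computed directly from its rank interpretation: the probability that a randomly chosen true drug-ADR association receives a higher score than a randomly chosen non-association. The sketch below is a generic implementation of that definition, not the paper's evaluation code; the scores and labels are made-up toy values.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    has the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy predicted scores for four drug-ADR pairs; 1 = known association.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auc(scores, labels))  # 0.75
```

In a 5-fold cross validation, this value would be computed on each held-out fold and then averaged; an AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.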