Abstract
The identification of drug target proteins (IDTP) plays a critical role in biometrics. The aim of this study was to retrieve potential drug target proteins (DTPs) from a collected protein dataset, a demanding task of great significance. Previously reported methodologies for this task generally employ protein-protein interaction networks but neglect informative biochemical attributes. We formulated a novel framework that uses biochemical attributes to address this problem. In the framework, a biased support vector machine (BSVM) is combined with a deep embedded representation extracted by a deep learning model, stacked auto-encoders (SAEs). When the non-drug target proteins (NDTPs) are contaminated by DTPs, the framework is beneficial because the SAE provides an efficient representation and the BSVM relieves the effect of class imbalance. The experimental results demonstrated the effectiveness of our framework, and its generalization capability was confirmed via comparisons with other models. This study is the first to exploit a deep learning model for IDTP. In summary, nearly 23% of the NDTPs were predicted as likely DTPs; these predictions await further verification through biomedical experiments.
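The abstract does not state the BSVM objective. As a hedged illustration only, one common formulation of a biased (cost-sensitive) SVM assigns separate penalties to errors on the two classes. The symbols below are assumptions, not taken from the source: $x_i$ is the embedded representation of protein $i$, $y_i = +1$ marks a DTP and $y_i = -1$ an NDTP, and $C_+$ and $C_-$ are the class-specific penalties.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w \rVert^{2}
  + C_{+} \sum_{i:\,y_i = +1} \xi_i
  + C_{-} \sum_{i:\,y_i = -1} \xi_i \\
\text{s.t.}\quad & y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i,
  \qquad \xi_i \ge 0 \quad \text{for all } i .
\end{aligned}
```

Under this formulation, choosing $C_+ > C_-$ makes an error on a known DTP cost more than an error on an NDTP, which is one way to tolerate NDTPs that may actually be DTPs.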
Highlights
In the domain of drug development, the identification of drug target proteins (IDTP) is both significant and challenging, and it has attracted much interest from pharmaceutical and biomedical researchers.
The biased support vector machine (BSVM) was trained on a different set comprising 70% of the proteins, obtained by random stratified partitioning, and the remaining 30% of the proteins were used for testing.
We trained another BSVM on the same training set as was used for the original representation in that iteration, with parameters selected from the same range as above according to the average F1 score in 5-fold cross-validation.
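The stratified 70/30 partition and the F1 criterion mentioned in the highlights can be sketched as follows. This is a minimal standard-library illustration, not the authors' code: the function names `stratified_split` and `f1_score`, the seed, and the toy labels (20 DTPs, 80 NDTPs) are all assumptions, and the BSVM training itself is not shown.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each class at train_frac."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # random partition within the class
        cut = round(train_frac * len(members))
        train_idx.extend(members[:cut])
        test_idx.extend(members[cut:])
    return sorted(train_idx), sorted(test_idx)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels: 1 = DTP, 0 = NDTP (imbalanced on purpose).
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels, train_frac=0.7, seed=42)
```

Because the split is done per class, the training and test sets keep the same DTP-to-NDTP ratio as the full dataset. The same per-class idea extends to building the five folds used for parameter selection, where the candidate with the highest average `f1_score` across folds would be kept.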
Summary
In the domain of drug development, the identification of drug target proteins (IDTP) is both significant and challenging, and it has attracted much interest from pharmaceutical and biomedical researchers. Proteins are crucial drug targets and have been widely studied, and human proteins in particular have been studied for the identification of drug targets. Traditional procedures for drug target identification are limited by labour-intensive and time-consuming biomedical experiments [1,2], which tend to be performed within specific domains of research, leading to low efficiency and a limited search scope. The low ratio of drug target proteins (DTPs) among human proteins aggravates these limitations, and failed results are commonly due to poorly planned experiments that lack fine-grained analysis.