Abstract
Information fusion has become a powerful tool for challenging applications such as biological prediction problems. In this paper, we apply a new information-theoretical fusion technique to HIV-1 protease cleavage site prediction, a problem that has recently attracted considerable interest in the machine learning community. It poses a difficult classification task because of its high-dimensional feature space and the relatively small set of available training patterns. We also apply a new set of biophysical features to this problem and present experiments with neural networks, support vector machines, and decision trees. Our feature set yields high recognition rates and concise decision trees, producing manageable rule sets that can guide future experiments. In particular, we found a combination of neural networks and support vector machines to be beneficial for this problem.
Highlights
The fight against AIDS is one of the most prominent endeavors in current health programs
The proteins comprising the human immunodeficiency virus are produced in the form of long polyproteins, which must be cleaved in order to yield the active protein components of the mature virus
There is a notorious shortage of training and test data in many pattern recognition applications, including HIV-1 protease cleavage site detection
Summary
The fight against AIDS is one of the most prominent endeavors in current health programs. Researchers aim to block the chemical action of the protease by binding molecules, so-called HIV-1 protease inhibitor drugs, to its active site. An exhaustive search of this space is currently prohibitive, a fact that is not likely to change in the foreseeable future. This situation calls for a computer-aided approach. The problem allows us to apply the wide range of techniques developed in the pattern recognition field, such as neural networks. Classifier design and feature selection for HIV-1 protease cleavage site detection are hard problems still waiting for an efficient solution. We present a new set of features for HIV-1 protease cleavage site prediction and experiment with classifier combination.
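To make the idea of classifier combination concrete, the following is a minimal sketch that fuses the outputs of two classifiers by weighted averaging. It is not the information-theoretical fusion technique described in the paper; simple averaging is used here only as a stand-in. The function names `fuse_average` and `decide`, the weight, the threshold, and the example probabilities are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_average(p_a: Sequence[float], p_b: Sequence[float],
                 weight_a: float = 0.5) -> List[float]:
    """Weighted average of two classifiers' estimated cleavage probabilities.

    Assumption: both classifiers (e.g. a neural network and an SVM) emit
    one probability per candidate site, in the same order.
    """
    if len(p_a) != len(p_b):
        raise ValueError("both classifiers must score the same sites")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(p_a, p_b)]


def decide(probs: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Label a site as cleaved when its fused probability reaches the threshold."""
    return [p >= threshold for p in probs]


# Hypothetical scores for three candidate sites (illustrative values only).
nn_probs = [0.9, 0.4, 0.2]    # assumed neural network outputs
svm_probs = [0.7, 0.7, 0.1]   # assumed SVM outputs

fused = fuse_average(nn_probs, svm_probs)
labels = decide(fused)
print(labels)
```

In this toy example the second site is rejected by the first classifier alone but accepted after fusion, which illustrates how a combination can change individual decisions. Whether such a change is an improvement has to be established on held-out data.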