Identifying inorganic material affinity classes for peptide sequences based on context learning

Guangxu Xun, Tiffany R Walsh, Xiaoyi Li, Marc R Knecht, Paras N Prasad, Aidong Zhang, Mark T Swihart

doi:10.1109/bibm.2015.7359742

Abstract

Interest in identifying inorganic material affinity classes for peptide sequences is growing, driven by the development of bionanotechnology and its wide range of applications. In particular, a selective model capable of learning cross-material affinity patterns can help us design peptide sequences with a desired binding selectivity for one inorganic material over another. However, because this is a newly emerging topic, it poses several distinct challenges that limit the performance of many existing peptide sequence classification algorithms. In this paper, we propose a novel framework to identify affinity classes for peptide sequences across inorganic materials. After enlarging our dataset by simulating peptide sequences, we use a context-learning-based method to obtain a vector representation of each amino acid and each peptide sequence. By analyzing the structure and affinity class of each peptide sequence, we capture the semantics of amino acids and peptide sequences in a vector space. In the last step, we train our classifier on these vector features together with heuristic rules. The construction of our models gives us the potential to overcome the challenges of this task, and the empirical results show the effectiveness of our models.
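To make the idea of "context" concrete, the sketch below treats each amino acid in a peptide as a word and lists the (center, context) pairs that a skip-gram style embedding method would typically train on. This is only an illustration under stated assumptions, not the authors' implementation: the function name `context_pairs`, the window size, and the toy sequence "AYSS" are all hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def context_pairs(sequence: str, window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) amino-acid pairs within a symmetric window.

    Hypothetical helper for illustration; the window size is an assumption.
    """
    pairs: List[Tuple[str, str]] = []
    for i, center in enumerate(sequence):
        # Clamp the window to the bounds of the peptide.
        start = max(0, i - window)
        stop = min(len(sequence), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a residue is not its own context
                pairs.append((center, sequence[j]))
    return pairs


# Toy peptide in one-letter amino-acid codes (arbitrary example string).
toy_pairs = context_pairs("AYSS", window=1)
print(toy_pairs)
```

Pairs like these could then be passed to any embedding learner so that amino acids appearing in similar contexts end up with nearby vectors; how the paper combines those vectors with affinity classes and heuristic rules is not specified in the abstract.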

Full Text