Abstract

Apolipoproteins are a group of plasma proteins associated with a variety of diseases, such as hyperlipidemia, atherosclerosis, Alzheimer’s disease, and diabetes. To investigate the function of apolipoproteins and to develop effective targets for related diseases, it is necessary to identify and classify apolipoproteins accurately. Although biochemical experiments can identify apolipoproteins accurately, they are expensive and time-consuming. This work aims to establish an efficient and accurate prediction model for recognizing apolipoproteins and their subfamilies. We first constructed a high-quality benchmark dataset of 270 apolipoproteins and 535 non-apolipoproteins. Each sequence in the dataset was encoded using pseudo-amino acid composition (PseAAC) and the composition of k-spaced amino acid pairs (CKSAAP) as input vectors. To improve prediction accuracy and eliminate redundant information, analysis of variance (ANOVA) was used to rank the features, and incremental feature selection was used to obtain the best feature subset. A support vector machine (SVM) was used to construct the classification model, which achieved an accuracy of 97.27%, a sensitivity of 96.30%, and a specificity of 97.76% for discriminating apolipoproteins from non-apolipoproteins in 10-fold cross-validation. In addition, the same process was repeated to build a second model for predicting apolipoprotein subfamilies, which achieved an overall accuracy of 95.93% in 10-fold cross-validation. Based on the proposed models, a convenient web server called ApoPred was established, which can be freely accessed at http://tang-biolab.com/server/ApoPred/service.html. We expect that this work will contribute to research on apolipoprotein function and to drug development for related diseases.
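The feature-encoding and ranking steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cksaap`, `anova_f`, `rank_features`), the default `max_gap=2`, and the choice to normalize by the number of valid residue pairs are assumptions made for illustration, since the abstract does not state these details. PseAAC encoding, incremental feature selection, and SVM training are left out.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order shared by every sequence.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(sequence, max_gap=2):
    """CKSAAP vector: for each gap k = 0..max_gap, the normalized frequency
    of every residue pair separated by k positions (400 values per gap).
    max_gap=2 is an illustrative default, not a value from the paper."""
    seq = sequence.upper()
    features = []
    for k in range(max_gap + 1):
        counts = dict.fromkeys(PAIRS, 0)
        valid = 0
        for i in range(len(seq) - k - 1):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard residues
                counts[pair] += 1
                valid += 1
        features.extend(counts[p] / valid if valid else 0.0 for p in PAIRS)
    return features


def anova_f(values, labels):
    """One-way ANOVA F statistic of one feature across class labels.
    Assumes at least two classes and more samples than classes."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, g = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {y: sum(vs) / len(vs) for y, vs in groups.items()}
    between = sum(len(vs) * (means[y] - grand_mean) ** 2 for y, vs in groups.items())
    within = sum((v - means[y]) ** 2 for y, vs in groups.items() for v in vs)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (g - 1)) / (within / (n - g))


def rank_features(matrix, labels):
    """Feature indices sorted from highest to lowest F score."""
    n_features = len(matrix[0])
    scores = [anova_f([row[j] for row in matrix], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)
```

In an incremental feature selection loop, one would typically add features in the order returned by `rank_features`, retrain the classifier at each step, and keep the subset with the best cross-validated accuracy.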

Highlights

  • Apolipoprotein (Apo), a protein component of plasma lipoprotein, can bind and transport blood lipids to various tissues of the body for metabolism and utilization

  • Many studies have found that apolipoprotein gene mutations, the resulting allelic polymorphisms, and the different apolipoprotein phenotypes they generate can affect the metabolism and utilization of blood lipids, thereby triggering the occurrence and development of hyperlipidemia, atherosclerosis, and cardiovascular and cerebrovascular diseases (Richardson et al, 2020)

  • The occurrence of hyperlipidemia and atherosclerosis is often accompanied by abnormal expression of high-density lipoprotein (HDL) and ApoA-I


Summary

Introduction

Apolipoprotein (Apo), a protein component of plasma lipoprotein, can bind and transport blood lipids to various tissues of the body for metabolism and utilization. It is mainly synthesized in the liver and partly in the small intestine (Yiu et al, 2020). Many studies have found that apolipoprotein gene mutations, the resulting allelic polymorphisms, and the different apolipoprotein phenotypes they generate can affect the metabolism and utilization of blood lipids, thereby triggering the occurrence and development of hyperlipidemia, atherosclerosis, and cardiovascular and cerebrovascular diseases (Richardson et al, 2020). Studies have suggested that ApoM plays a role in the antiatherogenic function of HDL through multiple pathways, such as lipid metabolism, immune regulation, and anti-inflammatory effects (Arkensteijn et al, 2013). Correctly identifying apolipoproteins and their subfamilies could provide important clues for understanding their functions and roles in various diseases.
