Abstract

Background: Alzheimer’s disease (AD) imposes a heavy burden on society and every family. Diagnosing AD early and discovering new drug targets are therefore crucial, and both could be achieved by identifying AD-related proteins. Because biological experiments are time-consuming and costly, researchers have turned to developing more advanced algorithms to identify AD-related proteins.

Results: We first proposed the hypothesis that “similar diseases share similar related proteins”. Accordingly, five similarity calculation methods are introduced to find other diseases that are similar to AD. The proteins related to these diseases are then obtained from public data sets. Finally, these proteins serve as the features of each disease and are used to map each disease to its similarity to AD. We developed a novel method, ‘LRRGD’, which combines Logistic Regression (LR) and Gradient Descent (GD) and borrows the idea of Random Forest (RF). LR is used to regress features to similarities. Borrowing the idea of RF, hundreds of LR models are built, each on 40 randomly selected features (proteins). GD is used to find the optimal solution. To avoid getting trapped in a local optimum, a good initial value is chosen using some known AD-related proteins. In total, 376 proteins are found to be related to AD.

Conclusion: Of the 376 proteins, 308 are novel. Three case studies are presented to demonstrate the effectiveness of our method. These 308 proteins could give researchers a basis for biological experiments to aid the treatment and diagnosis of AD.
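The procedure described in the abstract (many LR models, each fitted by GD on a random subset of protein features, with initial weights informed by known AD-related proteins) can be illustrated with a minimal sketch. This is not the authors' implementation. The cross-entropy loss, the learning rate, the epoch count, the choice to start known proteins at a weight of 1.0, and the averaging of weights across models are all assumptions made for illustration, and the names `fit_lr_gd` and `lrrgd_sketch` are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_lr_gd(X, y, init_w, lr=0.1, epochs=200):
    """Fit logistic regression by batch gradient descent on cross-entropy loss.

    X: list of feature rows, y: targets in [0, 1], init_w: initial weights.
    Returns (weights, bias).
    """
    w = list(init_w)
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for row, target in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - target
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def lrrgd_sketch(X, y, known_idx, n_models=100, subset_size=40, seed=0):
    """Average LR weights per protein over many random feature subsets.

    X: disease-by-protein matrix (1 if the protein is related to the disease).
    y: similarity of each disease to AD, scaled to [0, 1].
    known_idx: set of column indices of proteins already known to be AD-related.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    k = min(subset_size, n_features)
    totals = [0.0] * n_features
    counts = [0] * n_features
    for _ in range(n_models):
        subset = rng.sample(range(n_features), k)
        X_sub = [[row[j] for j in subset] for row in X]
        # Assumed initialisation: known AD-related proteins start at weight 1.0.
        init_w = [1.0 if j in known_idx else 0.0 for j in subset]
        w, _ = fit_lr_gd(X_sub, y, init_w)
        for j, wj in zip(subset, w):
            totals[j] += wj
            counts[j] += 1
    # Proteins never sampled get a score of 0.0.
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]
```

Under these assumptions, proteins with a consistently large positive average weight would be the candidates for AD relatedness. How the source method actually turns the fitted models into its final list of 376 proteins is not stated in the abstract.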

Highlights

  • Alzheimer’s disease (AD) imposes a heavy burden on society and every family

  • This is because the main pathological feature of AD patients is that a large number of beta amyloid (A beta) deposits are formed outside the neurons in the cortex and hippocampus and

  • The UniProt Non-redundant Reference (UniRef) database combines closely related protein sequences into a single record to improve search speed; currently, three sub-libraries are formed according to sequence similarity, namely UniRef100, UniRef90 and UniRef50. The UniProt Archive (UniParc) is a repository that records the history of all protein sequences


Summary

Introduction

Alzheimer’s disease (AD) imposes a heavy burden on society and every family. Diagnosing AD in advance and discovering new drug targets are crucial, and both could be achieved by identifying AD-related proteins. Many scholars have reported that the abnormal behavior of specific proteins is key to causing AD [4, 5]. This is because the main pathological feature of AD patients is that a large number of beta amyloid (A beta) deposits are formed outside the neurons in the cortex and hippocampus and ... Recently, finding alternatives for diagnosing AD has become a hot issue [8]. The Human Discovery Multi-Analyte Profile (MAP) has become a popular tool to identify plasma analytes ... These exciting results raise a major issue: it is hard to reproduce these protein panels [8]. Olsson et al. [10] confirmed this view, and they found that NFL was increased in the CSF of both AD and MCI patients ... Studies have found this phenomenon in serum and plasma samples as well [11]. O’Bryant et al. [12] used a serum-based algorithm to distinguish AD from Parkinson’s disease and cross-validated this ...

