Abstract

During the past two decades, the field of human genetics has experienced an information explosion. The completion of the Human Genome Project and the development of high-throughput SNP technologies have created a wealth of data; however, the analysis and interpretation of these data have created a research bottleneck. While technology facilitates the measurement of hundreds or thousands of genes, statistical and computational methodologies are lacking for the analysis of these data. New statistical methods and variable selection strategies must be explored for identifying disease susceptibility genes for common, complex diseases. Neural networks (NN) are a class of pattern recognition methods that have been successfully implemented for data mining and prediction in a variety of fields. The application of NN for statistical genetics studies is an active area of research. Neural networks have been applied in both linkage and association analysis for the identification of disease susceptibility genes. In the current review, we consider how NN have been used for both linkage and association analyses in genetic epidemiology. We discuss both the successes of these initial NN applications and the questions that arose during the previous studies. Finally, we introduce evolutionary computing strategies, Genetic Programming Neural Networks (GPNN) and Grammatical Evolution Neural Networks (GENN), for using NN in association studies of complex human diseases that address some of the caveats illuminated by previous work.

Highlights

  • The identification of disease susceptibility genes for complex, multifactorial disease is arguably the most difficult challenge facing human geneticists today [1]

  • We have reviewed traditional back-propagation neural networks (NN) and their previous applications in genetic epidemiology for linkage and association studies

  • We have limited our discussion to back-propagation NN because they are the type of NN most commonly used in genetic epidemiology

Summary

Introduction

The identification of disease susceptibility genes for complex, multifactorial disease is arguably the most difficult challenge facing human geneticists today [1].

In one of the reviewed applications, the authors used a fully connected feedforward NN architecture with one input layer, one hidden layer, and one output layer representing affection status. They simulated multiple data types, including SNP variables along with quantitative and qualitative environmental traits.

Many architecture selection approaches use a prediction error fitness measure, such that they select an architecture based on its generalization to new observations [46], while others use a classification error, or training error [43]. These methods attempt to get the most learning out of the network while avoiding over-fitting the data [43,45]. One potential solution to the architecture selection problem in NN is to evolve the NN architecture for each data set analyzed using an evolutionary computation approach. This allows the user to avoid common pitfalls associated with having the wrong network architecture. As the field has approached genome-wide association scans, it has become crucial that methods detect associations in the presence of thousands of genetic variables.
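The ideas summarized above (a feedforward network with one hidden layer, a classification-error fitness, and an architecture evolved per data set) can be illustrated with a small sketch. The Python below is a minimal illustration under stated assumptions and is not the GPNN or GENN algorithm: the function names (`new_network`, `predict`, `mutate`, `evolve`, `toy_data`), the mutation rates, the keep-the-better-half selection scheme, and the toy data are all hypothetical choices made for this example. Genotypes are assumed to be coded 0/1/2, and the single output is read as the probability of being affected.

```python
import math
import random

def sigmoid(x):
    # Numerically safe logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def new_network(n_inputs, n_hidden, rng):
    # One hidden layer; the last weight in every list is the bias.
    hidden = [[rng.gauss(0, 1) for _ in range(n_inputs + 1)] for _ in range(n_hidden)]
    output = [rng.gauss(0, 1) for _ in range(n_hidden + 1)]
    return {"hidden": hidden, "output": output}

def predict(net, genotypes):
    # Forward pass: inputs -> hidden layer -> one output (affection status).
    h = [sigmoid(w[-1] + sum(wi * g for wi, g in zip(w, genotypes)))
         for w in net["hidden"]]
    out = net["output"]
    return sigmoid(out[-1] + sum(wi * hi for wi, hi in zip(out, h)))

def classification_error(net, data):
    # Fraction of individuals misclassified at a 0.5 threshold.
    wrong = sum((predict(net, x) >= 0.5) != bool(y) for x, y in data)
    return wrong / len(data)

def mutate(net, n_inputs, rng):
    # Copy the parent, then change the architecture or perturb the weights.
    child = {"hidden": [w[:] for w in net["hidden"]], "output": net["output"][:]}
    r = rng.random()
    if r < 0.1:
        # Add a hidden unit; its output weight goes just before the bias.
        child["hidden"].append([rng.gauss(0, 1) for _ in range(n_inputs + 1)])
        child["output"].insert(-1, rng.gauss(0, 1))
    elif r < 0.2 and len(child["hidden"]) > 1:
        # Remove a hidden unit together with its output weight.
        i = rng.randrange(len(child["hidden"]))
        del child["hidden"][i]
        del child["output"][i]
    else:
        # Gaussian perturbation of every weight.
        child["hidden"] = [[v + rng.gauss(0, 0.3) for v in w] for w in child["hidden"]]
        child["output"] = [v + rng.gauss(0, 0.3) for v in child["output"]]
    return child

def evolve(train, test, n_inputs, generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [new_network(n_inputs, rng.randint(1, 4), rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness is training (classification) error: keep the better half
        # and refill the population with mutated copies of the survivors.
        pop.sort(key=lambda n: classification_error(n, train))
        parents = pop[: pop_size // 2]
        children = [mutate(rng.choice(parents), n_inputs, rng)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    best = min(pop, key=lambda n: classification_error(n, train))
    # Prediction error is measured on observations not used during evolution.
    return best, classification_error(best, train), classification_error(best, test)

def toy_data(n, rng, n_snps=4):
    # Hypothetical toy data: status depends only on the first SNP.
    data = []
    for _ in range(n):
        x = [rng.randint(0, 2) for _ in range(n_snps)]
        data.append((x, 1 if x[0] >= 1 else 0))
    return data

if __name__ == "__main__":
    rng = random.Random(42)
    train, test = toy_data(100, rng), toy_data(100, rng)
    best, train_err, test_err = evolve(train, test, n_inputs=4)
    print(len(best["hidden"]), "hidden units;",
          "training error", train_err, "prediction error", test_err)
```

In this sketch the training error drives selection and the held-out error is only reported at the end, which mirrors the distinction drawn above between classification (training) error and prediction error. Using the held-out error as the fitness instead would correspond to the prediction-error variant, at the cost of needing a further independent sample to estimate generalization honestly.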

