Abstract

Crop diseases have for many years been the most important biological hazards challenging sustainable agricultural production. Every year, 42% of the global agricultural yield is destroyed by disease. Bioinformatics techniques provide efficient methods for analyzing and interpreting raw biological data, which helps in studying the effect of a pathogen on a crop. Microarray gene expression data represent the expression levels of the genes of a cell (organism) maintained in a particular environment. Hence, gene expression data can be used to predict significant genes and to study pathogen–host interactions. Different machine learning techniques can be applied to extract the useful information represented by the candidate genes. The approach proposed in this chapter consists of preprocessing of the gene expression data, gene selection or feature extraction using a parallel approach, and classification. Five feature selection methods are analyzed for the extraction of candidate genes with biological significance for rice-related diseases: support vector machine with recursive feature elimination (SVM-RFE), minimum redundancy maximum relevance (mRMR), principal component analysis (PCA), successive feature selection (SFS) and independent component analysis (ICA). To deal with the computational complexity and the large volume of data, a combination of general-purpose graphics processing unit (GPGPU) computing and MapReduce programming on an Apache Hadoop framework is proposed. The experimental results show improved time efficiency in feature extraction and classification.
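
To make the idea of parallel gene selection concrete, the following is a minimal sketch of a MapReduce-style relevance filter in plain Python. It is not the chapter's implementation: it runs in a single process rather than on Hadoop or a GPU, it uses a simple signal-to-noise score as a stand-in for the methods named above, and the names `map_partition`, `reduce_partials` and `rank_genes`, as well as the toy data, are hypothetical. Each partition of samples is summarized independently (map), the partial statistics are merged (reduce), and genes are ranked by how well they separate two classes (for example, healthy and infected).

```python
from collections import defaultdict
from functools import reduce
from math import sqrt


def map_partition(samples):
    """Map step: for each (gene, label) pair, accumulate count, sum and sum of squares."""
    partial = defaultdict(lambda: [0, 0.0, 0.0])
    for label, expression in samples:
        for gene, value in enumerate(expression):
            acc = partial[(gene, label)]
            acc[0] += 1
            acc[1] += value
            acc[2] += value * value
    return dict(partial)


def reduce_partials(left, right):
    """Reduce step: merge the partial statistics of two partitions."""
    merged = {key: list(stats) for key, stats in left.items()}
    for key, (count, total, total_sq) in right.items():
        acc = merged.setdefault(key, [0, 0.0, 0.0])
        acc[0] += count
        acc[1] += total
        acc[2] += total_sq
    return merged


def rank_genes(partitions, top_k):
    """Rank genes by |mean_0 - mean_1| / (std_0 + std_1) and return the top_k indices."""
    totals = reduce(reduce_partials, map(map_partition, partitions))
    n_genes = 1 + max(gene for gene, _ in totals)
    scores = []
    for gene in range(n_genes):
        mean, std = {}, {}
        for label in (0, 1):
            count, total, total_sq = totals[(gene, label)]
            mean[label] = total / count
            # Clamp at zero to guard against tiny negative rounding errors.
            std[label] = sqrt(max(total_sq / count - mean[label] ** 2, 0.0))
        # Small epsilon avoids division by zero for constant genes.
        score = abs(mean[0] - mean[1]) / (std[0] + std[1] + 1e-12)
        scores.append((score, gene))
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [gene for _, gene in scores[:top_k]]


# Toy data (assumed): each sample is (class label, expression values for 3 genes).
partitions = [
    [(0, [1.0, 5.0, 2.0]), (1, [9.0, 5.1, 2.1])],
    [(0, [1.2, 4.9, 8.0]), (1, [9.3, 5.0, 8.2])],
]
top_genes = rank_genes(partitions, top_k=2)
print(top_genes)
```

Because the map step only emits additive statistics, the partitions can be processed in any order or on separate workers, which is the property a Hadoop job would rely on. Wrapper methods such as SVM-RFE need repeated model fitting and do not decompose this simply.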
