Abstract

Two variable selection methods, stepwise variable selection and genetic algorithms (GAs), were evaluated by comparing how well the resulting models differentiate among environmental soil samples. The aim of this work is to determine which input variables are most relevant for predicting soil source with discriminant partial least squares (D-PLS) and back-propagation artificial neural network (BP-ANN) models. Microbial community DNA was extracted from 48 environmental soil samples derived from different field crops and soil sources. After amplification of bacterial ribosomal RNA genes by polymerase chain reaction (PCR), the products were separated by gel electrophoresis. Characteristic complex band patterns were obtained, indicating high bacterial diversity. After removal of the included DNA standard markers, 223 DNA band patterns produced in the gels of the soil samples were used in the analysis. Densitometric curves of the selected DNA band patterns were extracted from the gel images based on the brightness of the bands. The curves were smoothed using the Savitzky–Golay method and scaled to the DNA standard markers. The prediction results of the D-PLS and BP-ANN models under the two variable selection methods are presented and compared. Both models gave good results before any variable selection was applied, with BP-ANN performing better than D-PLS. Applying stepwise or GA variable selection improved the prediction performance of both models, especially D-PLS. The study also shows that GA variable selection yielded a significantly greater improvement in predictive ability than stepwise variable selection.
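To make the idea of GA variable selection concrete, the following is a minimal sketch under stated assumptions, not the procedure used in the study. Each candidate solution is a boolean mask over the input variables, and its fitness is the leave-one-out accuracy of a classifier restricted to the masked variables. A simple nearest-centroid classifier stands in for the D-PLS and BP-ANN models, and the function names, GA settings (population size, mutation rate, one-point crossover, elitism) and the synthetic toy data are all illustrative assumptions.

```python
import random


def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on masked variables.

    Stand-in fitness function (assumption); the study used D-PLS and BP-ANN models.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty variable subset cannot classify anything
    hits = 0
    for i in range(len(X)):
        # Build class centroids from every sample except the held-out one.
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue
            acc = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                acc[c] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1

        def dist(label):
            # Squared Euclidean distance from sample i to the class centroid.
            return sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        hits += min(sums, key=dist) == y[i]
    return hits / len(X)


def ga_select(X, y, n_gen=15, pop_size=20, p_mut=0.05, seed=0):
    """Search for a variable mask with a basic GA (hypothetical settings)."""
    rng = random.Random(seed)
    n_vars = len(X[0])
    pop = [[rng.random() < 0.5 for _ in range(n_vars)] for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(n_gen):
        scored = sorted(pop, key=lambda m: loo_accuracy(X, y, m), reverse=True)
        history.append(loo_accuracy(X, y, scored[0]))
        parents = scored[:pop_size // 2]   # truncation selection
        children = [scored[0][:]]          # elitism: keep the best mask unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability p_mut.
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    best = max(pop, key=lambda m: loo_accuracy(X, y, m))
    return best, loo_accuracy(X, y, best), history


# Synthetic toy data (not from the study): variable 0 separates the two
# classes, the remaining five variables are pure noise.
data_rng = random.Random(1)
y = ["A"] * 6 + ["B"] * 6
X = [[(0.0 if label == "A" else 5.0) + data_rng.gauss(0, 1)]
     + [data_rng.gauss(0, 1) for _ in range(5)] for label in y]

best_mask, best_fit, history = ga_select(X, y)
print("selected variables:", [j for j, keep in enumerate(best_mask) if keep])
print("leave-one-out accuracy:", best_fit)
```

Because the best mask of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. In a real application the fitness function would be replaced by the cross-validated performance of the model of interest.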
