The effects of the number of seeds in a training sample set on the ability to predict the viability of cabbage or radish seeds are presented and discussed. The supervised classification method extended canonical variates analysis (ECVA) was used to develop a classification model. Calibration sub-sets of different sizes were chosen randomly with several iterations and using the spectral-based sample selection algorithms DUPLEX and CADEX. An independent test set was used to validate the developed classification models. The results showed that 200 seeds were optimal in a calibration set for both the cabbage and radish data. The misclassification rates at the optimal sample size were 8%, 6% and 7% for cabbage and 3%, 3% and 2% for radish for the random method (averaged over 10 iterations), DUPLEX and CADEX algorithms, respectively. This was similar to the misclassification rates of 6% and 2% for cabbage and radish obtained using all 600 seeds in the calibration set. Thus, the number of seeds in the calibration set can be reduced by up to 67% without significant loss of classification accuracy, which will improve the cost-effectiveness of NIR spectral analysis. Wavelength regions important for the discrimination between viable and non-viable seeds were identified using interval ECVA (iECVA) models, ECVA weight plots and the mean difference spectrum for viable and non-viable seeds.
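
To illustrate the spectral-based sample selection step, the following is a minimal sketch of the CADEX (Kennard-Stone) algorithm in Python with NumPy. It is not the authors' implementation; the function name and the use of Euclidean distance on raw spectra are assumptions for illustration. The algorithm seeds the calibration set with the two most distant samples and then repeatedly adds the sample whose minimum distance to the already-selected set is largest, so the selected subset spans the spectral space evenly.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Select n_select row indices from X (rows = samples, e.g. seed
    spectra) by the Kennard-Stone (CADEX) algorithm."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Seed the selection with the two mutually most distant samples.
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_select and remaining:
        # For each remaining sample, find its distance to the nearest
        # already-selected sample; pick the sample for which that
        # distance is largest (a maximin criterion).
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return selected
```

For example, `kennard_stone(spectra, 200)` would pick a 200-seed calibration subset from a 600-seed pool; the remaining indices could then serve as an independent test set. DUPLEX works similarly but alternates assignments between the calibration and test sets.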