Abstract

The paper presents PLS2-CM, a new methodology within the framework of so-called compliant class-models, designed to improve class-modelling performance in settings with more than two classes. The improvement is achieved by using multi-response PLS models in which the classes are encoded via Error-Correcting Output Codes (ECOC) instead of the traditional class indicator variables used in chemometrics. The proposed PLS2-CM decomposes a class-modelling problem into a series of binary learners, based on a family of code matrices of different code lengths, which are evaluated to obtain the simultaneous compliant class-models with the best performance. The methodology develops both a new encoding system, based on multi-criteria optimization to search for optimal coding matrices, and a new decoding system, based on probability thresholds to assign objects to class-models. Because the characteristics of the dataset at hand affect the final selection of the coding matrix, and therefore the class-models that are built, the whole procedure is a data-driven strategy. Applying PLS2-CM to a variety of cases (controlled data, experimental data and repository datasets) results in enhanced class-modelling performance, as measured by the DMCEN (Diagonal Modified Confusion Entropy) index and by sensitivity-specificity matrices. The predictive ability of the compliant class-models has also been evaluated.
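To make the contrast between class indicator variables and ECOC concrete, the following is a minimal generic sketch, not the PLS2-CM procedure itself. It assumes a textbook exhaustive code matrix and nearest-codeword (Hamming) decoding, whereas the paper selects its coding matrices by multi-criteria optimization and decodes with probability thresholds. All function names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Illustrative only: generic ECOC encoding and decoding, NOT the paper's
# encoding (multi-criteria optimization) or decoding (probability thresholds).


def indicator_matrix(n_classes):
    """Traditional class indicator coding: one binary column per class."""
    return [[1 if j == i else 0 for j in range(n_classes)]
            for i in range(n_classes)]


def exhaustive_code_matrix(n_classes):
    """Assumed textbook exhaustive code: every binary split of the classes
    with class 0 fixed at 1, excluding the constant all-ones column.
    Rows are class codewords; columns define the binary learners."""
    columns = [(1,) + bits
               for bits in product((0, 1), repeat=n_classes - 1)
               if not all(bits)]
    return [[col[i] for col in columns] for i in range(n_classes)]


def hamming(a, b):
    """Number of positions at which two codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def min_row_distance(matrix):
    """Smallest Hamming distance between any two class codewords."""
    n = len(matrix)
    return min(hamming(matrix[i], matrix[j])
               for i in range(n) for j in range(i + 1, n))


def decode(bits, matrix):
    """Assign the class whose codeword is closest to the predicted bits."""
    distances = [hamming(bits, row) for row in matrix]
    return distances.index(min(distances))


if __name__ == "__main__":
    ecoc = exhaustive_code_matrix(4)
    print("code length:", len(ecoc[0]))
    print("min distance (ECOC):", min_row_distance(ecoc))
    print("min distance (indicator):", min_row_distance(indicator_matrix(4)))

    # Simulate one binary learner giving the wrong answer for a class-2 object.
    corrupted = list(ecoc[2])
    corrupted[0] ^= 1
    print("decoded class:", decode(corrupted, ecoc))
```

In this sketch the longer codewords are separated by a larger minimum Hamming distance than the indicator codewords, so a single wrong binary response can still be decoded to the correct class. That redundancy is the general motivation for ECOC; how much it helps on a given dataset is what the data-driven selection of the code matrix described above is meant to settle.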
