Abstract
Variable importance indices relying on the outputs of parametric techniques (e.g. Partial Least Squares - PLS) have been hailed as an efficient approach to variable selection in wrapper-based frameworks. The use of parametric techniques for that purpose, however, may lead to unreliable rankings when the assessed variables do not follow a parametric probability density function, jeopardizing the precision of variable importance assessment. This paper presents a new framework for variable selection that relies on non-parametric statistical tests, with the aim of classifying industrial batches or samples into multiple classes related to quality or authenticity. The framework comprises two phases. In the first phase (i.e. filter), the Mutual Information (MI) technique performs a preliminary removal of less significant variables. In the second phase (i.e. wrapper), three non-parametric tests (Anderson-Darling, Kruskal-Wallis and Steel’s Test) are used to rank the remaining variables according to their relevance for classification. The robustness of the proposed framework is evaluated by varying the MI cutoff and the type of classifier on data collected from seven industrial processes. On average, the recommended combination of MI cutoff, non-parametric test and classifier for each dataset increased classification accuracy by 17.04% while requiring 78.65% fewer variables when compared to the well-known stepwise variable selection method. The proposed framework also outperformed other variable selection approaches from the literature.
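The two phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how MI is estimated, how the cutoff is applied, or how the classifier is coupled to the ranking. The sketch therefore assumes an equal-width histogram estimate of MI, an absolute cutoff in nats, and the Kruskal-Wallis H statistic as the ranking criterion; the classifier evaluation loop of the wrapper phase is omitted. The function names (`mutual_information`, `kruskal_wallis_h`, `select_variables`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(values, labels, bins=3):
    """MI in nats between one variable (equal-width bins, an assumption) and class labels."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # constant column: every sample lands in bin 0
    binned = [min(int((v - lo) / span * bins), bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(binned, labels))
    n_bin, n_cls = Counter(binned), Counter(labels)
    # sum over observed cells of p(b, k) * log(p(b, k) / (p(b) * p(k)))
    return sum(
        (c / n) * math.log(c * n / (n_bin[b] * n_cls[k]))
        for (b, k), c in joint.items()
    )


def kruskal_wallis_h(values, labels):
    """Kruskal-Wallis H statistic with average ranks and the standard tie correction."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sums, sizes = Counter(), Counter()
    for r, lab in zip(ranks, labels):
        rank_sums[lab] += r
        sizes[lab] += 1
    h = 12 / (n * (n + 1)) * sum(rank_sums[g] ** 2 / sizes[g] for g in sizes) - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


def select_variables(X, y, mi_cutoff=0.1, bins=3):
    """Filter by MI cutoff, then rank the surviving columns by H (largest first)."""
    columns = list(zip(*X))
    kept = [j for j, col in enumerate(columns)
            if mutual_information(col, y, bins) >= mi_cutoff]
    return sorted(kept, key=lambda j: kruskal_wallis_h(columns[j], y), reverse=True)


# Toy data (hypothetical): 12 samples, 3 classes, 3 variables.
y = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
col0 = [1.0, 1.1, 1.2, 1.3, 5.0, 5.1, 5.2, 5.3, 9.0, 9.1, 9.2, 9.3]   # separates classes
col1 = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]   # uninformative
col2 = [1.0, 2.0, 3.0, 8.0, 4.5, 5.0, 6.0, 1.5, 7.5, 9.0, 10.0, 5.5]  # partly informative
X = [list(row) for row in zip(col0, col1, col2)]

ranking = select_variables(X, y)
print(ranking)
```

In this toy example the uninformative column is removed by the filter, and the two remaining columns are ordered by how strongly their rank distributions differ across classes. In a full wrapper, a classifier would then be trained on growing or shrinking subsets taken in this order, and the subset with the best accuracy retained.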