Abstract

It is widely accepted that feature selection is an essential step in predictive modeling. Approaches to feature selection range from filter techniques to metaheuristic wrapper methods. In this paper, we propose a compilation of tools to optimize the fitting of black-box linear models. The proposed AnTSbe algorithm combines Ant Colony Optimization with a Tabu Search memory list for feature selection and uses l1 and l2 regularization norms to fit the linear models. In addition, a polynomial combination of input features is introduced to further explore the information contained in the original data. As a case study, excitation-emission matrix fluorescence data were used as the primary measurements to predict total sulfur concentration in diesel fuel samples. The sample dataset was divided into S10 (less than 10 ppm of total sulfur) and S100 (mean sulfur content of 100 ppm) groups, and local linear models were fit with AnTSbe. For the Diesel S100 local models, using only 5 of the original 1467 fluorescence excitation/emission (Ex/Em) pairs, combined with basis expansion, we were able to predict total sulfur content with a MAPE of less than 4% and an RMSE of 4.68 ppm on the test subset. For the Diesel S10 local models, 4 Ex/Em pairs were sufficient to predict sulfur content with a MAPE of 0.24% and an RMSE of 0.015 ppm on the test subset. Our experimental results demonstrate that the proposed methodology optimized the fitting of linear models to predict sulfur content in diesel fuel samples without the need for chemical or physical pre-treatment, and was superior to classic PLS regression methods and to our previous results with ant colony optimization on the same dataset. The proposed AnTSbe can be applied directly to data from other sources without the need for adaptation.
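To make the polynomial combination of input features and the two reported error metrics concrete, the following is a minimal sketch rather than the AnTSbe implementation. The function names `poly_expand`, `mape`, and `rmse` are hypothetical, and the expansion degree of 2 with interaction terms is an assumption; the abstract does not state the degree used.

```python
from itertools import combinations_with_replacement
from math import sqrt


def poly_expand(row, degree=2):
    """Return all monomials of the inputs from degree 1 up to `degree`.

    Assumption: interaction terms are included and no bias column is added.
    """
    expanded = []
    for d in range(1, degree + 1):
        # Each index tuple (i, j, ...) defines one monomial x_i * x_j * ...
        for idx in combinations_with_replacement(range(len(row)), d):
            term = 1.0
            for i in idx:
                term *= row[i]
            expanded.append(term)
    return expanded


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target (here, ppm)."""
    total = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return sqrt(total / len(y_true))
```

Under these assumptions, 5 selected Ex/Em intensities expand to 20 candidate regressors (5 linear terms plus 15 quadratic terms), which would then be passed to a regularized linear fit.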
