Two-dimensional MoS2-based impedimetric electronic tongue for the discrimination of endocrine disrupting chemicals using machine learning

Wania A. Christinelli, Daniel S. Correa, Luiz H. C. Mattoso, Ricardo Cerri, Murilo H. M. Facure, Osvaldo N. Oliveira Jr., Flavio M. Shimizu

doi:10.1016/j.snb.2021.129696

Abstract

• An e-tongue was built for multi-target prediction of endocrine-disrupting chemicals.
• The XGBoost machine learning model showed high accuracy.
• Feature selection using an information visualization technique enhanced the prediction accuracy.

In this paper, we report the use of machine learning to analyze the capacitance spectra obtained with an electronic tongue (e-tongue) and to discriminate three endocrine-disrupting chemicals (EDCs), namely bisphenol A, estrone, and 17β-estradiol, as well as their mixtures. The e-tongue comprised seven sensing units made with interdigitated gold electrodes coated with layer-by-layer films of poly(o-methoxyaniline), poly(3-thiophene acetic acid), and molybdenum disulfide (MoS2). Multilayer Perceptron (MLP), Random Forest, and Extreme Gradient Boosting (XGBoost) models were applied for multi-target regression to predict the concentrations of the individual contaminants and of their mixtures. The models were evaluated according to their root mean square error (RMSE) values. The best performance was achieved with XGBoost, for which the RMSE ranged from 0.19 to 3.37 for individual contaminants, from 0.12 to 0.25 for the mixtures, and from 0.34 to 3.46 for the entire dataset. This high performance was only possible with a multi-target regression strategy that included a feature selection procedure. In the latter, the data were plotted with the parallel coordinates technique, and the silhouette coefficient, a quantitative measure of the ability to distinguish similar samples in a dataset, was calculated. The usefulness of the machine learning methods is underscored by the fact that the data from mixtures of EDCs could not be distinguished using multidimensional projections.
Also significant is that this combination of machine learning and information visualization methodology is entirely generic; it may be applied to analyze data from e-tongues and other sensing and biosensing devices in prediction tasks as demanding as the discrimination of mixtures of EDCs at concentrations below nmol L⁻¹.
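
The two quantities named in the abstract, the per-target RMSE used to score the multi-target regressors and the silhouette coefficient used to rank features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `rmse_per_target` and `silhouette_coefficient`, the use of Euclidean distance, the handling of singleton clusters, and the toy numbers are all assumptions made for illustration only.

```python
import math


def rmse_per_target(y_true, y_pred):
    """Return one RMSE per target column (e.g. one per contaminant).

    y_true and y_pred are lists of rows; each row holds one value per target.
    """
    n_samples = len(y_true)
    n_targets = len(y_true[0])
    result = []
    for t in range(n_targets):
        squared_error = sum(
            (y_true[i][t] - y_pred[i][t]) ** 2 for i in range(n_samples)
        )
        result.append(math.sqrt(squared_error / n_samples))
    return result


def silhouette_coefficient(points, labels):
    """Mean silhouette over all samples, using Euclidean distance (assumed).

    For each sample, a is the mean distance to the other members of its own
    class and b is the smallest mean distance to any other class; the sample
    score is (b - a) / max(a, b).
    """
    scores = []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Collect distances to every other sample, grouped by class label.
        by_label = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                by_label.setdefault(other, []).append(math.dist(p, q))
        own = by_label.pop(label, [])
        if not own or not by_label:
            # Assumed convention: singleton class or single-class data scores 0.
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_label.values())
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two targets, two samples.
toy_true = [[1.0, 2.0], [3.0, 4.0]]
toy_pred = [[1.0, 4.0], [3.0, 4.0]]
toy_rmse = rmse_per_target(toy_true, toy_pred)

# Hypothetical single-feature readings for four samples in two classes.
toy_labels = [0, 0, 1, 1]
feature_a = [[0.0], [1.0], [10.0], [11.0]]  # classes well separated
feature_b = [[0.0], [10.0], [1.0], [11.0]]  # classes interleaved
sil_a = silhouette_coefficient(feature_a, toy_labels)
sil_b = silhouette_coefficient(feature_b, toy_labels)
```

In a selection procedure of the kind the abstract describes, a feature such as `feature_a`, which yields the higher silhouette coefficient, would be preferred over `feature_b`; how the paper sets its actual selection threshold is not stated in the abstract.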

Full Text