Neural networks (NNs) are emerging as a rapid and scalable method for quantifying metabolites directly from nuclear magnetic resonance (NMR) spectra, but the nonlinear nature of NNs obscures how a model arrives at its predictions. This study implements an explainable artificial intelligence algorithm called integrated gradients (IG) to elucidate which regions of the input spectra are most important for the quantification of specific analytes. The approach is first validated on simulated mixture spectra of eight aqueous metabolites and then investigated in experimentally acquired lipid spectra of a reference standard mixture and a murine hepatic extract. The IG method revealed that, like a human spectroscopist, an NN recognizes and quantifies analytes based on each analyte's resonance line shapes, amplitudes, and frequencies. NNs can compensate for peak overlap and prioritize the specific resonances most important for concentration determination. Further, we show how modifying an NN training dataset can affect how a model makes decisions, and we provide examples of how this approach can be used to debug issues with model performance. Overall, the results show that the IG technique facilitates a visual and quantitative understanding of how model inputs relate to model outputs, potentially making NNs a more attractive option for targeted and automated NMR-based metabolomics.
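For readers unfamiliar with the attribution method, the sketch below illustrates the standard integrated-gradients computation for a 1-D spectrum regression model. It is a minimal illustration, not the authors' implementation: the `model`, `spectrum`, and `analyte_idx` names, the zero-spectrum baseline, and the PyTorch framing are all assumptions made for the example.

```python
# Minimal sketch of integrated gradients (IG) for a 1-D NMR spectrum regression model.
# Hypothetical names: `model` maps a batch of spectra to predicted metabolite
# concentrations; `analyte_idx` selects the output whose attribution is wanted.
import torch

def integrated_gradients(model, spectrum, analyte_idx, baseline=None, steps=50):
    """Approximate IG_i = (x_i - x'_i) * integral_0^1 dF/dx_i(x' + a(x - x')) da."""
    if baseline is None:
        baseline = torch.zeros_like(spectrum)            # common choice: all-zero spectrum
    # Spectra interpolated along the straight-line path from baseline to input.
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, 1)
    path = (baseline + alphas * (spectrum - baseline)).detach().requires_grad_(True)
    # Predicted concentration of the chosen analyte at every point on the path.
    preds = model(path)[:, analyte_idx]
    grads = torch.autograd.grad(preds.sum(), path)[0]    # dF/dx at each path point
    avg_grads = grads.mean(dim=0)                        # Riemann approximation of the integral
    return (spectrum - baseline) * avg_grads             # per-point attribution across the spectrum

# Example use (hypothetical objects): attributions with large magnitude flag the
# spectral regions, e.g. specific resonances, that the model relies on for that analyte.
# attributions = integrated_gradients(model, spectrum, analyte_idx=0)
```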