Abstract

In this study, we introduce a new nontargeted, tile-based supervised analysis method that combines the four-grid tiling scheme previously established for Fisher ratio (F-ratio) analysis (FRA) with the estimation of tile hit importance using the machine learning (ML) algorithm Random Forest (RF). This approach is termed tile-based RF analysis. Unlike the standard tile-based F-ratio analysis, the RF approach can be extended to unbalanced data sets, i.e., data sets with different numbers of samples per class. Tile-based RF computes out-of-bag (OOB) tile hit importance estimates for every summed chromatographic signal within each tile on a per-mass-channel (m/z) basis. These estimates are then used to rank tile hits in descending order of importance. In the present investigation, the RF approach was applied to a two-class comparison of stool samples collected from omnivore (O) subjects and stored under two different storage conditions: liquid (Liq) and lyophilized (Lyo). Two final hit lists were generated using balanced (8 vs 8 comparison) and unbalanced (8 vs 9 comparison) data sets and compared to the hit list generated by the standard F-ratio analysis. Similar class-distinguishing analytes (p < 0.01) were discovered by both methods. However, while FRA discovered a more comprehensive hit list (65 hits), the RF approach discovered only hits with concentration ratios, [OLiq]/[OLyo], greater than 2 (or less than 0.5): 31 hits for the balanced data set comparison and 29 hits for the unbalanced data set comparison. This difference is attributed to the more stringent feature selection process used by the RF algorithm. Moreover, our findings suggest that the RF approach is a promising method for identifying class-distinguishing analytes in settings characterized by both high between-class variance and high within-class variance, making it an advantageous method in the study of complex biological matrices.
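
The ranking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which RF software or settings were used, so the sketch swaps in a small bagged ensemble of one-level decision trees (stumps) as a simplified stand-in for a full Random Forest. It assumes each feature column is one summed tile signal at one m/z, uses synthetic values for an unbalanced 8 vs 9 comparison, and all function and variable names are hypothetical.

```python
import random

def fit_stump(X, y, features):
    """Best single-feature threshold split (a one-level tree) over the given features."""
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for sign in (1, -1):
                preds = [1 if sign * (row[f] - thr) > 0 else 0 for row in X]
                acc = sum(p == t for p, t in zip(preds, y)) / len(y)
                if best is None or acc > best[0]:
                    best = (acc, f, thr, sign)
    return best[1:] if best else None

def accuracy(stump, X, y):
    f, thr, sign = stump
    preds = [1 if sign * (row[f] - thr) > 0 else 0 for row in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def oob_permutation_importance(X, y, n_trees=200, seed=0):
    """Mean drop in out-of-bag accuracy when one feature is permuted."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    totals, used = [0.0] * p, 0
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]   # samples left out
        if not oob:
            continue
        # Random feature subset per tree (assumption: about sqrt(p) features).
        features = rng.sample(range(p), max(1, round(p ** 0.5)))
        stump = fit_stump([X[i] for i in boot], [y[i] for i in boot], features)
        if stump is None:
            continue
        X_oob, y_oob = [X[i] for i in oob], [y[i] for i in oob]
        base = accuracy(stump, X_oob, y_oob)
        for f in range(p):
            column = [row[f] for row in X_oob]
            rng.shuffle(column)                          # break the feature/class link
            permuted = [row[:f] + [v] + row[f + 1:] for row, v in zip(X_oob, column)]
            totals[f] += base - accuracy(stump, permuted, y_oob)
        used += 1
    return [t / max(used, 1) for t in totals]

# Synthetic summed tile signals (illustrative values only): 8 Liq vs 9 Lyo samples.
# Tile 0 differs about threefold between classes; tiles 1 and 2 are noise.
data_rng = random.Random(1)
liq = [[data_rng.gauss(300, 30), data_rng.gauss(200, 40), data_rng.gauss(200, 40)]
       for _ in range(8)]
lyo = [[data_rng.gauss(100, 30), data_rng.gauss(200, 40), data_rng.gauss(200, 40)]
       for _ in range(9)]
X, y = liq + lyo, [1] * 8 + [0] * 9

importance = oob_permutation_importance(X, y)
# Rank tile hits in descending order of importance.
ranking = sorted(range(len(importance)), key=lambda f: importance[f], reverse=True)
print(ranking, [round(v, 3) for v in importance])
```

Because each tree is scored only on the samples left out of its own bootstrap draw, nothing in the procedure requires equal class sizes, which is consistent with the 8 vs 9 comparison reported above.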
