Abstract

The term 'Big Data' has recently attracted much attention in science. Working with big data sets can be both challenging and rewarding. The complexity and size of such data sets make their analysis difficult, and their increasing volume requires the development of new practical methods for handling them. In this contribution, we explored the efficient data reduction-multivariate curve resolution (EDR-MCR) strategy, based on convex hull theory, for the quantitative and qualitative analysis of large chemical data sets. For the quantitative example, the potential of the EDR-MCR method for selecting a representative calibration set was investigated, and the results were compared with those of the widely used Kennard–Stone (KS) algorithm. The EDR-MCR strategy strongly reduces the number of calibration samples while retaining high prediction performance. The advantage of EDR-MCR over KS is its ability to find informative variables and eliminate redundant features. Moreover, the EDR-MCR strategy was applied to the qualitative analysis of a large-scale metabolomic data set. The EDR-MCR results were comparable with those of the region of interest (ROI) method, which confirmed the suitability of this method for the quantitative analysis of big mass spectrometry data sets.
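
As a rough illustration of the two sample-selection ideas named above, the following sketch contrasts Kennard–Stone selection with keeping only the convex hull vertices of a point cloud. It is not the authors' EDR-MCR implementation: the two-dimensional toy coordinates, the function names `kennard_stone` and `convex_hull`, and the use of plain Euclidean distance are all assumptions made for illustration. In practice the samples would presumably be represented in some reduced score space rather than by raw 2-D coordinates.

```python
from math import dist

def kennard_stone(points, k):
    """Return indices of k points chosen by the Kennard-Stone rule."""
    n = len(points)
    # Start with the two mutually most distant points.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda p: dist(points[p[0]], points[p[1]]),
    )
    selected = [first, second]
    while len(selected) < k:
        remaining = [i for i in range(n) if i not in selected]
        # Add the candidate whose nearest already-selected point is farthest away.
        nxt = max(
            remaining,
            key=lambda i: min(dist(points[i], points[s]) for s in selected),
        )
        selected.append(nxt)
    return selected

def convex_hull(points):
    """Return indices of the 2-D convex hull vertices (Andrew's monotone chain)."""
    order = sorted(range(len(points)), key=lambda i: points[i])

    def cross(o, a, b):
        (ox, oy), (ax, ay), (bx, by) = points[o], points[a], points[b]
        return (ax - ox) * (by - oy) - (ay - oy) * (bx - ox)

    def half(seq):
        chain = []
        for i in seq:
            # Drop points that would make a clockwise or collinear turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], i) <= 0:
                chain.pop()
            chain.append(i)
        return chain[:-1]

    return half(order) + half(reversed(order))

# Hypothetical toy data: four extreme samples plus four interior ones.
points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 2), (3, 1), (2, 3)]
hull_idx = sorted(convex_hull(points))
ks_idx = sorted(kennard_stone(points, 4))
print("convex hull vertices:", hull_idx)
print("Kennard-Stone (k=4): ", ks_idx)
```

One difference the sketch makes visible is that Kennard–Stone needs the number of samples `k` to be chosen in advance, whereas the hull-based selection determines the size of the subset from the geometry of the data itself.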
