Abstract
Representative samples are important for multivariate calibration. Efficiently selecting the representative samples to be labelled saves money and time. Existing methods, such as Kennard‐Stone and net analyte signal selection, are usually based on the distance between candidate samples and the labelled calibration set in feature space. However, these distances depend on the feature space itself, which is spanned by an information vector extracted from the labelled samples. To overcome this drawback of distance‐based selection, a model performance enhancement‐based sample selection method is proposed to select calibration samples efficiently. Based on loss function optimization, the samples that can improve model performance the most, as estimated by bootstrap, are sequentially selected and added to the calibration set. Because each selected sample is highly representative, a few samples can build a model with no significant loss of prediction ability compared with a model built from a large set of calibration samples. The performance enhancement‐based active learning (PEAL) sample selection method is both effective and efficient.
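The sequential selection loop described above can be illustrated with a minimal sketch. The abstract does not specify the loss function, the calibration model, or how the bootstrap estimate is formed, so everything below is an assumption: a ridge regression stands in for the calibration model, squared error is the assumed loss, and the variance of bootstrap predictions at each candidate is used as a proxy for the expected performance gain (a query-by-bagging style criterion, not necessarily the PEAL criterion). The names `fit_ridge`, `bootstrap_gain`, `select_samples`, and `oracle` are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    """Ridge fit with an intercept column (small penalty, also on the intercept)."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    # The penalty keeps the system solvable on small or degenerate bootstrap samples.
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)

def predict(coef, X):
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ coef

def bootstrap_gain(X_lab, y_lab, X_cand, n_boot, rng):
    """Assumed proxy for expected loss reduction: per-candidate variance of
    predictions across models fitted on bootstrap resamples of the labelled set."""
    n = X_lab.shape[0]
    preds = np.empty((n_boot, X_cand.shape[0]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample labelled rows with replacement
        preds[b] = predict(fit_ridge(X_lab[idx], y_lab[idx]), X_cand)
    return preds.var(axis=0)

def select_samples(X_pool, oracle, init_idx, n_select, n_boot=50, seed=0):
    """Sequentially add the candidate with the largest estimated gain.
    `oracle(i)` stands in for measuring the reference value of sample i."""
    rng = np.random.default_rng(seed)
    labelled = list(init_idx)
    y = {i: oracle(i) for i in labelled}
    for _ in range(n_select):
        cand = [i for i in range(X_pool.shape[0]) if i not in y]
        y_lab = np.array([y[i] for i in labelled])
        gain = bootstrap_gain(X_pool[labelled], y_lab, X_pool[cand], n_boot, rng)
        best = cand[int(np.argmax(gain))]
        labelled.append(best)      # grow the calibration set by one sample
        y[best] = oracle(best)     # label only the selected sample
    return labelled

# Synthetic usage example (made-up data, for illustration only).
data_rng = np.random.default_rng(1)
X_pool = data_rng.normal(size=(60, 3))
y_true = X_pool @ np.array([1.0, -2.0, 0.5]) + 0.05 * data_rng.normal(size=60)
chosen = select_samples(X_pool, lambda i: y_true[i], init_idx=[0, 1, 2, 3, 4], n_select=10)
```

In this sketch only the selected samples are ever passed to `oracle`, which mirrors the cost argument in the abstract: labelling effort is spent one sample at a time, on the candidate judged most likely to improve the model.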