Abstract
Routine analyses that can handle large numbers of samples without costly and time-consuming analytical techniques are of high value. Such routine analyses usually have to be calibrated against detailed analyses that serve as reference values. This calibration requires a representative subset that reflects the complete range of the variables of interest. In this paper, this subset selection problem is tackled for multi-experiment data sets. Conventional techniques such as the Kennard and Stone algorithm and OptiSim are compared to a new approach based on Genetic Algorithms. The challenge here is to find an adequate objective function and to modify the standard crossover and mutation operators so that the number of selected samples stays fixed. These techniques are applied to a data set containing the concentrations of 45 fatty acids, determined by a simplified reference method, in 1033 milk samples stemming from six different experiments. The objective is to select a subset of 100 samples in which each of the six experiments is sufficiently represented. While there is no obvious way to generalize the conventional methods to multi-experiment data sets, this can be accomplished quite easily for Genetic Algorithms by modifying the objective function. Our results indicate that Genetic Algorithms are very capable of handling the subset selection problem for multi-experiment data sets.
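The requirement that crossover and mutation keep the subset size fixed can be illustrated with a minimal sketch. The abstract does not specify the operators used in the paper, so the two functions below (`swap_mutation` and `fixed_size_crossover`, both hypothetical names) are only one plausible way to preserve the subset size, assuming each candidate solution is stored as a list of distinct sample indices and both parents have the same length.

```python
import random


def swap_mutation(subset, n_total, rng):
    """Swap one selected index for one unselected index (size unchanged).

    Assumption: a candidate is a list of distinct indices in range(n_total).
    """
    selected = set(subset)
    unselected = [i for i in range(n_total) if i not in selected]
    if not unselected:  # every sample is already selected; nothing to swap
        return sorted(selected)
    out_idx = rng.choice(sorted(selected))
    in_idx = rng.choice(unselected)
    selected.remove(out_idx)
    selected.add(in_idx)
    return sorted(selected)


def fixed_size_crossover(parent_a, parent_b, rng):
    """Build a child of the same size as the (equally sized) parents.

    Indices shared by both parents are kept; the remaining slots are
    filled at random from indices that occur in exactly one parent.
    """
    k = len(parent_a)
    a, b = set(parent_a), set(parent_b)
    child = a & b
    pool = sorted((a | b) - child)
    # The pool holds 2 * (k - len(child)) indices, so sampling cannot fail.
    child.update(rng.sample(pool, k - len(child)))
    return sorted(child)


# Illustrative sizes taken from the abstract: 100 out of 1033 samples.
rng = random.Random(0)
n_total, k = 1033, 100
p1 = sorted(rng.sample(range(n_total), k))
p2 = sorted(rng.sample(range(n_total), k))
child = swap_mutation(fixed_size_crossover(p1, p2, rng), n_total, rng)
```

Both operators return exactly `k` distinct indices, so any objective function, including one that rewards adequate representation of each experiment, can be evaluated on candidates of constant size.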