Abstract

Metabolomics studies frequently use multiple analytical platforms. The resulting data blocks generally contain partly overlapping information, which chemometric methods can disclose. The metabolites of interest, however, usually make up only a minor part of each data block and are related to a response of interest, such as a quality trait. Multiple data blocks are frequently analyzed simultaneously by concatenating the data matrices. Two main problems may occur with this approach: 1) the number of variables becomes very large relative to the number of observations, which may degrade model performance, and 2) scaling issues between the data blocks need to be resolved. Therefore, a method is proposed that avoids direct concatenation of two data matrices yet still uncovers the shared and distinct parts of the data sets in relation to quality traits. The part of each data block that is relevant to the quality trait of interest is revealed by partial least squares regression on each data block separately. The score vectors of both models that are predictive for the quality trait are then used in a canonical correlation analysis. Highly correlated score vectors indicate parts of the data blocks that are closely related. Inspecting the corresponding loading vectors then reveals the metabolites of interest.
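The two-step procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which PLS algorithm, preprocessing, or number of components was used, so the NIPALS PLS1 variant, the choice of two components per block, the synthetic data, and the helper names `pls1_scores` and `canonical_correlations` are all assumptions made for illustration.

```python
import numpy as np

def pls1_scores(X, y, n_components):
    """NIPALS PLS1 for a single response (assumed variant).
    Returns scores T (n x A) and weight vectors W (p x A)."""
    X = X - X.mean(axis=0)          # column-center the block
    y = y - y.mean()                # center the quality trait
    T, W = [], []
    for _ in range(n_components):
        w = X.T @ y                 # direction of maximal covariance with y
        w = w / np.linalg.norm(w)
        t = X @ w                   # score vector for this component
        p = X.T @ t / (t @ t)       # loading vector
        X = X - np.outer(t, p)      # deflate the block
        y = y - t * (t @ y) / (t @ t)  # deflate the response
        T.append(t)
        W.append(w)
    return np.column_stack(T), np.column_stack(W)

def canonical_correlations(T1, T2):
    """Canonical correlations between two score matrices (QR + SVD)."""
    Q1, _ = np.linalg.qr(T1 - T1.mean(axis=0))
    Q2, _ = np.linalg.qr(T2 - T2.mean(axis=0))
    s = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    return np.clip(s, 0.0, 1.0)     # guard against rounding above 1

# Synthetic example (hypothetical data): one latent source shared by both
# platforms and one source distinct to each, all related to the trait y.
rng = np.random.default_rng(0)
n, p1, p2 = 40, 30, 25
shared = rng.normal(size=n)
distinct1 = rng.normal(size=n)
distinct2 = rng.normal(size=n)
y = shared + 0.5 * distinct1 + 0.5 * distinct2 + 0.1 * rng.normal(size=n)
X1 = (np.outer(shared, rng.normal(size=p1))
      + np.outer(distinct1, rng.normal(size=p1))
      + 0.1 * rng.normal(size=(n, p1)))
X2 = (np.outer(shared, rng.normal(size=p2))
      + np.outer(distinct2, rng.normal(size=p2))
      + 0.1 * rng.normal(size=(n, p2)))

# Step 1: a separate PLS model per block, keeping the trait-predictive scores.
T1, W1 = pls1_scores(X1, y, n_components=2)
T2, W2 = pls1_scores(X2, y, n_components=2)

# Step 2: canonical correlation analysis on the two sets of score vectors.
rho = canonical_correlations(T1, T2)
print("canonical correlations:", rho)
```

In this sketch a canonical correlation close to 1 would point to trait-related variation shared by both blocks, while a low one would point to variation distinct to one block. The blocks are never concatenated, so each PLS model handles its own scaling and variable count.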
