Abstract

Data from high-throughput biological experiments are frequently relative in nature. This implies that the most relevant information lies in the shape of the data distribution over the biological features rather than in the size of the measurements themselves. One well-established way to acknowledge this in statistical processing is logratio analysis. In the current work, we introduce selective pivot logratio coordinates as a new type of orthonormal logratio coordinate representation for high-dimensional relative (a.k.a. compositional) data. The proposal aims to improve the identification of biomarkers in binary classification problems, a common setting for scientific studies in the field. These logratio coordinates are constructed so that the pivot coordinate representing a given compositional part aggregates all pairwise logratios of that part to the remaining parts but, unlike the ordinary formulation, excludes those logratios that deviate from the main pattern. For practical application, this novel coordinate system is embedded within a partial least squares discriminant analysis (PLS-DA) model. Using both synthetic and real-world metabolomic data sets, we demonstrate the enhanced performance of the novel approach compared with other methods used in the area.
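To make the aggregation idea concrete, the following is a minimal sketch of the ordinary (non-selective) pivot coordinate only, assuming its standard textbook form: the log of one part over the geometric mean of the remaining parts, scaled by sqrt((D-1)/D). The function names `pivot_coordinate` and `pivot_from_pairwise` are hypothetical and chosen for illustration. The rule used by the selective variant to exclude deviating logratios is not specified in this abstract, so it is not implemented here.

```python
import math

def pivot_coordinate(x, k=0):
    """Ordinary pivot coordinate of part k (assumed standard form).

    x is a sequence of strictly positive parts of one composition.
    """
    D = len(x)
    rest = [x[j] for j in range(D) if j != k]
    # Log of the geometric mean of all parts other than part k.
    log_gmean_rest = sum(math.log(v) for v in rest) / (D - 1)
    return math.sqrt((D - 1) / D) * (math.log(x[k]) - log_gmean_rest)

def pivot_from_pairwise(x, k=0):
    """Same coordinate written as a scaled sum of pairwise logratios."""
    D = len(x)
    pairwise = [math.log(x[k] / x[j]) for j in range(D) if j != k]
    # A selective variant would drop some of these terms before summing;
    # the exclusion criterion is not given in the abstract.
    return sum(pairwise) / math.sqrt(D * (D - 1))

composition = [0.1, 0.2, 0.3, 0.4]
z_direct = pivot_coordinate(composition)
z_pairwise = pivot_from_pairwise(composition)
# Rescaling every part by the same factor leaves the coordinate unchanged,
# which is the sense in which only relative information is used.
z_rescaled = pivot_coordinate([10 * v for v in composition])
```

Both functions return the same value for any strictly positive input, which shows how the pivot coordinate for one part is an aggregate of that part's pairwise logratios to all the others.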

