Abstract

High-dimensional compositional data, that is, multivariate observations carrying relative information, frequently contain values below a detection limit (rounded zeros). We introduce new model-based procedures for replacing these values with reasonable numbers, so that the completed data set is ready for use with statistical analysis methods that rely on complete data, such as regression or classification with high-dimensional explanatory variables. The procedures respect the geometry of compositional data and can be considered alternatives to existing methods. Simulations show that the proposed methods outperform existing methods, especially in high dimensions. Moreover, even for a large number of rounded zeros, the new methods lead to improved data quality, which is important for further analyses. The usefulness of the procedures is demonstrated using a data example from metabolomics.
