Abstract

Many statistical analyses used to identify significantly different metabolic features assume that metabolomic data are normally distributed. However, despite the thousands of metabolomic publications every year, the distribution of metabolomic data is rarely studied. Using large-scale metabolomic data sets, we performed a comprehensive study of metabolomic data distributions. We showed that metabolic features have diverse data distribution types, and the majority of them cannot be normalized correctly using conventional data transformation algorithms, including log and square root transformations. To understand the various non-normal data distributions, we proposed fitting metabolomic data to nine beta distributions, each representing a distinct distribution type. The results from three large-scale data sets consistently show that two low-normality types are very common. Next, we created the adaptive Box-Cox (ABC) transformation, a novel feature-specific data transformation approach for improving data normality. By tuning a power parameter based on the result of a normality test, ABC transformation works across various data distribution types, and it performed well in normalizing skewed metabolomic data. Tested on a series of simulated data sets in Monte Carlo simulations, ABC transformation outperformed conventional data transformation approaches for both positively and negatively skewed data distributions. ABC transformation was further demonstrated in a real metabolomic study composed of three pairwise comparisons. An additional 84, 44, and 57 significant metabolites were identified after ABC transformation, corresponding to respective increases of 70.6, 13.4, and 22.9% in significant metabolites compared to the conventional metabolomic workflow. Some of these newly discovered metabolites have promising biological relevance. ABC transformation is implemented in the R package ABCstats, which is freely available on GitHub (https://github.com/HuanLab/ABCstats).
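The idea of tuning a power parameter per feature against a normality test can be illustrated with a minimal Python sketch. This is not the ABCstats implementation: the function names `box_cox` and `adaptive_power_transform`, the grid of candidate powers from -3 to 3, and the choice of the Shapiro-Wilk test as the normality criterion are all assumptions made for illustration, and the input is assumed to be strictly positive.

```python
import numpy as np
from scipy import stats


def box_cox(x, lam):
    """Standard Box-Cox power transform for strictly positive values."""
    x = np.asarray(x, dtype=float)
    if abs(lam) < 1e-12:
        # The limit of (x**lam - 1) / lam as lam -> 0 is log(x).
        return np.log(x)
    return (x ** lam - 1.0) / lam


def adaptive_power_transform(x, lambdas=None):
    """Pick the power that maximizes the Shapiro-Wilk p-value for one feature.

    Hypothetical helper: the grid and the normality test are assumptions,
    not the settings used by ABCstats.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lambdas is None:
        # Powers below 1 pull in a right tail; powers above 1 pull in a left tail.
        lambdas = np.linspace(-3.0, 3.0, 61)

    best_lam, best_p = 1.0, -1.0
    for lam in lambdas:
        p = stats.shapiro(box_cox(x, lam))[1]  # index 1 is the p-value
        if p > best_p:
            best_lam, best_p = float(lam), float(p)
    return box_cox(x, best_lam), best_lam, best_p


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature = rng.lognormal(mean=2.0, sigma=0.8, size=200)  # positively skewed
    transformed, lam, p = adaptive_power_transform(feature)
    print("chosen power:", lam, "Shapiro-Wilk p-value:", p)
```

Because the grid includes powers near 0, 0.5, and 1, the selected transform should be at least about as normal (by this test) as the log-transformed, square-root-transformed, or untransformed feature, which is the sense in which a feature-specific power can improve on a single fixed transformation.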
