Abstract

Background

The promise of modern personalized medicine is to use molecular and clinical information to better diagnose, manage, and treat disease, on an individual patient basis. These functions are predominantly enabled by molecular signatures, which are computational models for predicting phenotypes and other responses of interest from high-throughput assay data. Data-analytics is a central component of molecular signature development and can jeopardize the entire process if conducted incorrectly. While exploratory data analysis may tolerate suboptimal protocols, clinical-grade molecular signatures are subject to vastly stricter requirements. Closing the gap between standards for exploratory versus clinically successful molecular signatures entails a thorough understanding of possible biases in the data analysis phase and developing strategies to avoid them.

Methodology and Principal Findings

Using a recently introduced data-analytic protocol as a case study, we provide an in-depth examination of the poorly studied biases of the data-analytic protocols related to signature multiplicity, biomarker redundancy, data preprocessing, and validation of signature reproducibility. The methodology and results presented in this work are aimed at expanding the understanding of these data-analytic biases that affect development of clinically robust molecular signatures.

Conclusions and Significance

Several recommendations follow from the current study. First, all molecular signatures of a phenotype should be extracted to the extent possible, in order to provide comprehensive and accurate grounds for understanding disease pathogenesis. Second, redundant genes should generally be removed from final signatures to facilitate reproducibility and decrease manufacturing costs. Third, data preprocessing procedures should be designed so as not to bias biomarker selection. Finally, molecular signatures developed and applied on different phenotypes and populations of patients should be treated with great caution.

Highlights

  • The promise of personalized medicine is to use molecular and clinical information to better diagnose, manage, and treat disease on an individual patient basis. These functions are predominantly enabled by molecular signatures, which are computational models for predicting phenotypes and other responses of interest from high-throughput assay data

  • The conclusions of the present study extend well beyond the development of a gene expression-based molecular signature of acute respiratory viral infections; the results readily generalize to other protocols, phenotypes, and assay platforms

  • A simulation study demonstrating data-analytic biases related to signature multiplicity and biomarker redundancy


