Depth normalization of small RNA sequencing: using data and biology to select a suitable method.

Yannick Düren, Li-Xuan Qin, Johannes Lederer

doi:10.1093/nar/gkac064


Open Access


Abstract

Deep sequencing has become one of the most popular tools for transcriptome profiling in biomedical studies. While an abundance of computational methods exists for ‘normalizing’ sequencing data to remove unwanted between-sample variations due to experimental handling, there is no consensus on which normalization is the most suitable for a given data set. To address this problem, we developed ‘DANA’, an approach for assessing the performance of normalization methods for microRNA sequencing data based on biology-motivated and data-driven metrics. Our approach takes advantage of well-known biological features of microRNAs, namely their expression patterns and chromosomal clustering, to simultaneously assess (i) how effectively normalization removes handling artifacts and (ii) how aptly normalization preserves biological signals. With DANA, we confirm that the performance of eight commonly used normalization methods varies widely across different data sets, and we provide guidance for selecting a suitable method for the data at hand. Hence, DANA should be adopted as a routine preprocessing step that precedes normalization in microRNA sequencing data analysis. DANA is implemented in R and publicly available at https://github.com/LXQin/DANA.
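To make the idea of depth normalization concrete, the following is a minimal sketch of one simple scheme, total-count scaling to counts per million. It is an illustration only and is not taken from DANA: the function name `total_count_normalize`, the sample names, the microRNA labels, and the read counts are all hypothetical, and the abstract does not state whether this scheme is among the eight methods assessed.

```python
def total_count_normalize(counts, scale=1_000_000):
    """Rescale each sample so that its counts sum to `scale`.

    `counts` maps a sample name to a dict of {feature: raw read count}.
    With the default scale the result is counts per million (CPM).
    """
    normalized = {}
    for sample, feature_counts in counts.items():
        # Sequencing depth of this sample: total reads over all features.
        depth = sum(feature_counts.values())
        if depth == 0:
            raise ValueError(f"sample {sample!r} has zero total reads")
        normalized[sample] = {
            feature: count * scale / depth
            for feature, count in feature_counts.items()
        }
    return normalized


# Made-up toy data: sample_B was sequenced twice as deeply as sample_A,
# but the relative abundance of each microRNA is identical.
toy_counts = {
    "sample_A": {"miR-1": 100, "miR-2": 300, "miR-3": 600},
    "sample_B": {"miR-1": 200, "miR-2": 600, "miR-3": 1200},
}

cpm = total_count_normalize(toy_counts)
```

In this toy case the two samples differ only in sequencing depth, so their normalized profiles coincide. Real data sets rarely behave so cleanly, which is why the choice among competing normalization methods has to be assessed per data set.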
