Abstract

High-throughput mRNA sequencing, known as RNA-seq, is a powerful approach for detecting differentially expressed genes from millions of short sequence reads. Although several workflows have been proposed to analyze RNA-seq data, quality control of the experiment as a whole is not usually considered, which can bias the results and/or cause information loss. Experiment quality control refers to the analysis of the experiment as a whole, prior to any other analysis. It inspects not only the presence of technical effects, but also whether general biological assumptions are fulfilled. Multivariate approaches are therefore crucial for this task.

Here, a multivariate approach for quality control in RNA-seq experiments is proposed. This approach uses simple yet effective, well-known statistical methodologies. In particular, Principal Component Analysis was successfully applied to real data to detect and remove outlier samples. In addition, traditional multivariate exploration tools were applied in order to assess several controls that can help ensure the quality of the results. Based on differential expression and functional enrichment analysis, it is demonstrated that information retrieval is significantly enhanced through experiment quality control. The results show that the proposed multivariate approach increases the information obtained from RNA-seq data after outlier samples are removed.
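To make the idea of PCA-based outlier screening concrete, the following is a minimal sketch and not the authors' actual procedure, which the abstract does not specify. It assumes a samples-by-genes matrix of raw counts, a log2(x + 1) transform, and a robust z-score rule (median absolute deviation, cutoff 3.5) on each sample's distance in the plane of the first two principal components. The function names `pca_scores` and `flag_outlier_samples`, the transform, and the cutoff are all illustrative assumptions.

```python
import numpy as np


def pca_scores(counts, n_components=2):
    """Project samples onto the leading principal components.

    counts: array of shape (n_samples, n_genes) holding raw read counts.
    Returns an array of shape (n_samples, n_components).
    """
    # Assumed preprocessing: log2(x + 1) to compress the range of count data.
    logged = np.log2(np.asarray(counts, dtype=float) + 1.0)
    # Center each gene so the PCA describes variation between samples.
    centered = logged - logged.mean(axis=0)
    # PCA via SVD: the sample scores are the columns of U scaled by S.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def flag_outlier_samples(scores, threshold=3.5):
    """Flag samples that lie unusually far from the bulk in PC space.

    Uses a robust z-score of each sample's distance to the median score.
    The threshold is a conventional choice, not one taken from the paper.
    """
    center = np.median(scores, axis=0)
    dist = np.linalg.norm(scores - center, axis=1)
    med = np.median(dist)
    mad = np.median(np.abs(dist - med))
    if mad == 0:
        # No spread in the distances: nothing can be called an outlier.
        return np.zeros(len(dist), dtype=bool)
    robust_z = 0.6745 * (dist - med) / mad
    return robust_z > threshold


# Synthetic demonstration: 10 samples, 200 genes, one sample scaled 8-fold.
rng = np.random.default_rng(0)
demo_counts = rng.poisson(100, size=(10, 200))
demo_counts[7] = demo_counts[7] * 8
demo_flags = flag_outlier_samples(pca_scores(demo_counts))
print("flagged sample indices:", np.flatnonzero(demo_flags))
```

In practice, flagged samples would be inspected (for example against library size or batch annotations) before being removed, since a sample can separate on the first components for biological as well as technical reasons.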
