Abstract

Expression profiling is a popular tool for studying gene expression levels, but the origins and data quality of libraries are often poorly annotated. Experimental techniques, library annotations and analysis algorithms vary between laboratories and may contain errors. Traditional analysis methods, including research into tissue-specific expression, assume that expression levels are correct and that libraries are correctly annotated, which is not always the case. Tools capable of assessing the quality of multiple types of expression data using the data alone would therefore be invaluable for quality control and for establishing whether the data are suitable for expression analysis. Here we compare and review over 20 methods and focus on a number of key developments in the field. We also highlight recently devised quality control methods and show examples of applying the newly developed quality control expression matrices (QCEM) to the analysis and quality control of SAGE data. The examples include elucidating the correct tissue identity of a library, and they show that the disease state of expression libraries created using a range of expression profiling methods might be easily elucidated. These novel quality control methods address key shortcomings of previously reported tools and provide a universal quality control method for multiple types of expression data.

Highlights

  • mRNA expression profiling is well established, and a number of techniques are employed for acquiring and analysing data on a sample’s transcriptome or for studying differential gene expression

  • While reverse transcription, which is carried out to prepare the sample for amplification, in theory should result in one cDNA molecule for each original mRNA molecule, in practice some mRNA may not undergo the full set of reactions, introducing a bias into the sample in favour of the fully converted cDNAs [7]

  • Expression profiling algorithms were previously found to contain errors, correction of which would ensure the results from investigations into differential gene expression are no longer affected by such problems

Introduction

mRNA expression profiling is well established, and a number of techniques are employed for acquiring and analysing data on a sample’s transcriptome or for studying differential gene expression. The abundance of data and the absence of identifiable data quality indicators call for stringent quality control methods, e.g. for confirming correct annotation, for establishing the identity of each library independently of its annotation, and for assessing the quality of the underlying data itself (such as bulk non-normalised preparations) or of the methods used. Such tools would help to resolve many errors in expression data annotations, such as tissue of origin, disease state or the protocol used to prepare the library, because even a trivial error in, for instance, library origin might completely invalidate the selection of data for analysis and the results obtained. RNA-Seq, which uses next-generation sequencing, provides a much improved depth of sequencing, but artefacts can still be introduced into the results, requiring quality control [10].

