Abstract

Serial analysis of gene expression (SAGE) is one of the most powerful tools for global gene expression profiling. It has led to several biological discoveries and biomedical applications, such as the prediction of new gene functions and the identification of biomarkers in human cancer research. Clustering techniques have become fundamental approaches in these applications. This paper reviews relevant clustering techniques specifically designed for this type of data. It emphasises current limitations and opportunities in this area for supporting biologically meaningful data mining and visualisation.

Highlights

  • Serial analysis of gene expression (SAGE) [1] is one of the most powerful, high-throughput tools available for global gene expression profiling at the mRNA level

  • The output of SAGE-based analysis is the digital measurement of absolute RNA abundance levels, greatly facilitating direct and reliable comparison of expression profiles produced by different experiments and laboratories [3,4]

  • Such unique features have led to many important applications in a wide variety of studies, such as the discovery of potential transcriptional regulators and construction of biological networks [5], the identification of novel molecular tumour markers and therapeutic targets [6], the study of the molecular profile of gastroesophageal junction carcinomas [7] and the genomic analysis of mouse retinal development [8]


Summary

Background

Serial analysis of gene expression (SAGE) [1] is one of the most powerful, high-throughput tools available for global gene expression profiling at the mRNA level. Based on the observation that genes exhibiting similar expression patterns are more likely to be co-regulated and to share similar biological functions [32], clustering-based SAGE data analysis has found a range of applications, for example, the identification of biomarkers in human cancer research [33], the discovery of cell-specific promoter modules [34], and a better understanding of transcriptional networks [35,36]. Such applications mainly rely on the following traditional clustering techniques. It is important to recognize that, to obtain statistically reliable and biologically meaningful results, both internal and external validation techniques are recommended for assessing clustering outcomes [30].
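
To make the distinction between internal and external validation concrete, the following minimal sketch scores a toy clustering in both ways. Everything in it is an illustrative assumption rather than a method taken from the review: the tag-count profiles, the cluster assignment and the reference grouping are invented, the function names are hypothetical, and plain Euclidean distance is used only for simplicity (it is not claimed to be the appropriate distance for SAGE counts). The mean silhouette width serves as an example of an internal index, and the Rand index as an example of an external one.

```python
from itertools import combinations
from math import dist


def mean_silhouette(points, labels):
    """Internal validation: mean silhouette width over all points."""
    scores = []
    for i, p in enumerate(points):
        # Collect distances from point i to every other point, grouped by cluster.
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            # Convention: singleton clusters (or a single cluster) score 0.
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                 # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())   # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def rand_index(labels_a, labels_b):
    """External validation: fraction of item pairs on which two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical tag counts: six tags measured across four SAGE libraries.
profiles = [
    (10, 12, 1, 0), (11, 13, 0, 1), (9, 11, 2, 1),
    (0, 1, 20, 22), (1, 0, 19, 21), (2, 1, 21, 20),
]
clusters = [0, 0, 0, 1, 1, 1]    # assumed output of some clustering algorithm
reference = [0, 0, 1, 1, 1, 1]   # assumed external grouping, e.g. functional classes

print("mean silhouette (internal):", round(mean_silhouette(profiles, clusters), 3))
print("Rand index (external):", round(rand_index(clusters, reference), 3))
```

A high silhouette value only indicates that the clusters are compact and well separated in the chosen distance; the Rand index shows how far the same partition agrees with independent knowledge. The two can disagree, which is one reason to report both kinds of index.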
