Abstract

BackgroundStudies that aim at explaining phenotypes or disease susceptibility by genetic or epigenetic variants often rely on clustering methods to stratify individuals or samples. While statistical associations may point at increased risk for certain parts of the population, the ultimate goal is to make precise predictions for each individual. This necessitates tools that allow for the rapid inspection of each data point, in particular to find explanations for outliers.ResultsACES is an integrative cluster- and phenotype-browser, which implements standard clustering methods, as well as multiple visualization methods in which all sample information can be displayed quickly. In addition, ACES can automatically mine a list of phenotypes for cluster enrichment, whereby the number of clusters and their boundaries are estimated by a novel method. For visual data browsing, ACES provides a 2D or 3D PCA or Heat Map view. ACES is implemented in Java, with a focus on a user-friendly, interactive, graphical interface.ConclusionsACES has been proven an invaluable tool for analyzing large, pre-filtered DNA methylation data sets and RNA-Sequencing data, due to its ease to link molecular markers to complex phenotypes. The source code is available from https://github.com/GrabherrGroup/ACES.

Highlights

  • ResultsACES is an integrative cluster- and phenotype-browser, which implements standard clustering methods, as well as multiple visualization methods in which all sample information can be displayed quickly

  • One fundamental challenge in modern biology and medicine is to divide samples into distinct categories, often cases and controls, based on the measurements of biomarkers in the wider sense [1]

  • It has been shown that DNA methylation (DNAm) modification of certain sites are directly linked to cancer [17]

Read more

Summary

Results

DNA methylation DNA methylation (DNAm) is an epigenetic mechanism that can control gene expression. Clusters found by hierarchical, k-means and DBSCAN algorithms are shown in each row. Hierarchical and k-means categorize them into different groups, while DBSCAN considers them as outliers, according to the parameters that are automatically computed by ACES. For Distance Matrix 3, shown in the bottom row, the three clustering algorithms generate different results, as there are no clear boundaries among groups. While k-means (Fig. 6a middle) groups the samples in two clusters, these two samples are classified as outliers in the DBSCAN clustering results based on the parameters that were automatically calculated by ACES (Fig. 6a right).

Conclusions
Introduction
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call