Nearest Neighbor Networks: clustering expression data based on gene neighborhoods.

Curtis Huttenhower, Nathan O Siemers, Sauhard Sahi, Avi I Flamholz, Chad L Myers, Kellen L Olszewski, Matthew A Hibbs, Jessica N Landis, Hilary A Coller, Olga G Troyanskaya

doi:10.1186/1471-2105-8-250

Open Access

https://doi.org/10.1186/1471-2105-8-250


Abstract

Background

The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes).

Results

We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods.

Conclusion

The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the analysis of large datasets, and its ability to span a wide range of biological functions with high precision.

Highlights

We report a clustering algorithm based on shared nearest neighbors, called Nearest Neighbor Networks (NNN), intended as a useful tool for biologists discovering functional activity in coexpression data sets.
NNN succeeds in producing small, precise clusters from coexpression data, and these clusters generally span a wider variety of biological processes than those produced by the other clustering algorithms evaluated.

Summary

Results

We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. NNN was successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods.
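The idea described above (link genes that are mutual nearest neighbors, then build clusters from overlapping cliques) can be illustrated with a small Python sketch. This is not the authors' implementation: the use of Pearson correlation as the similarity measure, the parameter names `k` and `size`, the rule that cliques sharing any gene are merged, and the toy expression profiles are all assumptions made for illustration. The brute-force clique search is only practical for tiny inputs.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mutual_knn_graph(profiles, k):
    """Connect two genes only if each is among the other's k most similar genes."""
    genes = list(profiles)
    nearest = {}
    for g in genes:
        ranked = sorted(
            (h for h in genes if h != g),
            key=lambda h: pearson(profiles[g], profiles[h]),
            reverse=True,
        )
        nearest[g] = set(ranked[:k])
    return {g: {h for h in nearest[g] if g in nearest[h]} for g in genes}


def cliques_of_size(graph, size):
    """Brute-force search for fully connected gene groups of the given size."""
    return [
        set(combo)
        for combo in combinations(sorted(graph), size)
        if all(b in graph[a] for a, b in combinations(combo, 2))
    ]


def merge_overlapping(cliques):
    """Merge cliques that share at least one gene (assumed merge rule)."""
    clusters = []
    for clique in cliques:
        merged = set(clique)
        for existing in [c for c in clusters if c & merged]:
            merged |= existing
            clusters.remove(existing)
        clusters.append(merged)
    return clusters


# Toy data (hypothetical): two coexpressed groups plus one gene, "g",
# that has no sufficiently similar partner.
profiles = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8.5],
    "c": [1, 2, 3.2, 3.9],
    "d": [4, 3, 2, 1],
    "e": [8, 6, 4.5, 2],
    "f": [3.9, 3.1, 2, 1.2],
    "g": [1, 5, 1, 5],
}

graph = mutual_knn_graph(profiles, k=2)
clusters = merge_overlapping(cliques_of_size(graph, size=3))
clustered = set().union(*clusters)
unclustered = sorted(set(profiles) - clustered)
```

In this toy input, gene "g" ranks "a" and "b" among its nearest neighbors, but neither ranks "g" in return, so the mutual requirement leaves it without edges and it stays out of every cluster. That is the behavior the text attributes to requiring mutual nearest neighbors.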

Journal: BMC bioinformatics	Publication Date: Jul 12, 2007
Citations: 93	License type: cc-by
