Abstract
Analysis of the large amounts of data accumulated in public databanks can facilitate a more comprehensive understanding of molecular biological processes. Community detection from molecular biological data is paramount in characterizing the evolutionary and functional traits of organisms based on gene homology and co-expression, respectively. Although common tools exist for detecting local communities in a large network, no toolkit detects communities that include an element of interest in a size-sensitive manner, i.e., with functionality to obtain local communities of preferred sizes. Herein, we present the ConfeitoGUI toolkit for size-sensitive detection of local communities from a correlation network. We compared the toolkit with other common community detection tools in reconstructing communities of mouse microarray experiments. In the results, ConfeitoGUI was observed to perform better than the other tools at detecting communities whose sizes are similar to those of the original communities. By changing simple size parameters of the toolkit, a user can obtain local communities with preferred sizes, which is beneficial for further analysis of the members belonging to those communities.
Highlights
In the era of big data, biologists encounter challenges in handling, processing, and moving such data obtained via high-throughput technologies [1]
We developed a standalone toolkit, ConfeitoGUI, to identify network modules within correlation networks in a size-sensitive manner by expanding the Confeito algorithm [6,22] and integrating vertex-vertex connections based on the algorithm
We compared ConfeitoGUI’s accuracy with that of other local community identification methods using network modules from a large mouse microarray dataset comprising results from 37,013 Affymetrix mouse microarray samples, obtained from the Gene Expression Omnibus (GEO) of the National Center for Biotechnology Information (NCBI) in April 2014 (S1 Table)
Summary
In the era of big data, biologists encounter challenges in handling, processing, and moving the data obtained via high-throughput technologies [1]. Clauset et al. [2] suggested that approaches using network graphs are useful for social science as well as for biochemistry and molecular biology. This was followed by various approaches for detecting local communities in large networks, such as those by Newman et al. [3] and Blondel et al. [4]. In these approaches, the modularity index is used to assess how reasonable the communities produced by each algorithm are.
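The summary does not state the modularity index itself. As a rough illustration, the sketch below computes the standard Newman-Girvan modularity of a partition of a small undirected, unweighted graph, Q = sum over communities c of (L_c / m) - (d_c / 2m)^2, where m is the number of edges, L_c is the number of edges inside c, and d_c is the total degree of the nodes in c. The function name, the toy graph, and the partition are assumptions made for this example; they are not part of ConfeitoGUI or of the cited methods.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q for an undirected, unweighted graph.

    edges        -- list of (u, v) pairs, each undirected edge listed once
    community_of -- dict mapping every node to its community label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    degree = defaultdict(int)    # k_i for every node
    internal = defaultdict(int)  # L_c: edges with both ends in community c
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1

    degree_sum = defaultdict(int)  # d_c: total degree of nodes in community c
    for node, k in degree.items():
        degree_sum[community_of[node]] += k

    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy network: two triangles joined by a single bridge edge.
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

print(round(modularity(toy_edges, two_groups), 4))  # 5/14, about 0.3571
print(modularity(toy_edges, one_group))             # 0.0
```

Splitting the toy graph into its two triangles scores higher than placing every node in one community, which is the sense in which modularity indicates how reasonable a division is. Note that modularity scores a whole partition and says nothing about whether an individual community has a preferred size.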