Abstract
Genome-wide coexpression analysis of bulk human brain samples is a powerful approach for identifying highly reproducible transcriptional signatures of cell types and cell states, since it can survey huge numbers of individuals, cells, and transcripts. However, it is difficult to optimize gene coexpression network construction, compare gene coexpression modules from independent datasets, or even locate gene expression datasets with similar attributes. To address these challenges, we have developed the OMICON platform (theomicon.ucsf.edu) to promote FAIR (Findable, Accessible, Interoperable, Reusable) research practices involving human brain gene coexpression networks. OMICON contains structured gene expression data for >17K bulk human brain samples (~10K normal and ~7K malignant glioma), which were collected from diverse public repositories and consortia (e.g., GEO, TCGA, CGGA, CPTAC, REMBRANDT, IvyGAP, GTEx, NABEC, the Allen Institute). To promote interoperability, we have standardized metadata for hundreds of dataset and sample attributes, including single-nucleotide and copy-number mutations. For each dataset, we have used the FindModules R function to construct dozens of gene coexpression networks by iterating over module detection parameters. These efforts have identified ~100K gene coexpression modules, which have been extensively characterized via enrichment analysis with thousands of curated gene sets. All datasets, metadata, gene coexpression networks, enrichment results, and analysis steps can be browsed with an interactive workflow visualization tool, which promotes accessibility, reusability, and reproducibility by maintaining complete data provenance with unique identifiers.
To promote findability, we have built an advanced search engine that identifies datasets, samples, modules, and more by filtering standardized metadata (e.g., finding gene coexpression modules in human glioblastoma datasets that are significantly enriched with markers of T-cells). OMICON’s functionality is described in detail on our Help page: theomicon.ucsf.edu/home/help. Through this functionality, OMICON seeks to build community and focus therapeutic efforts around reproducible analyses of transcriptional variation in normal human brain and brain tumors.