Abstract
In 2005, we introduced the Gene Set Enrichment Analysis (GSEA) method to identify activated biological pathways and processes in molecular data and to estimate their significance. With over 300,000 registered users and contributions to the analyses in more than 30,000 publications, GSEA has become ubiquitous in gene expression analysis, particularly in cancer research. The power of GSEA relies, in large part, on the Molecular Signatures Database (MSigDB), which provides tens of thousands of expertly curated and annotated gene sets representing cellular processes, canonical pathways, response signatures, etc., derived from prior studies, public data, and curated pathway databases. Here we describe an important recent addition to the GSEA-MSigDB resource. Historically, the resource has focused specifically on human data, offering guidance and support for the analysis of model organism data only through a limited set of orthology mapping files. In recognition of the importance of the mouse as a model organism for cancer research, we have recently expanded our support for mouse data significantly, in two ways. (1) We introduced Mouse MSigDB, a collection of about 15,000 gene sets that are provided in the native mouse gene symbol space and are derived from published mouse datasets and mouse-focused resources. This allows mouse datasets to be analyzed with GSEA without the need for orthology conversion. (2) For investigators who wish to use MSigDB’s human gene sets to analyze mouse data, or vice versa, we also implemented a new orthology mapping procedure that converts the gene identifiers in an input dataset to match the identifiers of the gene sets. The new mapping considers the particular requirements of gene set enrichment analysis and incorporates high-confidence gene ortholog data from the Alliance of Genome Resources, with ortholog-match refinement using Ensembl’s ortholog datasets.
Together, these additions are a substantial step toward making the mouse a first-class citizen of the GSEA-MSigDB ecosystem. The new Mouse MSigDB and the full GSEA-MSigDB resource are available at gsea-msigdb.org.
Citation Format: Anthony S. Castanza, Jill M. Recla, David Eby, Alexander T. Wenzel, Helga Thorvaldsdottir, Carol J. Bult, Jill P. Mesirov. Extended support for model organisms: The next frontier for the molecular signatures database [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 6568.