Ontology-driven integrative analysis of omics data through Onassis

Eugenia Galeota, Mattia Pelizzola, Kamal Kishore

doi:10.1038/s41598-020-57716-1

Open Access

https://doi.org/10.1038/s41598-020-57716-1

Abstract

Public repositories of large-scale omics datasets represent a valuable resource for researchers. In fact, data re-analysis can either answer novel questions or provide critical data able to complement in-house experiments. However, despite the development of standards for the compilation of metadata, the identification and organization of samples still constitutes a major bottleneck hampering data reuse. We introduce Onassis, an R package within the Bioconductor environment providing key functionalities of Natural Language Processing (NLP) tools. Leveraging biomedical ontologies, Onassis greatly simplifies the association of samples from large-scale repositories to their representation in terms of ontology-based annotations. Moreover, through the use of semantic similarity measures, Onassis hierarchically organizes the datasets of interest, thus supporting the semantically aware analysis of the corresponding omics data. In conclusion, Onassis leverages NLP techniques, biomedical ontologies, and the R statistical framework, to identify, relate, and analyze datasets from public repositories. The tool was tested on various large-scale datasets, including compendia of gene expression, histone marks, and DNA methylation, illustrating how it can facilitate the integrative analysis of various omics data.

Highlights

The plummeting cost of high-throughput sequencing experiments has led to a rapid accumulation of omics datasets in public repositories.
The use of biomedical ontologies is typically restricted to the computer science domain; with the exception of the popular Gene Ontology, they rarely reach the community of biologists, who would greatly benefit from their support.
Through a process known as named entity recognition, Onassis associates free-text descriptions of publicly available samples with ontology concepts, which give the entities of a given domain of interest a standard representation.
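
As a rough illustration of the named entity recognition step described in the last highlight, the Python sketch below matches free-text sample descriptions against a small dictionary of concept labels. It is not the Onassis implementation (Onassis is an R package); the concept IDs, labels, sample descriptions, and the `annotate` helper are all hypothetical placeholders chosen for this example.

```python
import re

# Hypothetical mini-dictionary: concept ID -> labels and synonyms.
# IDs and labels are illustrative placeholders, not real ontology entries.
CONCEPTS = {
    "EX:0001": ["breast cancer", "breast carcinoma"],
    "EX:0002": ["hela", "hela cell"],
    "EX:0003": ["liver"],
}


def annotate(text, concepts=CONCEPTS):
    """Return the sorted concept IDs whose labels occur in the text as whole words."""
    found = set()
    lowered = text.lower()
    for concept_id, labels in concepts.items():
        for label in labels:
            # Word boundaries avoid matching a label inside a longer word.
            if re.search(r"\b" + re.escape(label) + r"\b", lowered):
                found.add(concept_id)
                break
    return sorted(found)


# Hypothetical free-text sample descriptions, as found in repository metadata.
samples = {
    "sample_A": "ChIP-seq of H3K27ac in HeLa cells",
    "sample_B": "RNA-seq, primary breast carcinoma tissue",
    "sample_C": "untreated control",
}

annotations = {sample_id: annotate(desc) for sample_id, desc in samples.items()}
for sample_id, concept_ids in annotations.items():
    print(sample_id, concept_ids)
```

A plain dictionary lookup like this ignores spelling variants, abbreviations, and negation, which is part of why dedicated NLP tooling is used in practice.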

Summary

Onassis Description

Onassis is available as a package within the R/Bioconductor project[14], a very popular software repository for the analysis of genomic data, used by both bioinformaticians and biologists. Once semantic information is associated with the samples (for example, by annotating sample metadata with cell lines and disease conditions), Onassis uses it within the compare function to direct the analysis of the actual omics data (Fig. 1). This requires that the omics data be stored in a score matrix whose rows represent genomic units and whose columns represent samples. The following use cases illustrate these analyses in detail.
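
To make the score-matrix layout concrete, here is a minimal Python sketch, assuming a toy matrix with genomic units as rows and samples as columns, plus one semantic annotation per sample. It does not reproduce the Onassis compare function, which is part of the R package; the helper names (`group_columns`, `mean_difference`), the sample and condition labels, and the choice of a difference of group means as the comparison are all assumptions made for illustration.

```python
from statistics import mean

# Column order of the score matrix (hypothetical sample identifiers).
samples = ["s1", "s2", "s3", "s4"]

# Score matrix: each row is a genomic unit, each column a sample.
score_matrix = {
    "gene_1": [1.0, 3.0, 8.0, 10.0],
    "gene_2": [5.0, 5.0, 4.0, 6.0],
}

# Semantic annotation assigned to each sample (hypothetical labels).
annotation = {
    "s1": "condition_X",
    "s2": "condition_X",
    "s3": "condition_Y",
    "s4": "condition_Y",
}


def group_columns(samples, annotation):
    """Map each annotation to the column indices of the samples carrying it."""
    groups = {}
    for index, sample in enumerate(samples):
        groups.setdefault(annotation[sample], []).append(index)
    return groups


def mean_difference(score_matrix, groups, a, b):
    """Per genomic unit, mean score in group b minus mean score in group a."""
    return {
        unit: mean(row[i] for i in groups[b]) - mean(row[i] for i in groups[a])
        for unit, row in score_matrix.items()
    }


groups = group_columns(samples, annotation)
diff = mean_difference(score_matrix, groups, "condition_X", "condition_Y")
print(groups)
print(diff)
```

The point of the layout is that the annotations select columns: once samples are grouped by concept, any per-row statistic can be computed between groups.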

Use Cases

Discussion

Additional information

Journal: Scientific Reports	Publication Date: Jan 20, 2020
Citations: 10	License type: open-access
