FANTOM Consortium Research Articles

Beginning in 1995, early Internet pioneers proposed Digital Objects as encapsulations of data and metadata made accessible through persistent identifier resolution services (Kahn and Wilensky 2006). In recent years, this Digital Object Architecture has been extended to include the FAIR Guiding Principles (Wilkinson et al. 2016), resulting in the concept of a FAIR Digital Object (FDO), a minimal, uniform container making any digital resource machine-actionable. Intense effort is currently underway by a global community of experts to clarify definitions around an FDO Framework (FDOF) and to provide technical specifications (FAIR DO group 2020, FAIR Digital Object Forum 2020, Bonino da Silva Santos 2021) regarding their potential implementation. Beginning in 2009, nanopublications were independently conceived (Groth et al. 2010) as a minimal, uniform container making individual semantic assertions and their associated provenance metadata machine-actionable. They represent minimal units of structured data as citable entities (Mons and Velterop 2009). A nanopublication consists of an assertion, the provenance of the assertion, and the provenance of the nanopublication (publication info). Nanopublications are implemented in and aligned with Semantic Web technologies such as RDF, OWL, and SPARQL (World Wide Web Consortium (W3C) 2015) and can be permanently and uniquely identified using resolvable Trusty URIs (Groth et al. 2021). The existing Nanopublication Server Network provides vital services orchestrating nanopublications (Kuhn et al. 2021), including identifier resolution, storage, search, and access. Nanopublications can be used to expose quantitative and qualitative data, as well as hypotheses, claims, negative results, and opinions that are typically unavailable as structured data or go unpublished altogether.
The first practical application of nanopublications occurred in 2014, with the publication of millions of nanopublications as part of the FANTOM5 Project (The FANTOM Consortium and the RIKEN PMI and CLST (DGT) 2014, Lizio et al. 2015). Since then, millions of real-world examples spanning diverse knowledge domains have become available on the nanopublication server network. Like nanopublications, the FDOF posits an ultra-minimal approach to structured, self-contained, machine-readable data and metadata. An FDO consists of: the object itself (subsequently referred to here as the resource to avoid confusion with other meanings of the term “object”); the metadata describing the resource; and a globally unique and persistent identifier with predictable resolution behaviors. These two technologies share the same vision of a data infrastructure, and act as instances of Machine-Actionable Containers (MACs) that make use of minimal uniform standards to enable FAIR operations. Here, we compare the structure and computational behaviors of the existing nanopublication infrastructure to those in the proposed FAIR Digital Object Framework. Although they were developed independently, there are clear parallels between the vision and the approach of nanopublications and the FDOF. Each aspires to minimal standards for the encapsulation of digital information into free-standing, publishable (citable, referenceable) entities. The minimal standards involve globally unique and persistent identifiers that resolve to standardized, semantically enabled metadata descriptions that include machine-actionable paths to the resource itself. At the same time, there are also differences. The scope of nanopublications is limited to the assertional data type and, as the name suggests, nanopublications should remain small in size (limited to single assertions as individual triples or small RDF graphs).
In contrast, FDOs are unlimited in their scope, accommodating digital resources of arbitrarily large size, type, and complexity, so long as their type can be ontologically described. Furthermore, whereas nanopublications represent a moderately mature technology, the FDOF is a specification still under development. If it were possible to formally draw points of contact between the two approaches, then it would be possible to leverage the vast practical experience gained in the nanopublishing of assertions for the FDO community. Here, inspired by recent applications of nanopublications in the FIP Wizard tool (Schultes et al. 2020), and their extension to research claims (Kuhn 2022, McNamara 2022) and data (Schultes 2022a, Schultes 2022b), we attempt a point-by-point comparison of the specifications of nanopublications and FDOs. We find a remarkable congruence between the currently proposed FDO requirements and the existing nanopublication infrastructure, including several FDO-like qualities already embodied in the nanopublication ecosystem.
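The structural parallel described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper or from any nanopublication or FDO library: the class names, field names, the `as_fdo` helper, and the `example.org` identifier are all hypothetical, and the mapping shown (assertion as resource, both provenance parts as metadata) is only one plausible reading of the comparison.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple

# A triple is modelled as plain strings; a real nanopublication would use RDF terms.
Triple = Tuple[str, str, str]


@dataclass
class Nanopublication:
    """The three parts named in the abstract, plus an identifier (hypothetical model)."""
    identifier: str                    # stands in for a resolvable Trusty URI
    assertion: Tuple[Triple, ...]      # the assertion: one triple or a small graph
    provenance: Dict[str, str]         # provenance of the assertion
    publication_info: Dict[str, str]   # provenance of the nanopublication itself


@dataclass
class FairDigitalObject:
    """The three FDO parts named in the abstract (hypothetical model)."""
    identifier: str            # globally unique, persistent identifier
    metadata: Dict[str, Any]   # metadata describing the resource
    resource: Any              # the object itself, of arbitrary type and size


def as_fdo(npub: Nanopublication) -> FairDigitalObject:
    """Assumed mapping: the assertion is the resource, both provenance parts are metadata."""
    return FairDigitalObject(
        identifier=npub.identifier,
        metadata={
            "provenance": npub.provenance,
            "publication_info": npub.publication_info,
        },
        resource=npub.assertion,
    )


# Placeholder values only; none of these identifiers or terms are real.
example = Nanopublication(
    identifier="https://example.org/np/placeholder",
    assertion=(("ex:geneA", "ex:isExpressedIn", "ex:tissueB"),),
    provenance={"ex:wasDerivedFrom": "ex:someDataset"},
    publication_info={"ex:creator": "ex:someAuthor"},
)
fdo_view = as_fdo(example)
```

Under this reading, the difference in scope noted in the abstract shows up in the types: the nanopublication's assertion is constrained to a small set of triples, while the FDO's resource is left unconstrained.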

Abstract Phosphatase and tensin homologue (PTEN) is a tumor suppressor gene that is frequently inactivated by deletion in prostate cancer (PCa). Occurring in around 20% of primary PCa tumors, and up to 50% in castration-resistant tumors, it is the most frequent genomic aberration in PCa. Loss of PTEN activates the phosphoinositide 3-kinase-RAC-alpha serine/threonine-protein kinase (PI3K-AKT) pathway, which is associated with poor clinical outcomes. Despite the consequences of PTEN loss being well studied, most of what is known is restricted to protein-coding genes, with relatively little information about the role of non-coding genes. Using our recently created resource, the FC-R2 expression atlas, which encompasses expression levels for thousands of lncRNAs recently unveiled by the FANTOM consortium, we analyzed differential gene expression of PTEN-null vs PTEN-intact tumors with the goal of characterizing the molecular landscape of PTEN loss. First, we generated a consensus signature using two large PCa cohorts with PTEN status experimentally validated by immunohistochemistry (IHC), applying a meta-analysis approach. This signature encompassed mainly protein-coding genes due to it being microarray-based. In order to expand this signature beyond the coding genes, we relied on FC-R2-based TCGA-PRAD data. Since PTEN status was not available by IHC, we opted to call the status based on CNV. Then, we proceeded to generate a PTEN-null signature using a generalized linear model approach. Both signatures were compared for concordance using correspondence-at-the-top plots and hypergeometric confidence intervals. Gene set enrichment analysis was performed on both signatures using a collection of gene sets obtained from the MSigDB database in order to characterize pathways involved in this event. Our results showed that the signature based on IHC-validated samples agreed significantly with the CNV-based signature from TCGA for the genes in common.
In the differential gene expression analysis on the TCGA cohort, we observed 203 significant coding genes and 171 significant non-coding genes (FDR ≤ 0.01, LogFC ≥ 1). Notably, we identified several lncRNAs that have not been associated with PCa or PTEN loss; these include many classes of non-coding RNAs characterized by the FANTOM consortium, such as enhancer and promoter genes. Gene set enrichment analysis revealed that PTEN-null tumors are associated with epithelial-mesenchymal transition, suggesting a possible role for these lncRNAs. In conclusion, by leveraging our resources, we were able to obtain a comprehensive landscape of PTEN loss in PCa for both the coding and non-coding counterparts. Furthermore, the association of many lncRNAs with PTEN loss was observed, many recently annotated by the FANTOM consortium, which can help us understand how genes are regulated in this event. In this work, we show that, although PTEN loss has been widely studied, it still has many components in the form of lncRNAs, which highlight potential markers for PTEN loss and clinical outcomes.

Citation Format: Eddie Luidy Imada, Diego Fernando Sanchez, Wikum Dinalankara, Tamara Lotan, Luigi Marchionni. Screening PTEN-loss associated lncRNAs in prostate cancer [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 2535.

Articles published on FANTOM Consortium

Implications of differential transcription start site selection on chronic myeloid leukemia and prostate cancer cell protein expression.

The Comparative Anatomy of Nanopublications and FAIR Digital Objects

A Human atlas of smooth muscle cell gene expression; insights from the FANTOM consortium CAGE dataset

Pleiotropic Enhancers are Ubiquitous Regulatory Elements in the Human Genome.

Identification and Functional Characterization of Two Noncoding RNAs Transcribed from Putative Active Enhancers in Hepatocellular Carcinoma.

Transcriptional landscape of PTEN loss in primary prostate cancer

Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network

Modeling the Evolutionary Architectures of Transcribed Human Enhancer Sequences Reveals Distinct Origins, Functions, and Associations with Human Trait Variation.

Abstract 2535: Screening PTEN-loss associated lncRNAs in prostate cancer

Abstract 4491: Characterizing long non-coding RNA expression of tumor-infiltrating lymphocytes across solid cancers

MountainClimber Identifies Alternative Transcription Start and Polyadenylation Sites in RNA-Seq.

Identification of novel cerebellar developmental transcriptional regulators with motif activity analysis

Signatures of Recent Positive Selection in Enhancers Across 41 Human Tissues.

Causal Transcription Regulatory Network Inference Using Enhancer Activity as a Causal Anchor.

TELS: A Novel Computational Framework for Identifying Motif Signatures of Transcribed Enhancers

Abstract 2297: Differential analysis of gene expression across the human genome using recount2 and FANTOM-CAT

Human Enhancers Harboring Specific Sequence Composition, Activity, and Genome Organization Are Linked to the Immune Response.

Alternative start and termination sites of transcription drive most transcript isoform differences across human tissues.

Highlights of This Issue

FANTOM5 CAGE profiles of human and mouse reprocessed for GRCh38 and GRCm38 genome assemblies
