BioHackathon series in 2013 and 2014: improvements of semantic interoperability in life science data and services

Toshiaki Katayama, Tatsuya Kushida, Toshio Ohta, Soichi Ogishima, Hiroshi Mori, K Bretonnel Cohen, Issaku Yamada, Masaaki Kotera, Takeshi Kawashima, Shujiro Okuda, Leyla García, Jesualdo Tomás Fernández‐Breis, Masaaki Matsubara, Shinya Suzuki, Jerven Bolleman, Mark Thompson, Shoko Kawamoto, Toshiaki Tokimatsu, Fumitoshi Kato, Michel Dumontier, Yuki Moriya, Kazuharu Arakawa, Alex Kalderimis, Yue Wang, Gos Micklem, Daniel Jamieson, Kiyoko F Aoki‐Kinoshita, Karin Verspoor, Ikuo Uchiyama, Pascale Gaudet, Jin-Dong Kim, Kotone Itaya, Raoul J P Bonnal, Nozomi Yamamoto, Francesco Strozzi, Atsuko Yamaguchi, Takatomo Fujisawa, Mark D Wilkinson, Masayuki Yarimizu, Shinobu Okamoto, Robert Hoehndorf, James Malone, Hongyan Wu, Hideya Kawaji, Andrea Splendiani, Hiroyo Nishide, Daisuke Shinmachi, Joachim Baran, Akira R Kinjo, Erick Antezana, Maori Ito, Shin Kawano, Yasset Perez‐Riverol, Masaki Banno, Katsuhiko Murakami, Robert Buels, Hidemasa Bono, Toyofumi Fujiwara, Yusuke Komiyama, Sarala Wimalaratne, Nick Juty, Simon Kocbek, Yasunori Yamamoto, Hirokazu Chiba, Emi Hattori, Junichi Takehara, Takeru Nakazato, Seiichiro Kawashima, Matthew P Campbell, Peter J A Cock, Sayaka Mizutani, Yosuke Nishimura, Ono Hiromasa, Simon Jupp, Satoshi Mizuno, Toshihisa Takagi

doi:10.12688/f1000research.18238.1

Abstract

Publishing databases in the Resource Description Framework (RDF) model is becoming widely accepted as a way to maximize the syntactic and semantic interoperability of open data in the life sciences. Here we report advancements made in the 6th and 7th annual BioHackathons, which were held in Tokyo and Miyagi, respectively. This review consists of two major sections covering: 1) improvement and utilization of RDF data in various domains of the life sciences and 2) metadata about these RDF data, the resources that store them, and the service quality of SPARQL Protocol and RDF Query Language (SPARQL) endpoints. The first section describes how we developed RDF data, ontologies and tools in genomics, proteomics, metabolomics, glycomics and literature text mining. The second section describes how we defined descriptions of datasets, the provenance of data, quality assessment of services, and service discovery. By enhancing the harmonization of these two layers of machine-readable data and knowledge, we improve the way community-wide resources are developed and published. Moreover, we outline best practices for the future, and prepare ourselves for an exciting and unpredictable variety of real-world applications in the coming years.

Highlights

Big data in the life sciences, especially from ‘omics’ technologies, is challenging researchers with scalability concerns in terms of computational and storage needs, while at the same time there is a stronger drive towards the promotion of open data, including the sharing of analyses and their outputs.
During the 6th and 7th NBDC/DBCLS BioHackathons in 2013 and 2014, which were hosted by the National Bioscience Database Center (NBDC) and the Database Center for Life Science (DBCLS) in Japan, we focused on the improvement of Resource Description Framework (RDF) data for practical use in biomedical applications by developing guidelines, ontologies and tools, especially for the genome, proteome, interactome and chemical domains.
The utilization of Semantic Web technologies as a means for database integration was introduced in BioHackathon 2010 [3].

Summary

Keywords: BioHackathon, Bioinformatics, Semantic Web, Web services, Ontology, Databases, Semantic interoperability, Data models, Data sharing, Data integration.

This article is included in the Hackathons collection.

INTRODUCTION