Abstract

The Web has many forums for sharing personal data, but not for scientific data, and not in a way that allows the data to be accessed by "machines as users." A Web of data could add tremendous value by integrating disparate disciplines or supporting data-driven queries. Doing this is very complex and requires more robust standards than currently exist. The intended user for most data is not a person; it is a software application that can manipulate the data into something useful for humans. Such software could be "search engines, analytic software, visualization tools, database back ends, and more." This need creates a much different requirement for standards than those developed for displaying web data to people. Data software needs a much greater understanding of context, and that context has to be supplied alongside the data, either through direct integration with the data or by linking to a description of it in a persistent and accessible location. Data interoperability must be addressed at the beginning of system development, because it is significantly harder and costlier to make these connections after both systems have separately implemented non-standardized data collections. Data interoperability must address three levels: legal (intellectual property rights), technical (computer languages and formats), and semantic (the meaning of the data). The technical level is the furthest along, thanks to Semantic Web technologies. Getting scientists to agree on the semantic level could be nearly impossible. The legal level offers the greatest opportunity: putting the data into the public domain. There are already precedents for this with genome data and the International Virtual Observatory. Putting data into the public domain also simplifies implementation of the technical level. Libraries and publishers in the scholarly publishing community should lead the web-of-data initiative, as they can ensure the connection, curation, and preservation needed. The NSF's data-sharing mandate could create funding opportunities to build the web of data, but all involved must agree not to replicate the copyright-controlled model that currently governs books and journals.
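The "context supplied alongside the data" point is concrete enough to sketch. Below is a minimal, hypothetical example (not from the article) using Python's rdflib library: a dataset description published as linked data, where the legal status and the semantic vocabulary are attached as resolvable links rather than left implicit. All example.org identifiers and the schema URL are invented for illustration.

```python
# A minimal sketch of a machine-readable dataset description,
# touching the three interoperability levels named in the abstract.
# All example.org URIs are hypothetical.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")

g = Graph()
g.bind("dcterms", DCTERMS)
g.bind("dcat", DCAT)

# Hypothetical persistent identifier for the dataset.
ds = URIRef("https://example.org/datasets/sea-surface-temps")

g.add((ds, RDF.type, DCAT.Dataset))
g.add((ds, DCTERMS.title, Literal("Sea surface temperature observations")))
# Legal level: an explicit public-domain dedication (CC0).
g.add((ds, DCTERMS.license,
       URIRef("https://creativecommons.org/publicdomain/zero/1.0/")))
# Semantic level: a link to the vocabulary defining the measurements.
g.add((ds, DCTERMS.conformsTo,
       URIRef("https://example.org/schemas/sea-surface-temp")))

# Technical level: serialize to a standard format (rdflib 6+ returns str).
print(g.serialize(format="turtle"))
```

A software agent that fetches the dataset can then discover its license and schema by following the links, rather than requiring a human to interpret documentation.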

Highlights

  • Robert Metcalfe, co-inventor of Ethernet and founder of 3Com, is often credited with the observation that the value of a telecommunications network is proportional to the square of the number of connected users of the system

  • Our personal data is mined by Facebook, by Twitter, and by Google to serve us relevant advertisements, which underpin many of the “free” services we access via smartphones and browsers

  • We don’t have the users yet to reach the point on the Metcalfe curve where value generation breaks above the cost line, and the reality is that, given the small number of scholars and scientists, if we depend on more people being trained, we may never get there



Introduction

Robert Metcalfe, co-inventor of Ethernet and founder of 3Com, is often credited with the observation that the value of a telecommunications network is proportional to the square of the number of connected users of the system. Data (for our purposes, a catch-all term covering databases and datasets, and generally meaning information gathered in the sciences through experimental work or environmental observation) requires a much more robust and complete set of standards to achieve the same "web" capabilities we take for granted in commerce and culture.
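Stated as a formula (a standard reading of Metcalfe's law, not taken from the article): if value grows with the square of the user count n while cost grows roughly linearly, there is a crossover below which the network runs at a loss, which is the "break above the cost line" moment referenced in the highlights. The proportionality constants a and c are hypothetical.

```latex
% Metcalfe's law, with hypothetical proportionality constants a and c:
\[
  V(n) = a\,n^{2}, \qquad C(n) = c\,n
\]
% Value overtakes cost (the break-even point) once
\[
  a\,n^{2} > c\,n \quad\Longrightarrow\quad n > \frac{c}{a}
\]
```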

