Abstract

Searching for scientific datasets is a prominent task in scholars' daily research practice. A variety of data publishers, archives and data portals offer search applications that allow the discovery of datasets. The evaluation of such dataset retrieval systems requires proper test collections, which comprise questions that reflect real-world information needs of scholars, a corpus of datasets, and human judgements assessing the relevance of the datasets in the corpus to the questions. Unfortunately, only very few test collections exist for dataset search. In this paper, we introduce the BEF-China test collection, the very first test collection for dataset retrieval in biodiversity research, a research field with an increasing demand for data discovery services. The test collection consists of 14 questions, a corpus of 372 datasets from the BEF-China project and binary relevance judgements provided by a biodiversity expert.

Highlights

  • Dataset search and data reuse are becoming more important in scholars' research practice

  • Evaluations with test collections are required to determine whether a dataset retrieval system supports its users well in identifying relevant datasets

  • Driven by TREC, the highly influential annual information retrieval challenge, a multitude of test collections are available for the retrieval of publications and websites in different application domains

Summary

A Test Collection for Dataset Retrieval in Biodiversity Research

Felicitas Löffler‡, Andreas Schuldt§, Birgitta König-Ries‡,|,¶, Helge Bruelheide#,¶, Friederike Klan|,¤.
