Abstract

For many years, HCI research has been known to suffer from a replication crisis, due to the lack of openly available datasets and accompanying code. Recent research has identified several barriers that prevent the wider sharing of primary research materials in HCI, but such material does, in fact, exist. With an interest in mobile text entry research, and with access to participants largely blocked by the COVID-19 pandemic, the authors found the exploration of a recently published open gaze and touch dataset an appealing prospect. Through a detailed account of working with this dataset, this paper demonstrates the numerous problems, and the extent of the effort required, in understanding, sanitising and utilising open data in order to produce meaningful outcomes from it. Despite these issues, the paper demonstrates the value of open data as a means to produce novel contributions without the need for additional new data (in this case, an unsupervised learning pipeline for the robust detection of gaze clusters in vertically distinct areas of interest). By framing the experience of this case study under a dataset lifecycle model intended for ML open data, a set of useful guidelines is derived for researchers wishing to exploit open data. A set of recommendations is also proposed on how conferences and journals should handle papers accompanied by data in the future. Finally, the paper proposes a set of actions for the mobile text entry community, in order to facilitate data sharing among its members.
