UK Data Archive Research Articles

Background: Data discovery, particularly the discovery of key variables and their inter-relationships, is key to secondary data analysis and, in turn, to the evolving field of data science. Interface designers have presumed that their users are domain experts, and so they have provided complex interfaces to support these “experts.” Such interfaces hark back to a time when searches needed to be accurate the first time because there was a high computational cost associated with each search. Our work is part of a governmental research initiative between the medical and social research funding bodies to improve the use of social data in medical research.

Objective: The cross-disciplinary nature of data science means that no assumptions can be made regarding the domain expertise of a particular scientist, whose interests may intersect multiple domains. Here we consider the common requirement for scientists to seek archived data for secondary analysis. This has more in common with the search needs of the “Google generation” than with their single-domain, single-tool forebears. Our study compares a Google-like interface with traditional ways of searching for noncomplex health data in a data archive.

Methods: Two user interfaces are evaluated for the same set of tasks in extracting data from surveys stored in the UK Data Archive (UKDA). One interface, Web search, is “Google-like,” enabling users to browse, search for, and view metadata about study variables, whereas the other, traditional search, has a standard multioption user interface.

Results: Using a comprehensive set of tasks with 20 volunteers, we found that the Web search interface met data discovery needs and expectations better than the traditional search. A task × interface repeated measures analysis showed a main effect indicating that answers found through the Web search interface were more likely to be correct (F(1,19)=37.3, P<.001), with a main effect of task (F(3,57)=6.3, P<.001). Further, participants completed the task significantly faster using the Web search interface (F(1,19)=18.0, P<.001). There was also a main effect of task (F(2,38)=4.1, P=.025, Greenhouse-Geisser correction applied). Overall, participants were asked to rate learnability, ease of use, and satisfaction. Paired mean comparisons showed that the Web search interface received significantly higher ratings than the traditional search interface for learnability (P=.002, 95% CI [0.6-2.4]), ease of use (P<.001, 95% CI [1.2-3.2]), and satisfaction (P<.001, 95% CI [1.8-3.5]). The results show superior cross-domain usability of Web search, which is consistent with its general familiarity and with enabling queries to be refined as the search proceeds, which treats serendipity as part of the refinement.

Conclusions: The results provide clear evidence that data science should adopt single-field natural language search interfaces for variable search, supporting in particular: query reformulation; data browsing; faceted search; surrogates; relevance feedback; summarization, analytics, and visual presentation.

The quality of life (QoL) of children with sickle cell anaemia (SCA) in the United Kingdom has not been examined, and a discrepancy measure based on Gap theory has rarely been used. This study investigated whether (1) child self-reports of QoL using a discrepancy measure (the Generic Children's QoL Measure; GCQ) are lower than those from healthy children, (2) proxy reports from parents and health care professionals are lower than child self-reports, and (3) demographic and disease severity indicators are related to QoL. An interdependent groups, cross-sectional design was implemented. Seventy-four children with SCA, their parent, and members of their health care team completed the GCQ. Demographic and disease severity indicators were recorded. GCQ data from healthy children were obtained from the UK Data Archive. Contrary to past research, when examining generic discrepancy QoL, children with SCA did not report a lower QoL than healthy children, and parent- and health care professional-proxy reports were not lower than child self-reports. Few of the demographic and disease severity indicators were related to QoL. Proxy reports may be used to gain a more complete picture of QoL, but should not be a substitute for self-reports. The explanation for the relatively high levels of QoL reported is not clear, but children with SCA may have realistic expectations about their ideal-self, place greater emphasis on aspects other than health in shaping their QoL, and define achievements within the limits of their illness. Future research should focus on psychological factors in explaining QoL.

Statement of contribution

What is already known on this subject? Children with sickle cell disease (SCD) generally have a reduced QoL compared with healthy children, but there appears to be no research measuring QoL in paediatric SCD in the United Kingdom. Proxy QoL reports from parents are often lower than child self-reports, but there is less research examining proxy reports from health care professionals. Previous research has measured paediatric QoL using measures of current health-related QoL, but this is not in line with the WHO's definition of QoL as the discrepancy between current state and expectations.

What does this study add? Children with sickle cell anaemia do not have an impaired discrepancy QoL; they may have realistic expectations about their ideal-self and define achievements within the limits of their illness. Health care professionals are able to gauge a SCA child's discrepancy QoL better than parents. The GCQ (a generic discrepancy measure of QoL) takes into account expectations about ideal QoL and does not emphasize health; it may be of use to psychologists working with SCA children.

Articles published on UK Data Archive

Sexual identity data collection and access in UK population-based studies

The financial maintenance of social science data archives: Four case studies of long‐term infrastructure work

Intimate Physical Contact between People from Different Households During the COVID-19 Pandemic: A Mixed-Methods Study from a Large, Quasi-Representative Survey (Natsal-Covid)

Knowing what to do and how to do it: High transparency and careful curation of data and metadata

The Emerging Adults Gambling Survey: study protocol

The Emerging Adults Gambling Survey: study protocol.

The Life Cycle of Structural Biology Data

Research data management in academic institutions: A scoping review.

Depression in young adults with chronic somatic illness – An analysis of 1970 British Cohort Study

Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search.

Europejskie archiwa danych jakościowych (European qualitative data archives)

A88 Young driver crash rates in Great Britain: trends and comparisons between countries

A89 Girls crash too: trends and comparisons between male and female young driver crash rates in Great Britain

What do we know about fruit and vegetable consumption in the UK? Trends from the National Diet and Nutrition Survey Rolling Programme (NDNS RP)

Assessing the quality of life of children with sickle cell anaemia using self-, parent-proxy, and health care professional-proxy reports.

UK National Diet and Nutrition (NDNS) Survey: ad‐hoc cross‐sectional survey to sustainable Rolling Programme (RP) for surveillance (Y6‐9 2013‐17) (LB467)

Segmenting the betting market in England

PP41 Alcohol-Related ‘risk Factors’ Predict Under-Reporting of Alcohol Consumption more than Socio-Demographic Factors: Evidence from a Mixed-Methods Study

OP01 Did Health Inequality Increase in English Children and Young People between 1999 and 2009? Evidence from two Cross-Sectional Surveys and Inpatient Activity Data

A critical appraisal of ethical sedative use for dying patients: comparative ethnographic study of three palliative care units
