Abstract
Background
Population Data BC (PopData) was established as a multi-university data and education resource to support training and education, data linkage, and access to individual-level, de-identified data for research in a wide variety of areas, including human and community development and well-being.
Approach
A combination of deterministic and probabilistic linkage is conducted based on the quality and availability of identifiers for data linkage. PopData utilizes a harmonized data request and approval process for data stewards and researchers to increase efficiency and ease of access to linked data. Researchers access linked data through a secure research environment (SRE) that is equipped with a wide variety of tools for analysis. The SRE also allows for ongoing management and control of data. PopData continues to expand its data holdings and to evolve its services as well as its governance and data access processes.
Discussion
PopData has provided efficient and cost-effective access to linked data sets for research. After two decades of learning, future planned developments for the organization include, but are not limited to, policies to facilitate programs of research, access to reusable datasets, and the evaluation and use of new data linkage techniques such as privacy-preserving record linkage (PPRL).
Conclusion
PopData continues to maintain and grow the number and type of data holdings available for research. Its existing models support a number of large-scale research projects and demonstrate the benefits of having a third-party data linkage and provisioning center for research purposes. Building further connections with existing data holders and governing bodies will be important to ensure ongoing access to data and to ensure that policy changes facilitate access for researchers.
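As a rough illustration of the linkage strategy described under Approach, the following Python sketch tries a deterministic match on a unique identifier first and falls back to a probabilistic score when that identifier is missing. It is not PopData's actual procedure: the field names (`health_id`, `surname`, `birth_date`, `postal_code`), the m/u probabilities, and the threshold are all hypothetical, and the scoring follows the general Fellegi-Sunter style of summing log agreement weights.

```python
from math import log2

# Hypothetical m/u probabilities per field (illustrative values only):
# m = P(field agrees | same person), u = P(field agrees | different people).
WEIGHTS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "postal_code": (0.80, 0.02),
}


def deterministic_match(a, b):
    """Exact agreement on a unique identifier, when both records carry one."""
    return a.get("health_id") is not None and a.get("health_id") == b.get("health_id")


def match_score(a, b):
    """Sum of log2 agreement or disagreement weights over the compared fields."""
    score = 0.0
    for field, (m, u) in WEIGHTS.items():
        if a.get(field) is None or b.get(field) is None:
            continue  # a missing value contributes no evidence either way
        if a[field] == b[field]:
            score += log2(m / u)
        else:
            score += log2((1 - m) / (1 - u))
    return score


def link(a, b, threshold=8.0):
    """Classify a record pair; the threshold is an assumed cut-off."""
    if deterministic_match(a, b):
        return "deterministic"
    if match_score(a, b) >= threshold:
        return "probabilistic"
    return "no link"


if __name__ == "__main__":
    rec1 = {"surname": "lee", "birth_date": "1980-01-01", "postal_code": "V6T"}
    rec2 = {"surname": "lee", "birth_date": "1980-01-01", "postal_code": "V5K"}
    print(link(rec1, rec2))
```

In practice the choice between the two paths would depend on which identifiers each data set carries and how reliable they are, which is the point the Approach paragraph makes.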
Highlights
Population Data BC (PopData) was established as a multi-university data and education resource to support training and education, data linkage, and access to individual-level, de-identified data for research in a wide variety of areas, including human and community development and well-being.
A combination of deterministic and probabilistic linkage is conducted based on the quality and availability of identifiers for data linkage.
Its existing models support a number of large-scale research projects and demonstrate the benefits of having a third-party data linkage and provisioning center for research purposes.
This study examined data from over 80,000 mothers and children, and over 500,000 adults.
Summary
Providing efficient and cost-effective access to linked administrative data sets for research was a major driving force in establishing PopData and its predecessor. Based on two decades of learning, several areas have been identified where further development and research are needed. These include, but are not limited to: a) evolving governance and policy models to facilitate programs of research, low-risk accessible and/or cleaned data sets with an expedited review, and cloud computing for computationally demanding analyses; b) training, development, and implementation of data management and analysis of large data files through parallel processing techniques for researchers; c) evaluating the use of new data linkage techniques such as privacy-preserving record linkage (PPRL); d) ongoing updates and maintenance of metadata to ensure that research outcomes accurately reflect the state of the data; e) further training and education opportunities; and f) engaging with the public around appropriate rules for access to and use of complex linked data.
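To give a sense of what privacy-preserving record linkage can look like, here is a minimal Python sketch of one commonly discussed approach: encoding identifier bigrams into a keyed Bloom filter and comparing encodings with the Dice coefficient. The summary does not say which PPRL technique would be evaluated, so the method, the function names, the shared `secret`, and the parameters below are all assumptions made for illustration.

```python
import hashlib


def bigrams(value):
    """Split a normalised, padded string into its set of character bigrams."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def encode(value, secret, size=1024, num_hashes=4):
    """Return the set of bit positions set in a Bloom filter of `size` bits.

    `secret` is a hypothetical key shared by the data holders, so that a
    party without it cannot easily rebuild the mapping from names to bits.
    """
    bits = set()
    for gram in bigrams(value):
        for k in range(num_hashes):
            digest = hashlib.sha256(f"{secret}|{k}|{gram}".encode("utf-8")).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a, b):
    """Dice coefficient of two bit-position sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    key = "shared-secret"  # illustrative only
    print(dice(encode("Smith", key), encode("Smyth", key)))
    print(dice(encode("Smith", key), encode("Jones", key)))
```

Similar spellings share most bigrams and therefore most bit positions, so they score higher than unrelated names even though the linking party only sees the encodings.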