Abstract

Policy-makers want to make use of state education longitudinal data systems, yet laws and policies regulating data disclosure limit access to such data, and security concerns and risks remain high. Well-developed synthetic datasets, which statistically mimic the relations among the variables in the data from which they were derived but contain no records that represent actual persons, offer a viable way to address these legal restrictions and security risks. We present a case study in the development of a synthetic data system and highlight potential applications of synthetic data. We begin with an overview of synthetic data: what it is, how it has been used thus far, and the potential benefits and concerns in its application to education data systems. We then describe our federally funded project and propose the steps required to synthesize a statewide longitudinal data system covering high school, postsecondary, and workforce data. Finally, as a template for other agencies considering synthetic data, we review the challenges we have confronted in developing our synthetic data system for research and policy evaluation purposes.
