Abstract
Objective and Approach
Data linkage centres receive, hold and provision data in a wide variety of formats. Using a consistent, flexible and open metadata standard can enable a centre to describe its data holdings clearly and unambiguously, and can facilitate processing pipelines. One such framework is the Frictionless data standard. We present our experience at our data centre implementing this standard both for internal data management and for communication with external data providers and third parties.

Results
Our team has replaced tool-specific metadata files with standard-compliant files to make our pipelines more modular and less fragile. Since the standard is based on JSON, most programming languages support reading and writing the metadata files, and we are not tied to any particular software. This work has also made our internal tooling interoperable with that of external partners. We find that the standard is well equipped to handle CSV and other delimited files. Our centre uses a mix of file types, including fixed-width text files, which do not have an explicit specification within the standard, but its extensibility has allowed us to define our own file type while staying within the ecosystem. We have also developed tools to convert JSON metadata files to Excel workbooks for a more human-friendly format.

Conclusions
The Frictionless data standard can be a powerful tool for data centres in organizing data and building processing pipelines.

Implications
Adopting the Frictionless data standard for metadata files has streamlined our internal processes and improved internal and external communication.
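To illustrate the kind of extension described in the Results, the following is a minimal sketch of a JSON descriptor for a fixed-width file and a reader driven by it. It is not our production tooling. The `name`, `path`, `format`, `schema` and `fields` keys and the `string`/`integer` field types follow the Frictionless Data Resource and Table Schema conventions; the `"fixed-width"` format value and the per-field `start` and `width` properties are hypothetical custom properties chosen for this example, as are the file name and field names.

```python
import json

# Illustrative descriptor. "format": "fixed-width" and the per-field
# "start"/"width" keys are assumed custom properties, not part of the standard.
descriptor = {
    "name": "example-extract",
    "path": "example_extract.txt",  # hypothetical file name
    "format": "fixed-width",
    "schema": {
        "fields": [
            {"name": "person_id", "type": "string", "start": 0, "width": 6},
            {"name": "birth_year", "type": "integer", "start": 6, "width": 4},
            {"name": "region", "type": "string", "start": 10, "width": 3},
        ]
    },
}

# Map a small subset of Table Schema type names to Python casts.
CASTERS = {"string": str, "integer": int, "number": float}


def parse_line(line, schema):
    """Slice one fixed-width line into a dict using the field offsets."""
    row = {}
    for field in schema["fields"]:
        start = field["start"]
        raw = line[start:start + field["width"]].strip()
        row[field["name"]] = CASTERS[field["type"]](raw)
    return row


def roundtrip(desc):
    """Serialise the descriptor to JSON text and read it back."""
    return json.loads(json.dumps(desc))


# One sample line: 6 chars of id, 4 of year, 3 of region (space padded).
record = parse_line("A000121984NE ", descriptor["schema"])
```

Because the descriptor is plain JSON, the same file can be read by a pipeline written in any language, and the column layout lives in the metadata rather than in tool-specific code.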