Abstract

Science has entered the era of Big Data, which brings new challenges in data governance, stewardship, and management. Existing data governance practices must catch up to ensure proper data management. Current governance policies and stewardship best practices tend to be disconnected from operational data management and enforcement, and exist mainly in well-meaning documents or reports. These policies are, at best, partially implemented and rarely monitored or audited. In addition, existing governance policies keep adding data management steps that require a human, a ‘data steward’, in the loop, and the cost of data management can no longer scale proportionately with current and future increases in data volume and complexity. The purpose of developing an updated data governance framework is to modernize scientific data governance for the reality of Big Data and to align it with current technology trends such as cloud computing and AI. The goals of this framework are twofold. The first is thoroughness: the governance should adequately cover the entire data life cycle. The second is practicality: the framework should offer a consistent and repeatable process for different projects. Three core principles ground this framework. First, focus on just enough governance so that data governance does not become a roadblock to the scientific process; remove any unnecessary processes and steps. Second, automate data management steps where possible; actively remove steps that require a ‘human in the loop’ so that the management process is efficient and scales with increasing data. Third, continually optimize all processes using quantified metrics to streamline the monitoring and auditing workflows.

