Abstract

AbstractIn recent times, data science has seen a rapid increase in the need for individuals and teams to analyze and manipulate data at scale for various scientific and commercial purposes. Groups often collaboratively analyze datasets, thereby leading to a proliferation of dataset versions at each stage of iterative exploration and analysis. Thus, an efficient collaborative system compatible with handling various versions is needed rather than the current most often used ad-hoc versioning mechanism. In a collaborative database, all the collaborators working together on a project need to interact together to perform extensive curation activities. In a typical scenario, when an update is made by one of the collaborators, it should become visible to the whole team for possible comments and modifications, which in turn aid the data custodian in making a better decision. Relational databases provide efficient data management and querying. However, it lacks various features to support efficient collaboration. In these databases, the approval and authorization of updates are based completely on the identity of the user, e.g., via SQL GRANT and REVOKE commands. In this paper, we present a framework well suited for collaboration and implemented on top of relational databases that will enable the team to manage as well as query the dataset versions efficiently.KeywordsCollaborative databasesVersioningData cellDatabase management

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call