Concurrency versus consistency in NoSQL databases

Sonal Kanungo, Rustom D Morena

doi:10.32629/jai.v7i3.936

Abstract

<p>With the advent of cloud services, the proliferation of data has reached unprecedented levels. Load distribution across multiple servers, driven by web and mobile applications, has become a defining characteristic of contemporary data management. Traditional relational databases, with their inherent focus on structured data models, have proven inadequate for handling vast amounts of unstructured data. Additionally, relational databases lack native support for clustering, which is vital for efficient unstructured data management, leaving them ill-equipped for customized clustering techniques and optimal query execution. SQL (Structured Query Language) databases had earlier emerged as a groundbreaking solution, introducing the relational model, which organizes data into structured tables. They rely on ACID (atomicity, consistency, isolation, durability) properties to maintain data integrity and support intricate querying through SQL. However, as applications grew in complexity, SQL databases encountered hurdles in handling varied data types, rapid data growth, and concurrent workloads. These limitations propelled the rise of NoSQL (Not Only Structured Query Language) databases, which prioritize adaptability, scalability, and performance. NoSQL databases embrace diverse data models, including document, key-value, column-family, and graph models, enabling effective management of structured, semi-structured, and unstructured data. 
The transition to NoSQL databases was justified by several factors: horizontal scaling across nodes to handle extensive read-write operations effectively; agile development that accommodates changing data structures without schema constraints; optimization for specific tasks, providing low-latency access and high throughput; dynamic schemas that align with modern iterative development and promote adaptability; and adept management of diverse data types, spanning text, geospatial, time-series, and multimedia data. These databases are purposefully designed to accommodate the escalating demands of data storage. Notably, this data originates from diverse nodes and is subject to concurrent access by numerous users. A critical challenge arises because the data on one node may diverge from its copy on a replica node. In this context, executing database operations simultaneously while preserving the integrity of the data becomes a pivotal concern. Maintaining data consistency under concurrent access hinges on synchronizing operations across all replica nodes, and achieving this synchronization requires a robust concurrency control technique. Concurrency control is the linchpin for upholding accuracy and reliability in a system where operations unfold concurrently. Hence, the focal point of this investigation is the range of concurrency control methodologies employed by NoSQL systems. The objective is to dissect the interplay between concurrency and consistency, shedding light on the strategies these systems employ to balance the two. In summary, as data management enters an era of exponential growth catalyzed by cloud services, the dynamics of load distribution and unstructured data have necessitated a departure from traditional relational databases. 
NoSQL databases have risen to the fore, demonstrating the ability to grapple with these challenges. However, the need for concurrent data access without compromising data consistency motivates the exploration of various concurrency control methods. The aim of this study is to examine several of the concurrency control approaches employed by NoSQL systems, highlighting how they prioritize concurrency and consistency.</p>
