"The volume of processed information grows substantially every year and can already reach hundreds of terabytes or several petabytes. Data sets of this size are common in data analysis, modeling, testing, artificial intelligence, and related fields. As a result, maintaining and improving the performance of data processing systems has become a pressing problem. To address it, many options for the internal organization of databases and DBMSs have been considered. The main disadvantage of row-oriented relational databases when processing large data sets is the inefficient use of file system resources and RAM. One way to process large amounts of information more efficiently is the columnar data organization model. In this model, the data of each column is stored in its own file, and each file in turn holds its data as key-value pairs. This organization reduces the amount of information read from the database and makes compression possible, both of which improve system performance. This paper experimentally studies the features of columnar database organization, considers how it differs from traditional row organization, and analyzes the main advantages and disadvantages of both options together with the architectural features that accelerate data processing. A comparative analysis of processing speed was carried out for the different database organizations, using the row-oriented MySQL database and the column-oriented ClickHouse database as examples, for queries of various types and complexity. Based on the experimental results, a system architecture that integrates row and column databases is proposed to achieve versatility and optimal performance in transactional systems such as OLTP, taking into account the growth in the volume of processed information.
The advantages of the proposed combined database management system, which uses different types of data organization, are a degree of versatility and increased performance in transactional systems. Its possible disadvantages are its volume of data, the complexity of its organization, and difficulties in ensuring reliability. Reliability is a promising area for further research. In theory, the proposed combined system could serve as the basis for a separate type of database management system. This requires developing an external control layer that coordinates the operation of the two databases of different types, then designing a common interface and connecting both databases in a modular way so that various combinations can be tested. This approach is feasible, since some database management systems, such as ClickHouse, provide several interfaces for interacting with other systems, such as MySQL and PostgreSQL."
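The ideas summarized in the abstract, per-column key-value storage and an external control layer over a row store and a column store, can be illustrated with a minimal in-memory sketch. It is not the architecture from the paper: the classes `RowStore`, `ColumnStore`, and `HybridStore` and all their methods are hypothetical names introduced here, and plain Python dictionaries stand in for MySQL and ClickHouse. The sketch assumes every write goes to both stores, point lookups are served from the row store, and aggregates are served from the column store.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class RowStore:
    """Row organization (hypothetical stand-in): each record is kept whole."""
    rows: Dict[int, Dict[str, Any]] = field(default_factory=dict)

    def insert(self, key: int, record: Dict[str, Any]) -> None:
        self.rows[key] = dict(record)

    def get(self, key: int) -> Dict[str, Any]:
        # A point lookup returns the full record with a single access.
        return self.rows[key]


@dataclass
class ColumnStore:
    """Column organization (hypothetical stand-in): one key-value map per column."""
    columns: Dict[str, Dict[int, Any]] = field(default_factory=dict)

    def insert(self, key: int, record: Dict[str, Any]) -> None:
        # Each field of the record lands in the map of its own column.
        for name, value in record.items():
            self.columns.setdefault(name, {})[key] = value

    def column_sum(self, name: str):
        # Only the requested column is read; the other columns are not touched.
        return sum(self.columns[name].values())


class HybridStore:
    """Assumed control layer: one interface in front of both organizations."""

    def __init__(self) -> None:
        self.row = RowStore()
        self.col = ColumnStore()

    def insert(self, key: int, record: Dict[str, Any]) -> None:
        # Assumption: writes are duplicated so that both stores stay in sync.
        self.row.insert(key, record)
        self.col.insert(key, record)

    def lookup(self, key: int) -> Dict[str, Any]:
        # Transactional-style access is routed to the row store.
        return self.row.get(key)

    def total(self, column: str):
        # Analytical-style access is routed to the column store.
        return self.col.column_sum(column)


if __name__ == "__main__":
    store = HybridStore()
    store.insert(1, {"customer": "a", "amount": 10})
    store.insert(2, {"customer": "b", "amount": 32})
    print(store.lookup(2))
    print(store.total("amount"))
```

Duplicating every write keeps the sketch short, but it also illustrates the data-volume and reliability costs named in the abstract: the data is held twice, and a failure between the two inserts would leave the stores inconsistent.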