On utilizing an enhanced object partitioning scheme to optimize self-organizing lists-on-lists

O Ekaba Bisong,B John Oommen

doi:10.1007/s12530-020-09327-4

Abstract

With the advent of “Big Data” as a field, in and of itself, there are at least three fundamentally new questions that have emerged, namely the Artificially Intelligence (AI)-based algorithms required, the hardware to process the data, and the methods to store and access the data efficiently. This paper (The work of the second author was partially supported by NSERC, the Natural Sciences and Engineering Council of Canada. We are very grateful for the feedback from the anonymous Referees of the original submission. Their input significantly improved the quality of this final version.) presents some novel schemes for the last of the three areas. There have been thousands of papers written regarding the algorithms themselves, and the hardware vendors are scrambling for the market share. However, the question of how to store, manage and access data, which has been central to the field of Computer Science, is even more pertinent in these days when megabytes of data are being generated every second. This paper considers the problem of minimizing the cost of retrieval using the most fundamental data structure, i.e., a Singly-Linked List (SLL). We consider a SLL in which the elements are accessed by a Non-stationary Environment (NSE) exhibiting the so-called “Locality of Reference”. We propose a solution to the problem by designing an “Adaptive” Data Structure (ADS), which is created by utilizing a composite of hierarchical data “sub”-structures to constitute the overall data structure (In this paper, the primitive data structure is the SLL. However, these concepts can be extended to more complicated data structures.). In this paper, we design hierarchical Lists-on-Lists (LOLs) by assembling a SLL into a hierarchical scheme that results in SLLs on SLLs (SLLs-on-SLLs) comprising of an outer-list and sublist contexts. The goal is that elements that are more likely to be accessed together are grouped within the same sub-context, while the sublists themselves are moved “en masse” towards the head of the list-context to minimize the overall access cost. This move is carried out by employing the “de-facto” list re-organization schemes, i.e., the Move-To-Front (MTF) and Transposition (TR) rules. To achieve the clustering of elements within the sublists, we invoke the Object Migration Automaton (OMA) family of reinforcement schemes from the theory of Learning Automata (LA). They capture the probabilistic dependence of the elements in the data structure, receiving query accesses from the Environment. We show that SLLs-on-SLLs augmented with the Enhanced OMA (EOMA) minimizes the retrieval cost for elements in NSEs, and are superior to the stand-alone MTF and TR schemes, and also superior to the OMA-augmented SLLs-on-SLLs operating in such Environments.

Full Text