Abstract

Complex datasets now arise across a wide range of research disciplines, creating the need for methods that tackle complexity by finding the patterns inherent in the data. The challenge lies in transforming the extracted patterns into pragmatic knowledge. In this paper, we propose new information entropy measures for characterizing the multidimensional structure extracted from complex datasets, complementing the conventionally applied methods of algebraic topology. Derived from the topological relationships embedded in datasets, multilevel entropy measures are used to track transitions in the build-up of the high-dimensional structure of datasets, as captured by the stratified partition of a simplicial complex. The proposed entropies are found suitable for defining and operationalizing intuitive notions of structural relationships in the cumulative experience of a taxi driver’s cognitive map formed by origins and destinations. Comparing the multilevel integration entropies calculated after each new ride is added to the data structure indicates that the pace of change in the origin-destination structure slows over time. The repetitiveness of the taxi driver’s rides and the stability of the origin-destination structure exhibit the relative invariance of rides in space and time. These results shed light on the taxi driver’s ride habits, as well as on the commuting of the people he or she drove.

Highlights

  • Complexity is an omnipresent phenomenon that permeates contemporary research in the physical, social, biological, and informational sciences, as well as in industry, and it is accompanied by an explosion of data about complex systems

  • The structure of a dataset is mathematically represented as a simplicial complex, which lets us apply the rich apparatus of algebraic topology [1]

  • The elements of a dataset collectively build a structure that captures the patterns embedded within the dataset. This enables us to construct the multidimensional structure of a simplicial complex and analyze it using an appropriate apparatus grounded in algebraic topology


Summary

Introduction

Entropy 2017, 19, 172

Complexity is an omnipresent phenomenon that permeates contemporary research in the physical, social, biological, and informational sciences, as well as in industry, and it is accompanied by an explosion of data about complex systems. In order to characterize the (in)distinguishability of substructures embedded in the dataset, we introduce information entropy measures that quantify the information emerging from the built-in similarity relationships of dataset elements, represented by the connectivity embedded at different levels of the hierarchical data structure. The structure of a dataset is mathematically represented as a simplicial complex, which lets us apply the rich apparatus of algebraic topology [1]. The introduced vector-like entropies capture the (in)distinguishability of different layers of the rigorously partitioned structure of the dataset and, further, indicate how changes in the data affect its internal structural relationships. Our objective is to relate the structure of a simplicial complex, via entropy measures, to the pattern formation of dependencies between aggregations of complex datasets.
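To make the idea of a dataset represented as a simplicial complex more concrete, the following is a minimal sketch in Python. It is not the entropy measure proposed in the paper, whose definitions are not given in this excerpt. It only builds a small complex from hypothetical groups of related elements and computes an ordinary Shannon entropy over the distribution of simplices across dimensions. The toy data, the function names, and the choice of the dimension distribution are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def faces(simplex):
    """Return all non-empty faces of a simplex given as a set of vertices."""
    verts = sorted(simplex)
    return {
        frozenset(combo)
        for k in range(1, len(verts) + 1)
        for combo in combinations(verts, k)
    }


def build_complex(maximal_simplices):
    """Close a list of maximal simplices under taking faces."""
    complex_ = set()
    for simplex in maximal_simplices:
        complex_ |= faces(simplex)
    return complex_


def dimension_counts(complex_):
    """Count simplices per dimension (dimension = number of vertices - 1)."""
    return Counter(len(s) - 1 for s in complex_)


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the distribution given by the counts."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


# Hypothetical toy data: each set groups elements that are related to one another.
groups = [{"A", "B", "C"}, {"C", "D"}]

K = build_complex(groups)
counts = dimension_counts(K)
H = shannon_entropy(counts)

print(dict(counts))
print(round(H, 3))
```

Recomputing such a quantity after each new group is added gives a rough sense of how one might track changes in structure as data accumulate, although the measures in the paper are defined on a stratified partition of the complex rather than on raw dimension counts.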

