Abstract

Databases and data warehouse systems have been evolving from handling normalized spreadsheets stored in relational databases to managing and analyzing diverse application-oriented data with complex interconnecting structures. Responding to this emerging trend, graphs have been growing rapidly and showing their critical importance in many applications, such as the analysis of XML, social networks, the Web, biological data, multimedia data and spatiotemporal data. Can we extend useful functions of databases and data warehouse systems to handle graph-structured data? In particular, OLAP (On-Line Analytical Processing) has been a popular tool for fast and user-friendly multi-dimensional analysis of data warehouses. Can we OLAP graphs? Unfortunately, to the best of our knowledge, there are no OLAP tools available that can interactively view and analyze graph data from different perspectives and with multiple granularities. In this paper, we argue that it is critically important to OLAP graph-structured data and propose a novel Graph OLAP framework. According to this framework, given a graph dataset whose nodes and edges are associated with respective attributes, a multi-dimensional model can be built to enable efficient on-line analytical processing so that any portion of the graphs can be generalized/specialized dynamically, offering multiple, versatile views of the data. The contributions of this work are three-fold. First, starting from basic definitions, i.e., what dimensions and measures are in the Graph OLAP scenario, we develop a conceptual framework for data cubes on graphs. We also look into different semantics of OLAP operations, and classify the framework into two major subcases: informational OLAP and topological OLAP. Second, we show how a graph cube can be materialized by calculating a special kind of measure called the aggregated graph, and how to implement it efficiently. This includes both full materialization and partial materialization, where constraints are enforced to obtain an iceberg cube. Due to the increased structural complexity of the data, aggregated graphs that depend on the underlying "network" properties of the graph dataset are much harder to compute than their traditional OLAP counterparts. Third, to provide more flexible, interesting and informative OLAP of graphs, we further propose a discovery-driven multi-dimensional analysis model to ensure that OLAP is performed in an intelligent manner, guided by expert rules and knowledge discovery processes. We outline such a framework and discuss some challenging research issues for discovery-driven Graph OLAP.
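To make the notion of an aggregated graph concrete, the following is a minimal sketch rather than the paper's algorithm. The abstract does not define the two subcases, so the sketch assumes that an informational roll-up overlays several graph snapshots over the same nodes, and that a topological roll-up merges nodes into coarser groups. The function names, the edge-weight dictionary representation, and the choice of summed edge weight as the measure are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def informational_rollup(snapshots):
    """Overlay snapshots that share a node set, summing edge weights.

    Assumption: each snapshot is a dict {(u, v): weight} of undirected edges.
    """
    total = Counter()
    for edges in snapshots:
        for (u, v), w in edges.items():
            # Normalize the endpoint order so (a, b) and (b, a) are one edge.
            total[tuple(sorted((u, v)))] += w
    return dict(total)


def topological_rollup(edges, group_of):
    """Merge nodes into groups, summing the weights of the merged edges.

    Assumption: group_of maps every node to its coarser-level group.
    Edges inside one group become a self-loop on that group.
    """
    total = Counter()
    for (u, v), w in edges.items():
        total[tuple(sorted((group_of[u], group_of[v])))] += w
    return dict(total)


# Two hypothetical snapshots over the nodes a, b, c, d.
snapshots = [
    {("a", "b"): 1, ("b", "c"): 2},
    {("b", "a"): 3, ("c", "d"): 1},
]
overlay = informational_rollup(snapshots)

# Hypothetical node hierarchy: a and b roll up to X, c and d to Y.
group_of = {"a": "X", "b": "X", "c": "Y", "d": "Y"}
coarse = topological_rollup(overlay, group_of)
```

A summed edge weight is a distributive measure, so a coarser view can be computed from a finer one without revisiting the raw snapshots. Measures that depend on network properties of the whole graph would not decompose this simply, which is consistent with the abstract's remark that such aggregated graphs are harder to compute.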
