Abstract

Using graph analysis models and algorithms to exploit complex interactions over a network of entities is emerging as an attractive network analytics technology. In this paper, we show that traditional column-based or row-based trace analysis may fail to uncover deep insights hidden in storage traces collected from complex storage applications, such as complex spatial and temporal patterns, hotspots, and hotspot movement patterns. We propose a novel graph analytics framework, GraphLens, for mining and analyzing real-world storage traces, with three unique features. First, we model storage traces as heterogeneous trace graphs in order to capture multiple complex and heterogeneous factors, such as diverse spatial and temporal access information and their relationships, in a unified analytic framework. Second, we develop an innovative graph clustering method that applies two levels of clustering abstraction to storage trace analysis. We discover interesting spatial access patterns and identify important temporal correlations among them. This enables us to better characterize important hotspots and understand hotspot movement patterns. Third, at each level of abstraction, we design a unified weighted similarity measure through an iterative dynamic weight learning algorithm. With an optimal weight assignment scheme, we can efficiently combine the correlation information for each type of storage access pattern, such as random versus sequential or read versus write, to identify interesting spatial and temporal correlations hidden in the traces. We also propose optimization techniques for matrix computation to further improve the efficiency of our clustering algorithm on large trace datasets. Extensive evaluation on real storage traces shows that GraphLens can provide broad and deep trace analysis for better storage strategy planning and efficient data placement guidance.
GraphLens can be applied both to a single PC with multiple disks and to a distributed network spanning a cluster of compute nodes, offering opportunities to optimize storage performance.
