Memsy: A Personal Resource Management Infrastructure

Matthias Geel

doi:10.3929/ethz-a-010603237

Abstract

Undeniably, the information age and its main driver, the Internet, have enabled great innovations in how we access and share information. We have more computing power than ever, more storage space and more ways to transmit and access information. However, the ability to produce and share information on a large scale has also created unique challenges for end-users. Not only do we face an immense growth of personal information (e.g. images, music, documents, e-mails), we also actively amplify the problem of information fragmentation by using an abundance of different devices and web applications to organise it. Our data is spread among services like Dropbox, Facebook or Flickr, stored on hard disks or flash drives and managed by desktops, notebooks, tablets and mobile devices. As a result, keeping track of personal resources across devices and services has become increasingly difficult. We argue that today's consumer file systems and desktop-centric PIM solutions are not adequate to effectively organise personal resources that reside on multiple devices and/or online services. In this thesis, we explore the implications of a version-aware environment with the goal of providing alternative access paths to personal files based on provenance information. Furthermore, we experiment with different organisational schemes that can be employed orthogonally to folder structures in order to manage those resources. To that end, we propose a solution called Memsy, a new personal resource management environment that comprises three subsystems: a version-aware infrastructure, a personal resource management layer and a personal resource graph. While we focus mainly on personal resources represented by files, we later expand the notion of resources to be independent of the nature of their representations.
At its core, Memsy is a file provenance system which maintains a unified view of a user's personal information space across devices and services. It helps users keep track of the whereabouts of their files and enables them to navigate between versions, variants and related resources of those files more effectively. To achieve this, we propose the concept of a file history graph, a lightweight, implicit versioning mechanism for files that retains a history of the cryptographic hashes of all encountered file versions and remembers the last known storage location(s) for each of them. By observing the local file systems and cloud storage services in the background, our system detects common file operations and consolidates that information with the central file history graph to help users locate the latest versions of their personal files from within their familiar desktop environment. However, in a distributed and highly fragmented personal information space, it is almost unavoidable that files get modified outside of the observable environment, resulting in missing links in their provenance chains. As a possible remedy, we propose the use of similarity metrics to infer those missing relationships a posteriori. One example
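
To make the idea of a file history graph more concrete, the following is a minimal sketch in Python and not the implementation described in the thesis. It assumes SHA-256 as the cryptographic hash and models storage locations as plain strings; the names `FileHistoryGraph`, `VersionNode`, `observe_create`, `observe_copy`, `observe_modify` and `latest_versions` are hypothetical and chosen only for illustration.

```python
import hashlib
from dataclasses import dataclass, field


def content_hash(data: bytes) -> str:
    """Hash file content; SHA-256 is an assumption, not the thesis's stated choice."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class VersionNode:
    """One encountered file version, identified only by its content hash."""
    digest: str
    parents: set = field(default_factory=set)    # hashes of predecessor versions
    locations: set = field(default_factory=set)  # last known storage locations


class FileHistoryGraph:
    """Hypothetical sketch: versions are nodes, observed modifications are edges."""

    def __init__(self):
        self.nodes = {}

    def _node(self, digest):
        return self.nodes.setdefault(digest, VersionNode(digest))

    def observe_create(self, location, data):
        """A file with this content was seen at `location`."""
        digest = content_hash(data)
        self._node(digest).locations.add(location)
        return digest

    def observe_copy(self, digest, new_location):
        """The same content now also exists at `new_location`."""
        self._node(digest).locations.add(new_location)

    def observe_modify(self, location, old_digest, new_data):
        """The file at `location` changed from `old_digest` to new content."""
        new_digest = content_hash(new_data)
        if new_digest == old_digest:
            return old_digest  # content unchanged, nothing to record
        self._node(old_digest).locations.discard(location)
        node = self._node(new_digest)
        node.parents.add(old_digest)
        node.locations.add(location)
        return new_digest

    def latest_versions(self, digest):
        """Follow edges forward and return the versions that have no successor."""
        children = {}
        for node in self.nodes.values():
            for parent in node.parents:
                children.setdefault(parent, set()).add(node.digest)
        heads, stack, seen = set(), [digest], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            successors = children.get(current, set())
            if successors:
                stack.extend(successors)
            else:
                heads.add(current)
        return heads
```

In this sketch, a copy adds a location to an existing node, while a modification creates a new node linked to its predecessor and moves the location to it. Looking up the latest versions of a file then amounts to walking forward from any known hash and reading the locations stored on the resulting nodes.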
