Due to the proliferation of GPS-enabled devices carried in vehicles or by people, large amounts of position data are recorded every day, and the management of such mobility data, also called trajectories, is a very active research field. A lot of effort has gone into discovering “semantics” from the raw geometric trajectories by relating them to the spatial environment or by finding patterns, for example, with data mining techniques. An open question is how the resulting “meaningful” trajectories can be represented and further queried. In this article, we propose a systematic study of annotated trajectory databases. We define a very simple generic model called symbolic trajectory to capture a wide range of meanings derived from a geometric trajectory. Essentially, a symbolic trajectory is just a time-dependent label; variants have sets of labels, places, or sets of places. They are modeled as abstract data types and integrated into a well-established framework of data types and operations for moving objects. Symbolic trajectories can represent, for example, the names of roads traversed (obtained by map matching), transportation modes, speed profiles, cells of a cellular network, behaviors of animals, cinemas within 2 km distance, and so forth. Symbolic trajectories can be combined with geometric trajectories to obtain annotated trajectories. Besides the model, the main technical contribution of the article is a language for pattern matching and rewriting of symbolic trajectories. A symbolic trajectory can be represented as a sequence of pairs (called units), each consisting of a time interval and a label. A pattern consists of unit patterns (specifications for time interval and/or label) and wildcards, which match single units and sequences of units, respectively, together with regular expressions over such elements. It may further contain variables that can be used in conditions and in rewriting. Conditions and expressions in rewriting may use arbitrary operations available for querying in the host DBMS environment, which makes the language extensible and quite powerful. We formally define the data model and the syntax and semantics of the pattern language. Query operations are offered to integrate pattern matching, rewriting, and classification of symbolic trajectories into a DBMS querying environment. The implementation of the model using finite state machines is described in detail. An experimental evaluation demonstrates the efficiency of the implementation. In particular, it shows dramatic improvements in storage space and response time in a comparison of symbolic and geometric trajectories for some simple queries that can be executed on both symbolic and raw trajectories.
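To make the notions of units and patterns concrete, the following is a minimal Python sketch and not the article's implementation. The names `Unit`, `Wildcard`, and `matches` are hypothetical. The sketch covers only label unit patterns and two wildcards (one unit, any sequence of units); it omits time-interval specifications, variables, conditions, regular expressions, and rewriting, and it uses plain recursive backtracking instead of the finite state machines described in the article.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Sequence, Union


@dataclass(frozen=True)
class Unit:
    """One unit of a symbolic trajectory: a time interval plus a label."""
    start: datetime
    end: datetime
    label: str


class Wildcard(Enum):
    """Hypothetical wildcard markers for this sketch."""
    UNIT = 1      # matches exactly one unit, whatever its label
    SEQUENCE = 2  # matches any (possibly empty) sequence of units


PatternElement = Union[str, Wildcard]


def matches(trajectory: Sequence[Unit], pattern: Sequence[PatternElement]) -> bool:
    """Return True if the whole trajectory matches the whole pattern."""

    def go(i: int, j: int) -> bool:
        # i indexes the next unit, j the next pattern element.
        if j == len(pattern):
            return i == len(trajectory)
        element = pattern[j]
        if element is Wildcard.SEQUENCE:
            # Try every possible length for the sequence wildcard.
            return any(go(k, j + 1) for k in range(i, len(trajectory) + 1))
        if i == len(trajectory):
            return False
        if element is Wildcard.UNIT or element == trajectory[i].label:
            return go(i + 1, j + 1)
        return False

    return go(0, 0)


# Illustrative data only: a trip described by transportation modes.
trip = [
    Unit(datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 8, 10), "walk"),
    Unit(datetime(2024, 1, 1, 8, 10), datetime(2024, 1, 1, 8, 40), "bus"),
    Unit(datetime(2024, 1, 1, 8, 40), datetime(2024, 1, 1, 8, 45), "walk"),
]

# "Some part of the trip was by bus."
used_bus = matches(trip, [Wildcard.SEQUENCE, "bus", Wildcard.SEQUENCE])
```

Here the pattern is anchored at both ends of the trajectory, so the leading and trailing sequence wildcards are what allow the bus unit to occur anywhere. Whether the actual pattern language anchors matches this way is an assumption of the sketch.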