Ubiquitous Knowledge Discovery

João Gama, Michael May

doi:10.3233/ida-2010-0452

Abstract

In recent years, ubiquitous computing has begun to create a new world of small, heterogeneous, distributed devices that can sense, communicate, and interact in ad hoc or sensor networks and peer-to-peer systems. In many cases these large-scale distributed systems must interact with their users in real time. Knowledge Discovery in ubiquitous environments (KDubiq) is an emerging area of research at the intersection of two major challenges: highly distributed and mobile systems, and advanced knowledge discovery systems. It aims to provide a unifying framework for systematically investigating the mutual dependencies of otherwise quite unrelated technologies employed in building next-generation intelligent systems: machine learning, data mining, sensor networks, grids, P2P, data stream mining, activity recognition, Web 2.0, privacy, user modeling, and others. In a fully ubiquitous setting, learning typically takes place in situ, inside the small devices, and its characteristics differ considerably from those of current mainstream data mining and machine learning. Instead of offline learning in a batch setting, what is needed is sequential, anytime, real-time, or online learning from ubiquitous and distributed data under real-time constraints. Instead of stationary distributions, concept drift is the rule rather than the exception. Instead of large stand-alone workstations, learning takes place in unreliable environments that are highly constrained in battery power and bandwidth. The goal of this special issue is to promote an interdisciplinary forum for researchers who deal with sequential learning, anytime learning, real-time learning, online learning, etc. from ubiquitous and distributed data. Distributed learning from data streams is a recent and growing research area with challenging applications and contributions from fields such as databases, data mining, machine learning, and statistics.
The selected papers cover a broad spectrum of research in Ubiquitous Knowledge Discovery, ranging from frequent pattern mining algorithms, distributed clustering, and outlier detection to multi-relational learning. The common concept in all the papers is that learning occurs while data flows continuously, possibly produced by distributed sources. The first paper, by Gama and Pereira, presents a new distributed clustering algorithm that reduces both the dimensionality and the communication burden between sensors. Bifet and Gavalda propose new algorithms for adaptively mining closed rooted trees, both labeled and unlabeled, from data streams that change over time. Closed patterns are powerful representatives of frequent patterns, since they eliminate
