Abstract

Behavioral analysis (BA) on ubiquitous sensor data is the task of finding the latent distribution of features in order to model user-specific characteristics. These characteristics, in turn, can be used for a number of tasks, including resource management, power efficiency, and smart home applications. In recent years, topic models applied to BA have been found to successfully extract the dynamics of the sensed data. Topic modeling is most commonly performed on text data, where it mines latent topics in an unsupervised manner. In this work we propose a novel clustering technique for BA that finds hidden routines in ubiquitous data and also captures the patterns within those routines. Our approach works efficiently on high-dimensional data without performing any computationally expensive dimensionality-reduction operations. For a comparative study, we evaluate three techniques: Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), and Probabilistic Latent Semantic Analysis (PLSA). We analyze the methods using performance indices such as perplexity and silhouette on three real-world ubiquitous sensor datasets: Intel Lab, Kyoto, and MERL. In our clustering experiments, we achieve silhouette scores of 0.7049 on the Intel Lab dataset, 0.6547 on the Kyoto dataset, and 0.8312 on the MERL dataset. In these cases, however, it is difficult to validate the results obtained, because the datasets do not contain any ground truth information. To address this, we investigate a self-supervised method capable of capturing the ground truth inherent in the data. We design a self-supervised technique and apply it to datasets both with and without ground truth. The performance on data without ground truth differs from that on data with ground truth by approximately 8% (F-score), which indicates the efficacy of self-supervised techniques in capturing ground truth information.
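For readers unfamiliar with the silhouette index reported above, the following is a minimal sketch of how such a score can be computed for a set of cluster assignments. It is not the authors' implementation: the Euclidean distance metric, the function name `silhouette_score`, and the toy data are all assumptions made for illustration, since the abstract does not state how the score was computed.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points.

    Assumption: Euclidean distance (the abstract does not name a metric).
    For each point, a = mean distance to the other members of its cluster,
    b = smallest mean distance to the members of any other cluster,
    and the coefficient is (b - a) / max(a, b).
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # singleton cluster contributes 0 by convention
        a = mean_dist(i, own)
        b = min(
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: two well-separated groups of 2-D feature vectors.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
score = silhouette_score(toy_points, toy_labels)
print(f"silhouette = {score:.4f}")
```

Scores lie in the range -1 to 1. Values near 1 indicate compact, well-separated clusters, which is the sense in which the scores reported for the three datasets should be read.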
