Abstract

Automatic data annotation eliminates most of the challenges posed by manual methods of annotating sensor data. It significantly improves users’ experience during sensing activities because their active involvement in the labeling process is reduced. An unsupervised learning technique such as clustering can be used to annotate sensor data automatically. However, a lingering issue with clustering is the validation of the generated clusters. In this paper, we adopted the k-means clustering algorithm to annotate unlabeled sensor data in order to detect sensitive location information of mobile crowd sensing users. Furthermore, we proposed a cluster validation index for the k-means algorithm, based on Multiple Pair-Frequency. Thereafter, we trained three classifiers (Support Vector Machine, K-Nearest Neighbor, and Naïve Bayes) using the cluster labels generated by the k-means algorithm. The accuracy, precision, and recall of these classifiers were evaluated on the classification of “non-sensitive” and “sensitive” data from motion and location sensors. The Support Vector Machine and K-Nearest Neighbor classifiers achieved very high accuracy scores, while the Naïve Bayes classifier achieved a fairly high accuracy score. With the hybrid (unsupervised and supervised) machine learning technique presented in this paper, unlabeled sensor data were automatically annotated and then classified.
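The two-stage idea described in the abstract (cluster first, then train a classifier on the cluster labels) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-dimensional synthetic features, the function names `kmeans` and `knn_predict`, and the parameter values are hypothetical and are not taken from the paper. Only a K-Nearest Neighbor classifier is shown; the Support Vector Machine, Naïve Bayes, and the proposed validation index are omitted.

```python
import math
import random
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means with random centroid initialization.

    Returns (labels, centroids). Illustrative only, not the paper's code.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: euclidean(p, centroids[j]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

def knn_predict(train_x, train_y, query, n_neighbors=3):
    """Majority vote among the n_neighbors training points closest to query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda xy: euclidean(xy[0], query))[:n_neighbors]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical, synthetic 2-D feature vectors standing in for sensor features.
data_rng = random.Random(42)
blob_a = [(data_rng.gauss(0.0, 0.3), data_rng.gauss(0.0, 0.3)) for _ in range(30)]
blob_b = [(data_rng.gauss(5.0, 0.3), data_rng.gauss(5.0, 0.3)) for _ in range(30)]
features = blob_a + blob_b

# Stage 1: unsupervised annotation (cluster indices act as labels).
cluster_labels, centroids = kmeans(features, k=2)

# Stage 2: supervised classification trained on the cluster labels.
prediction_a = knn_predict(features, cluster_labels, (0.1, -0.2))
prediction_b = knn_predict(features, cluster_labels, (4.8, 5.1))
print(prediction_a, prediction_b)
```

The cluster indices carry no meaning by themselves; deciding which cluster corresponds to “sensitive” and which to “non-sensitive” is a separate step that this sketch does not attempt.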

Highlights

  • Over the years, mobile crowd sensing (MCS) has evolved into an attractive way of gathering data [1].

  • The figure shows the k-means centroid initialization using the random method. This result was obtained by calculating the distance of each feature set to the centroids using the Euclidean distance, with the centroids generated from the mean of all feature sets in each cluster.
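For reference, the two steps this last highlight alludes to correspond to the standard k-means assignment and update rules. The notation is an assumption introduced here, not the paper's: $x_i$ is a feature set, $\mu_j$ is the centroid of cluster $C_j$, and $c_i$ is the cluster index assigned to $x_i$.

```latex
c_i = \arg\min_{j \in \{1,\dots,k\}} \lVert x_i - \mu_j \rVert_2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```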

Summary

Introduction

Mobile crowd sensing (MCS) has evolved into an attractive way of gathering data [1]. Smartphones, which are examples of mobile sensing devices, integrate several embedded sensors such as an accelerometer, a gyroscope, a magnetometer, a GPS receiver, light sensors, and proximity sensors. These sensors gather data that are useful in different domains [2]. In the activity recognition domain, most data collection approaches require users to be actively involved in the labeling process [10]. This manual annotation approach requires users to remain attentive while performing sensing activities, since they must provide a label for each activity (e.g., walking, running, sitting) [11].

