Abstract
In this letter, we propose a multivariate time-series classification system that fuses multirate sensor measurements within the latent space of a deep neural network. The system identifies the surface category from the audio and inertial measurements generated by a surface impact, each of which inherently has a different sampling rate and resolution. We investigate the feasibility of categorizing ten everyday surfaces using the proposed convolutional neural network, which is trained in an end-to-end manner. To validate our approach, we developed an embedded system and collected 60 000 data samples under a variety of conditions. The experiments achieve a test accuracy of 93% on a blind test dataset, with end-to-end classification taking less than 300 ms on an embedded machine. We conclude this letter with a discussion of the results and directions for future research.
Highlights
An intelligent system that incorporates multiple sensors for time-series classification often requires a sophisticated multisensor fusion method, because measurements from each sensor are generally not sampled at the same rate.
We employed a random forest (RF) classifier owing to its robustness against overfitting [9].
We proposed a multivariate time-series classification system that fuses heterogeneous sensor measurements using a late-fusion convolutional neural network (CNN).
Summary
An intelligent system that incorporates multiple sensors for time-series classification often requires a sophisticated multisensor fusion method, because measurements from each sensor are generally not sampled at the same rate. Although multirate sensor measurements can be fused using a conventional approach (e.g., direct weighted fusion), such methods often have limited applicability owing to their simplicity [1]. Taking advantage of recent deep learning capabilities, a recent study proposed a temporal binding approach that classifies audio-visual information through efficient multimodal fusion [3]. A set of temporal information, including the RGB flow and audio, is fused in the latent space of a convolutional neural network (CNN) such that all modalities are trained simultaneously. To the best of our knowledge, few studies have addressed the time-series classification of multirate multivariate sensor measurements comprising heterogeneous time series, such as accelerations and audio recordings.
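The late-fusion idea above can be sketched in a few lines: each modality is encoded independently into a fixed-length latent vector (so the differing sampling rates never have to be aligned sample-by-sample), and the latent vectors are concatenated before classification. This is a minimal NumPy illustration, not the authors' network; the encoder (simple temporal averaging) and the untrained random linear classifier are stand-in assumptions.

```python
import numpy as np

def branch_features(signal, latent_dim=8):
    """Toy per-modality encoder: split the signal into latent_dim
    frames and average each, yielding a fixed-length latent vector
    regardless of the input's sampling rate or length."""
    frames = np.array_split(np.asarray(signal, dtype=float), latent_dim)
    return np.array([f.mean() for f in frames])  # shape (latent_dim,)

def late_fusion_classify(audio, accel, n_classes=10, seed=0):
    """Encode each modality separately, concatenate the latent
    vectors (late fusion), and apply a random, untrained linear
    classifier as a placeholder for the trained head."""
    z = np.concatenate([branch_features(audio), branch_features(accel)])
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_classes, z.size))
    return int(np.argmax(W @ z))

# Audio at 44.1 kHz and accelerations at 1 kHz over the same 0.1 s window:
audio = np.sin(np.linspace(0.0, 100.0, 4410))
accel = np.cos(np.linspace(0.0, 10.0, 100))
pred = late_fusion_classify(audio, accel)  # class index in [0, 10)
```

Because each branch pools its own stream to a common latent dimensionality, the two inputs can have any lengths, which is the property that makes latent-space fusion attractive for multirate sensors.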