Abstract
This paper presents the methodology for an object classification system based on audio features, intended for integration into a real-time visual object tracking system so that objects of interest can be tracked and described more accurately. Four objects are classified by the sounds they produce using Mel-Frequency Cepstral Coefficients (MFCC). These features are classified with a Dynamic Time Warping (DTW) approach combined with a k-Nearest Neighbor (kNN) classifier. In particular, this paper improves upon the best-performing method in a survey [2], which uses MFCC and DTW alone. We propose that, once the DTW alignment costs over the MFCC sequences are computed, they be used as feature vectors for a second classification stage. The results show a 24% improvement over using only MFCC with DTW, demonstrating the usefulness of the joint classification system, which can be integrated into a multiple-robot system.
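The pipeline described above can be summarized as: extract an MFCC sequence per audio clip, compute DTW alignment costs between a query clip and a set of reference clips, and feed the resulting cost vector to a kNN classifier. The following is a minimal illustrative sketch of that idea, not the authors' implementation; it assumes librosa and scikit-learn are available, uses a hand-rolled DTW, and the file names and class labels are hypothetical placeholders rather than the paper's dataset.

```python
# Sketch: DTW costs over MFCC sequences used as feature vectors for kNN.
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier


def mfcc_sequence(path, n_mfcc=13):
    """Load an audio file and return its MFCC frames as a (frames, n_mfcc) array."""
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T


def dtw_cost(a, b):
    """Classic dynamic-programming DTW cost between two MFCC sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # frame-to-frame distance
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]


# Hypothetical reference clips, one per object class.
references = [("car.wav", "car"), ("door.wav", "door"),
              ("phone.wav", "phone"), ("keys.wav", "keys")]
ref_seqs = [mfcc_sequence(p) for p, _ in references]


def cost_vector(path):
    """DTW costs against every reference clip form the feature vector."""
    q = mfcc_sequence(path)
    return np.array([dtw_cost(q, r) for r in ref_seqs])


# Hypothetical training clips are mapped to cost vectors and classified with kNN.
train_paths = ["car_01.wav", "door_01.wav", "phone_01.wav", "keys_01.wav"]
train_labels = ["car", "door", "phone", "keys"]
X_train = np.vstack([cost_vector(p) for p in train_paths])

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, train_labels)
print(knn.predict([cost_vector("unknown_clip.wav")]))
```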