Abstract
Classifying normal versus pathological infant cries is a technologically challenging research problem: the high-pitched source harmonics sample the vocal tract spectrum quasi-periodically, which results in extremely poor spectral resolution for commonly used spectral features such as Mel Frequency Cepstral Coefficients (MFCC). To address this, we propose a new feature extraction approach based on the Constant Q Transform (CQT), which is known to offer variable spectro-temporal resolution within the limits set by Heisenberg's uncertainty principle in the signal processing framework. Further, CQT is also known to preserve the form-invariance property better than its Short-Time Fourier Transform (STFT) counterpart, a desirable attribute for feature descriptors, which should be invariant with respect to shape, shift, rotation, and scaling. The CQT-based features are then transformed to the cepstral domain to derive Constant Q Cepstral Coefficients (CQCC), which are fed to a statistical and a discriminative classifier, namely a Gaussian Mixture Model (GMM) and a Support Vector Machine (SVM), respectively. The CQCC-GMM and CQCC-SVM systems gave relatively better results than MFCC features across various experimental evaluation factors for the infant cry classification task on the widely used and statistically meaningful Baby Chilanto Database. The best performance, 99.82% accuracy with a 0.44% equal error rate (EER), is observed for the CQCC-GMM system.
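The CQT-to-cepstrum step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cqt_frame`, `cepstrum_from_cqt`), the Hann window, and every parameter value (8 kHz sampling rate, 250 Hz lowest bin, 12 bins per octave, 36 bins, 13 coefficients) are assumptions chosen for illustration. The commonly cited CQCC formulation also resamples the geometrically spaced log spectrum onto a uniform scale before the DCT; that step is omitted here for brevity, and the classifier stage is not shown.

```python
import cmath
import math


def cqt_frame(x, fs, f_min, bins_per_octave, n_bins):
    """Naive constant-Q transform of one frame (direct evaluation, no FFT)."""
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)  # same Q for every bin
    spectrum = []
    for k in range(n_bins):
        f_k = f_min * 2.0 ** (k / bins_per_octave)  # geometrically spaced centre
        n_k = math.ceil(q * fs / f_k)               # window shrinks as f_k grows
        if n_k > len(x):
            raise ValueError("frame too short for the lowest bin")
        acc = 0j
        for n in range(n_k):
            w = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_k)  # Hann window
            acc += x[n] * w * cmath.exp(-2j * math.pi * f_k * n / fs)
        spectrum.append(acc / n_k)
    return spectrum


def cepstrum_from_cqt(spectrum, n_coeffs, eps=1e-12):
    """Log power spectrum followed by a DCT-II, truncated to n_coeffs."""
    log_power = [math.log(abs(v) ** 2 + eps) for v in spectrum]
    k_total = len(log_power)
    return [
        sum(lp * math.cos(math.pi * m * (k + 0.5) / k_total)
            for k, lp in enumerate(log_power))
        for m in range(n_coeffs)
    ]


# Toy input: a 500 Hz tone sampled at 8 kHz (all values here are illustrative).
fs = 8000
tone = [math.sin(2.0 * math.pi * 500.0 * n / fs) for n in range(1024)]
spectrum = cqt_frame(tone, fs, f_min=250.0, bins_per_octave=12, n_bins=36)
features = cepstrum_from_cqt(spectrum, n_coeffs=13)
peak_bin = max(range(len(spectrum)), key=lambda k: abs(spectrum[k]))
```

With these assumed settings, bin 12 is centred at 250 Hz × 2^(12/12) = 500 Hz, so `peak_bin` should come out as 12 for the toy tone. Because the window length `n_k` shrinks as the centre frequency rises, low bins get finer frequency resolution and high bins get finer time resolution, which is the variable spectro-temporal resolution the abstract refers to.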