Abstract
We propose a robust and efficient lung sound classification system based on a snapshot ensemble of convolutional neural networks (CNNs). A CNN architecture extracts high-level features from log mel spectrograms and is trained with a cyclic cosine learning rate schedule. Saving the best model of each training cycle yields multiple models that have settled in different local optima, at the cost of training a single model. The snapshot ensemble therefore boosts the performance of the proposed system while avoiding most of the training expense that ensembles usually incur. To deal with the class imbalance of the dataset, we use temporal stretching and vocal tract length perturbation (VTLP) for data augmentation, together with the focal loss objective. Empirically, our system outperforms state-of-the-art systems on the four-class prediction task (normal, crackles, wheezes, and both crackles and wheezes) and on the two-class prediction task (normal and abnormal, where abnormal covers crackles, wheezes, and both crackles and wheezes), achieving 78.4% and 83.7% ICBHI-specific micro-averaged accuracy, respectively. The reported accuracy is averaged over ten random splits of the ICBHI 2017 dataset of respiratory cycles into 80% training and 20% testing data.
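The three mechanisms named in the abstract (a cosine learning rate restarted every cycle, a focal loss, and an ensemble of per-cycle snapshots) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the peak learning rate, the focusing parameter `gamma`, and the choice to combine snapshots by averaging class probabilities are all assumptions, since the abstract does not specify them.

```python
import math


def cyclic_cosine_lr(step, total_steps, num_cycles, lr_max):
    """Learning rate at a 0-indexed `step` for a cosine schedule that restarts each cycle.

    Hypothetical helper: lr_max and num_cycles are illustrative parameters.
    """
    steps_per_cycle = math.ceil(total_steps / num_cycles)
    # Position inside the current cycle, in [0, 1).
    pos = (step % steps_per_cycle) / steps_per_cycle
    # Starts at lr_max and decays towards 0 before the next restart.
    return 0.5 * lr_max * (math.cos(math.pi * pos) + 1.0)


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    gamma=2.0 is an assumed default; gamma=0 reduces to cross-entropy.
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def ensemble_predict(snapshot_probs):
    """Average class probabilities over snapshots (an assumed combination rule)."""
    n = len(snapshot_probs)
    num_classes = len(snapshot_probs[0])
    mean = [sum(p[c] for p in snapshot_probs) / n for c in range(num_classes)]
    predicted = max(range(num_classes), key=mean.__getitem__)
    return predicted, mean
```

In a training loop, one snapshot would be saved per cycle, typically near the end of the cycle when the learning rate is small, and `ensemble_predict` would be applied to the per-snapshot outputs at test time. The focal loss down-weights examples that are already classified with high confidence, which is why it is a common choice for imbalanced classes.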
More From: Annual International Conference of the IEEE Engineering in Medicine and Biology Society