Abstract

Brain lesions resulting from traumatic injury or stroke can lead to speech dysfunction in undiagnosed patients. Machine learning (ML) tools that analyze the prosody or articulatory phonetics of human speech could therefore be advantageous for early screening of undetected brain injuries. Additionally, explaining the model's decision-making process can lend support to its predictions and help guide appropriate measures to improve patient voice quality. However, traditional ML methods that rely on low-level descriptors (LLDs) may sacrifice detailed temporal dynamics and other speech characteristics. These descriptors can also be challenging to interpret, since understanding feature relationships and suitable value ranges requires significant effort. To address these limitations, this paper introduces xDMFCCs, a method that identifies interpretable, discriminative acoustic biomarkers from a single speech utterance and provides both local and global interpretations of deep learning models in speech applications. To validate the approach, it was used to interpret a Convolutional Neural Network (CNN) trained on Mel-frequency Cepstral Coefficients (MFCCs) for a binary classification task: distinguishing patient vocalizations from control vocalizations. The CNN achieved promising results, with a 75% F-score (75% recall, 76% precision), comparable to conventional machine learning baselines. What sets xDMFCCs apart is that its explanations are expressed on a 2D time–frequency representation that preserves the complete speech signal. This representation offers a more transparent, and therefore more interpretable, account of what differentiates patients from healthy controls. This advancement enables more detailed and compelling studies of the speech acoustic traits of brain lesions. Furthermore, the findings have significant implications for developing low-cost, rapid diagnostics for unnoticed brain lesions.
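
As a quick consistency check on the reported metrics, the sketch below recomputes the F-score from the stated precision and recall. It assumes the reported F-score is the standard F1 (the harmonic mean of precision and recall); the abstract does not say whether the figures are macro-averaged or per-class, and under averaging the relationship holds only approximately. The function name `f_score` is illustrative and not taken from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta = 1 gives the harmonic mean (F1)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denom = beta ** 2 * precision + recall
    if denom == 0:
        # Convention: the score is 0 when both precision and recall are 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


# Values reported in the abstract: 76% precision, 75% recall.
f1 = f_score(precision=0.76, recall=0.75)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.755"
```

Under this assumption the result rounds to the nearest whole percent as 75%, which agrees with the reported F-score.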
