Awake craniotomy presents a unique opportunity to map and preserve critical brain functions, particularly speech, during tumor resection. The ability to accurately assess linguistic functions in real time not only enhances surgical precision but also contributes significantly to improving postoperative outcomes. However, this evaluation is currently subjective, as it relies solely on the clinician's observations. This paper explores the use of a deep learning-based model for the objective assessment of speech arrest and speech impairments during awake craniotomy. We extracted 1883 3-second audio clips containing the patient's response following direct electrical stimulation from 23 awake craniotomies recorded in two operating rooms of the Tokyo Women's Medical University Hospital (Japan) and two awake craniotomies recorded at the University Hospital of Brest (France). A Wav2Vec2-based model was trained and used to detect speech arrests and speech impairments. Experiments were performed with different dataset settings and preprocessing techniques, and the performance of the model was evaluated using the F1-score. The F1-score was 84.12% when the model was trained and tested on Japanese data only. In a cross-language setting, the F1-score was 74.68% when the model was trained on Japanese data and tested on French data. The results are encouraging, even in a cross-language setting, but further evaluation is required. The integration of preprocessing techniques, in particular noise reduction, improved the results significantly.
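To make the described pipeline concrete, the sketch below shows how a 3-second clip could be denoised and classified with a Wav2Vec2 sequence-classification head. It is a minimal illustration, not the authors' exact implementation: the checkpoint name, the label set, and the use of the noisereduce library for noise reduction are assumptions.

```python
# Minimal sketch (assumed, not the authors' exact pipeline): classify a
# 3-second intraoperative clip with a Wav2Vec2 audio-classification head.
import torch
import librosa
import noisereduce as nr  # assumed noise-reduction step
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

CHECKPOINT = "facebook/wav2vec2-base"  # assumed base model; a fine-tuned model would be loaded here
LABELS = ["normal_speech", "speech_arrest", "speech_impairment"]  # hypothetical label set

extractor = AutoFeatureExtractor.from_pretrained(CHECKPOINT)
model = AutoModelForAudioClassification.from_pretrained(CHECKPOINT, num_labels=len(LABELS))
model.eval()

def classify_clip(path: str) -> str:
    """Load a 3-second clip, apply noise reduction, and predict a label."""
    audio, sr = librosa.load(path, sr=16_000)   # Wav2Vec2 expects 16 kHz mono audio
    audio = nr.reduce_noise(y=audio, sr=sr)     # spectral-gating noise reduction
    inputs = extractor(audio, sampling_rate=sr, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

# Example usage (hypothetical file name):
# print(classify_clip("stimulation_response.wav"))
```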