Abstract

Deep learning and computer vision are attracting great attention for their potential to improve stroke prognoses by enabling earlier risk assessment through the identification of two key signs of stroke: facial droop and slurred speech. Because mobile device cameras and microphones are widely accessible, they could greatly improve stroke recognition outside the hospital and decrease transportation time between the home and a stroke care center, playing an important role in reducing preventable death and disability. In this research, deep learning models were trained by fine-tuning a convolutional neural network (CNN), ResNet, on a preliminary dataset of publicly available portrait photographs and audio recordings. For classifying facial droop, a 50-layer ResNet achieved a cross-validation accuracy of 85%, an F1-score of 0.84, and a receiver operating characteristic area under the curve (ROC-AUC) of 0.97. For slurred speech, a 101-layer ResNet achieved a validation accuracy of 96%, an F1-score of 0.97, and an ROC-AUC of 0.98. These results indicate that a fine-tuned ResNet can effectively distinguish the faces and speech of stroke patients from those of non-stroke individuals.
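The abstract does not include implementation details, but the general fine-tuning approach it describes is standard. The sketch below shows one plausible way to adapt an ImageNet-pretrained ResNet-50 for binary facial-droop classification in PyTorch; the dataset layout, transforms, and hyperparameters are illustrative assumptions, not the authors' actual settings.

```python
# Hypothetical sketch of fine-tuning a pretrained ResNet-50 for binary
# stroke/non-stroke facial classification. Paths and hyperparameters are
# assumptions for illustration, not the authors' configuration.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Standard ImageNet preprocessing so inputs match the pretrained weights.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Assumed directory layout: portraits/train/{droop,normal}/*.jpg
train_set = datasets.ImageFolder("portraits/train", transform=preprocess)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=32,
                                           shuffle=True)

# Load an ImageNet-pretrained ResNet-50 and replace the final fully
# connected layer with a 2-way head (droop vs. no droop).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Fine-tune all layers for a few epochs.
model.train()
for epoch in range(5):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

The slurred-speech classifier described in the abstract would follow the same pattern with a 101-layer ResNet (`models.resnet101`), after converting the audio recordings into image-like inputs such as spectrograms; the abstract does not specify which audio representation was used.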
