Background: The National Institutes of Health Stroke Scale (NIHSS) is widely used to measure stroke deficits and is deemed to be reliable when used by a range of professionals. Aims: This study aimed to establish the inter-rater reliability of the NIHSS when completed via telemedicine. Secondary aims were to explore whether professional group, length of time since training and/or re-certification, frequency of use and reason for using the NIHSS influenced the inter-rater reliability. Methods: A total of 30 video clips, the equivalent of two whole patient assessments for each of the 15 NIHSS items, were analysed by a range of NIHSS-certified clinical participants. Of these, 10 were nurses and five were consultants. Kappa statistics were used to calculate the inter-rater reliability for each item, with additional data on the range of agreement across items. Data across group characteristics were compared to test hypotheses about factors that could affect reliability. Findings: Overall, the inter-rater reliability was found to be lower than anticipated, and there was wide variation in ratings. Consultants tended to show better agreement than nurses, and, counter-intuitively, stroke specialist staff and those who used the NIHSS more frequently tended to have poorer reliability than their counterparts. Total agreement on score was achieved in only five of the 30 video clips (16.6%), with agreement better at either end of the scoring range (i.e. no deficit or worst deficit). These findings indicate that the reliability of the NIHSS may be lower than anticipated. Conclusion: Further research is needed to better understand the poor reliability of the NIHSS, as this has implications for care decisions and patient outcomes.
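The kappa statistic mentioned in the Methods section corrects raw percentage agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa for two raters is given below; the rater labels and scores are hypothetical illustrations, not data from the study:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items both raters scored identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal score distribution.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical item scores from two raters (0 = no deficit, higher = worse):
nurse      = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
consultant = [0, 1, 1, 1, 0, 2, 2, 1, 0, 2]
print(round(cohens_kappa(nurse, consultant), 3))  # → 0.697
```

Here observed agreement is 0.8, but chance agreement is 0.34, so kappa (≈0.70) is noticeably lower than the raw percentage, which is why the study reports kappa rather than simple agreement.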