Despite the availability of commercial EEG software for automated detection of epileptiform activity, validation on real-world EEG datasets is lacking. The performance of two software packages was evaluated on a large EEG dataset of patients with genetic generalized epilepsy. Three epileptologists manually labelled interictal epileptiform discharges (IEDs) in EEGs from three centres. In a second phase, all IED markings predicted by the two commercial software packages (Encevis 1.11 and Persyst 14) were reviewed individually for discharges that had been overlooked during manual annotation, and any such discharges were integrated into the reference standard. Sensitivity, precision, specificity, and F1-score were used to assess the performance of the software packages against the adjusted reference standard. One hundred and twenty-five routine scalp EEG recordings from different subjects were included (total recording time, 310.7 hours). The reference standard contained 5,907 epileptiform discharges (including spikes and fragments). Encevis demonstrated a mean sensitivity for detection of IEDs of 0.46 (SD 0.32), a mean precision of 0.37 (SD 0.31), and a mean F1-score of 0.43 (SD 0.23). Using the default medium setting, Persyst had a mean sensitivity of 0.67 (SD 0.31), a precision of 0.49 (SD 0.33), and an F1-score of 0.51 (SD 0.25). Mean specificity, reflecting the identification and classification of non-IED windows, was 0.973 (SD 0.08) for Encevis and 0.968 (SD 0.07) for Persyst. Automated software shows a high degree of specificity for detection of nonepileptiform background. Sensitivity and precision for IED detection are lower but may be acceptable for initial screening in clinical and research settings. Clinical caution and continuous expert human oversight are recommended for all EEG recordings before a diagnostic interpretation based on the software output is provided.
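To make the four reported metrics concrete, the following is a minimal sketch of how they could be computed per recording and then summarized as mean and SD across recordings. It is not the study's actual pipeline: the `Counts` container, the function names, the example counts, and the convention of returning 0.0 when a denominator is zero are all assumptions made for illustration, and the abstract does not state how detections were matched to reference markings or how windows were defined.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Counts:
    """Hypothetical per-recording confusion counts (not from the study)."""
    tp: int  # software markings that match a reference IED
    fp: int  # software markings with no matching reference IED
    fn: int  # reference IEDs the software missed
    tn: int  # non-IED windows correctly left unmarked


def safe_div(num: float, den: float) -> float:
    # Assumed convention: an undefined ratio is reported as 0.0.
    return num / den if den else 0.0


def metrics(c: Counts) -> dict:
    """Sensitivity, precision, specificity and F1 for one recording."""
    sensitivity = safe_div(c.tp, c.tp + c.fn)
    precision = safe_div(c.tp, c.tp + c.fp)
    specificity = safe_div(c.tn, c.tn + c.fp)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
    }


def summarize(recordings: list) -> dict:
    """Mean and sample SD of each metric across recordings (needs >= 2)."""
    per_rec = [metrics(c) for c in recordings]
    return {
        name: (mean([m[name] for m in per_rec]),
               stdev([m[name] for m in per_rec]))
        for name in per_rec[0]
    }


if __name__ == "__main__":
    # Made-up counts for two recordings, purely for illustration.
    demo = [Counts(tp=8, fp=2, fn=2, tn=88), Counts(tp=3, fp=5, fn=4, tn=90)]
    for name, (m, sd) in summarize(demo).items():
        print(f"{name}: mean={m:.3f} SD={sd:.3f}")
```

When metrics are averaged per recording in this way, the mean F1-score is in general not the harmonic mean of the mean sensitivity and the mean precision.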