Objective
We examined the effect of simple Delphi-style feedback on the visual identification of high frequency oscillations (HFOs) in the ripple (80–250 Hz) band, and assessed the impact of this training intervention on the interrater reliability and generalizability of HFO evaluations.

Methods
We employed a morphology detector to identify candidate HFOs at two thresholds and presented the detected epochs to visual reviewers, who rated the probability that each epoch contained an HFO. We recruited 19 board-certified epileptologists with varying levels of experience to complete a series of HFO evaluations over three sessions. A Delphi-style intervention was used to provide each reviewer with feedback on their performance relative to their peers. A delayed-intervention paradigm was used, in which reviewers received feedback either before or after the second session. ANOVAs were used to assess the effect of the intervention on the reviewers' evaluations, and generalizability theory was used to assess interrater reliability before and after the intervention.

Results
The intervention, regardless of when it occurred, resulted in a significant reduction in the variability between reviewers in both groups (p_GroupDI = 0.037, p_GroupEI = 0.003). Prior to the delayed intervention, the group receiving the early intervention showed a significant reduction in variability (p_GroupEI = 0.041), whereas the delayed-intervention group did not (p_GroupDI = 0.414). Following the intervention, the projected number of reviewers required to achieve strong generalizability decreased from 35 to 16.

Significance
This study shows a robust effect of a Delphi-style intervention on the interrater variability, reliability, and generalizability of HFO evaluations. The observed decreases in HFO marking discrepancies across 14 of the 15 reviewers are encouraging: they are necessarily associated with an increase in interrater reliability, and therefore with a corresponding decrease in the number of reviewers required to achieve strong generalizability. Indeed, the reliability of all reviewers following the intervention was similar to that of experienced reviewers prior to the intervention. A Delphi-style intervention could therefore be implemented either to train any reviewer to a sufficient level of reliability or to further refine the interrater reliability of experienced reviewers. In either case, such an intervention would help standardize HFO evaluations and facilitate their implementation in clinical care.
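
For readers unfamiliar with the decision-study projection mentioned in the Results, the short Python sketch below shows how generalizability theory projects the number of reviewers needed to reach a target reliability in a simple one-facet (epochs x reviewers) design. The variance components, the 0.80 target, and the function names are illustrative assumptions, not values from this study; they are chosen only so that the projection roughly mirrors the reported change from 35 to 16 reviewers.

```python
# Minimal sketch of a generalizability-theory decision (D) study for a
# one-facet crossed design (epochs x reviewers). All names and variance
# components below are illustrative assumptions, not values from the study.

def g_coefficient(var_epoch, var_rel_error, n_reviewers):
    """Relative G coefficient for ratings averaged over n_reviewers."""
    return var_epoch / (var_epoch + var_rel_error / n_reviewers)

def reviewers_needed(var_epoch, var_rel_error, target=0.80):
    """Smallest panel size whose averaged ratings reach the target G coefficient."""
    n = 1
    while g_coefficient(var_epoch, var_rel_error, n) < target:
        n += 1
    return n

# Hypothetical variance components: feedback that shrinks reviewer
# disagreement lowers the relative error variance and hence the projected
# panel size (roughly mirroring the 35 -> 16 change reported above).
print(reviewers_needed(var_epoch=0.10, var_rel_error=0.875))  # before feedback
print(reviewers_needed(var_epoch=0.10, var_rel_error=0.40))   # after feedback
```

In the study itself, the variance components would be estimated from the ANOVA of the reviewers' probability ratings rather than assumed, but the projection step proceeds in the same way.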