Abstract

Introduction
Holter ECG is a commonly used tool for identifying heart rhythm disorders. Because physician-based analysis of Holter ECG recordings is time-consuming, various approaches using computerized interpretation have been developed. However, the accuracy of these tools remains inferior to physician-based analysis. This study therefore assesses the diagnostic performance of an artificial intelligence (AI)-powered Holter ECG service in comparison with annotations from reference databases and with physician-based evaluation.

Methods
A total of 231 ECG recordings, covering 2,261 hours of ECG with annotations for 10,024,315 heartbeats, were taken from the following reference databases: MIT-BIH Arrhythmia (MITDB), American Heart Association ECG (AHADB), MIT-BIH Atrial Fibrillation (AFDB) and Long Term Atrial Fibrillation (LTAFDB). These recordings were originally collected from a mixed population of inpatients and outpatients between 1975 and 2008. The recordings were analyzed for QRS complex detection, QRS classification and detection of atrial fibrillation (AF) using a novel service for AI-based ECG analysis. Results were compared with the reference annotations stated within the databases. QRS complex detection accuracy was evaluated on recordings from MITDB and AHADB by checking whether detected time positions were within 150 ms of the reference annotations. The QRS classification into three classes (normal, supraventricular, ventricular) was then compared for each detected heartbeat. Mismatches were adjudicated manually by three experts. Episodes of AF were detected in recordings from MITDB, AFDB and LTAFDB and compared with the reference episodes from the databases. Accuracy was evaluated based on the overlap with the reference annotations in the time domain.

Results
Sensitivity and precision of heartbeat detection compared with the reference databases exceeded 99%. Figure 1 displays the results of heartbeat classification.
Normal heartbeats were classified with a sensitivity of 99.6%, a precision of 99.9% and a specificity of 99.6%. For ventricular heartbeats, a sensitivity of 99.58%, a precision of 99.88% and a specificity of 99.99% were achieved. Premature supraventricular contractions were recognized with a sensitivity of 99.58%, a precision of 100% and a specificity of 100%. Comparing the 248,187 QRS complex classifications of MITDB and AHADB with the AI-powered analysis revealed 128 discrepancies. Of these, 43 were due to a difference in convention: the beats lay in sections that the databases also marked as AF episodes, and the AI-powered analysis marks all beats in AF episodes as normal. Of the 85 remaining discrepancies, more than 87% were adjudicated in favor of the AI-powered analysis by the experts. Figure 2 compares the AF detection results, with sensitivity and precision above 99%.

Conclusion
Compared with reference ECG strips, AI-powered Holter ECG analysis exceeded 99% sensitivity, specificity and precision.

QRS complex detection; AF/AFib detection
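
The two comparison rules described in the Methods (beat matching within 150 ms, and AF episode overlap in the time domain) can be illustrated with a minimal Python sketch. The function names `match_beats` and `af_overlap_metrics`, the greedy one-to-one pairing, and the duration-based definitions of sensitivity and precision are assumptions made for illustration; they are not taken from the evaluated service, and the sample values are made up.

```python
def match_beats(reference_ms, detected_ms, tolerance_ms=150):
    """Pair detected beats with reference beats (times in milliseconds).

    Assumption: each reference beat may be matched by at most one detection,
    using a greedy two-pointer sweep over the sorted time positions.
    Returns (true_positives, false_negatives, false_positives).
    """
    ref = sorted(reference_ms)
    det = sorted(detected_ms)
    tp = 0
    i = j = 0
    while i < len(ref) and j < len(det):
        diff = det[j] - ref[i]
        if abs(diff) <= tolerance_ms:
            tp += 1          # detection falls inside the tolerance window
            i += 1
            j += 1
        elif diff < 0:
            j += 1           # detection too early for any remaining reference beat
        else:
            i += 1           # reference beat has no detection close enough
    return tp, len(ref) - tp, len(det) - tp


def af_overlap_metrics(reference_episodes, detected_episodes):
    """Duration-based sensitivity and precision for AF episodes.

    Episodes are (start_s, end_s) tuples in seconds; each list is assumed
    to contain non-overlapping episodes.
    """
    overlap = 0.0
    for ref_start, ref_end in reference_episodes:
        for det_start, det_end in detected_episodes:
            # Length of the intersection of the two intervals, if any.
            overlap += max(0.0, min(ref_end, det_end) - max(ref_start, det_start))
    ref_total = sum(end - start for start, end in reference_episodes)
    det_total = sum(end - start for start, end in detected_episodes)
    sensitivity = overlap / ref_total if ref_total else float("nan")
    precision = overlap / det_total if det_total else float("nan")
    return sensitivity, precision


if __name__ == "__main__":
    # Hypothetical beat positions and AF episodes, for demonstration only.
    tp, fn, fp = match_beats([1000, 2000, 3000, 4000],
                             [1010, 2200, 3100, 3500, 4050])
    print("beat sensitivity:", tp / (tp + fn), "precision:", tp / (tp + fp))
    print("AF metrics:", af_overlap_metrics([(0, 100), (200, 300)],
                                            [(10, 110), (250, 300)]))
```

In this sketch, sensitivity is the share of reference beats (or reference AF time) that the analysis recovers, and precision is the share of detected beats (or detected AF time) confirmed by the reference.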