Abstract

Background
Risk stratification tools for ventricular arrhythmias (VA) in individuals without known cardiovascular disease remain insufficient. Previous studies have used convolutional neural networks (CNN) on the electrocardiogram (ECG) to classify VA by the end of an established follow-up period, with moderate performance (area under the curve [AUC] of 0.610). We hypothesised that the appearance of the VA substrate on the ECG signal depends on the proximity of the ECG acquisition to the VA event. Performance could therefore improve if the CNN were trained to predict time to VA event instead, and if censored data were taken into account.

Purpose
We developed and tested a CNN to predict time to VA event in middle-aged individuals without known cardiovascular disease, and compared its performance with that of ECG indices (QRS duration, QT interval and T-wave morphology variations [TMV]) and with that of sex and age.

Methods
The training set comprised 69,283 individuals without known cardiovascular disease from the UK Biobank. We trained a CNN with 10-fold cross-validation; the input was a 15-second resting ECG (lead I), and the output was ‘occurrence of VA’, ‘censored’ or ‘survival’ up to 1, 2, 3, 4, 6, 8, 10 and 12 years after the ECG acquisition. The test set comprised 15-second ECGs (lead I) from an independent cohort of 17,320 individuals without cardiovascular disease, also from the UK Biobank, with a follow-up of 10 years. The ECG indices above were derived from the median heartbeat of each ECG using in-house algorithms. We derived optimal cut-off thresholds from the training set and evaluated the AUC, sensitivity, specificity and Cox regression hazard ratios (HR) in the test set.

Results
In the test set, 9,206 individuals (53%) reached the 10-year follow-up, during which 60 (0.7%) had a VA event. The AUC, sensitivity and specificity of the CNN were 0.715, 0.881 and 0.400, respectively. The corresponding values were 0.592, 0.686 and 0.383 for QRS duration, 0.569, 0.156 and 0.717 for the QT interval, and 0.552, 0.712 and 0.350 for TMV. The combination of sex and age alone yielded 0.760, 0.694 and 0.683, respectively. In survival analyses, individuals with a CNN prediction greater than the threshold derived in the training set had an HR of 3.883 (P = 3.0 × 10⁻⁷) after adjusting for age and sex.

Conclusions
A CNN that is trained to predict time to VA event and accounts for censored data outperforms previously proposed classification models and individual ECG indices of risk. Although the CNN does not predict long-term VA better than sex and age alone, it contributes significantly and independently of these two risk factors. Our findings support the use of the ECG to improve long-term VA risk stratification, which might be particularly useful in age- and sex-stratified analyses.
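
The three output states named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the handling of boundary cases and the masked cross-entropy loss are all assumptions, introduced only to show one plausible way in which censored individuals can be labelled at each horizon and excluded from the training signal where their outcome is unknown.

```python
import math

# Horizons (years after ECG acquisition) listed in the Methods.
HORIZONS_YEARS = [1, 2, 3, 4, 6, 8, 10, 12]

# The three output states named in the Methods.
SURVIVAL, VA_EVENT, CENSORED = "survival", "occurrence of VA", "censored"


def horizon_labels(follow_up_years, had_va):
    """Label one individual at every horizon (hypothetical helper).

    follow_up_years: time from ECG to the VA event or to last observation.
    had_va: True if follow-up ended with a VA event; False is assumed
            here to mean that the individual was censored at that time.
    """
    labels = []
    for h in HORIZONS_YEARS:
        if had_va:
            # The event has occurred by horizon h if it happened at or before h.
            labels.append(VA_EVENT if follow_up_years <= h else SURVIVAL)
        else:
            # Survival is known only for horizons within the observed follow-up.
            labels.append(SURVIVAL if follow_up_years >= h else CENSORED)
    return labels


def masked_bce(pred_event_prob, labels, eps=1e-7):
    """Binary cross-entropy averaged over non-censored horizons only.

    Assumption: this loss is one common way to account for censoring;
    the abstract does not state which loss was actually used.
    """
    terms = []
    for p, label in zip(pred_event_prob, labels):
        if label == CENSORED:
            continue  # outcome unknown at this horizon: no contribution
        p = min(max(p, eps), 1 - eps)
        terms.append(-math.log(p) if label == VA_EVENT else -math.log(1 - p))
    return sum(terms) / len(terms) if terms else 0.0


if __name__ == "__main__":
    print(horizon_labels(5.0, had_va=True))   # VA event 5 years after the ECG
    print(horizon_labels(7.0, had_va=False))  # lost to follow-up after 7 years
```

In this sketch, an individual censored at 7 years still contributes ‘survival’ labels at the 1- to 6-year horizons, which a single end-of-follow-up classification label would discard.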
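
The evaluation step (a cut-off derived in the training set, then applied unchanged to the test set) can be sketched in the same way. The abstract does not say which criterion defined the "optimal" threshold; Youden's J statistic is used below purely as an assumption, and all function names and toy scores are hypothetical.

```python
def sens_spec(scores, events, threshold):
    """Sensitivity and specificity when 'score > threshold' flags high risk."""
    tp = sum(1 for s, e in zip(scores, events) if e and s > threshold)
    fn = sum(1 for s, e in zip(scores, events) if e and s <= threshold)
    tn = sum(1 for s, e in zip(scores, events) if not e and s <= threshold)
    fp = sum(1 for s, e in zip(scores, events) if not e and s > threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(scores, events):
    """Training-set cut-off maximising sensitivity + specificity - 1.

    Assumption: Youden's J stands in for the unspecified optimality criterion.
    Ties are resolved in favour of the lowest candidate threshold.
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        sens, spec = sens_spec(scores, events, t)
        j = sens + spec - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t


def auc(scores, events):
    """AUC as the probability that an event case outscores a non-event case."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy risk scores and event indicators, for illustration only.
    train_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    train_events = [0, 0, 0, 1, 0, 1, 1, 1]
    test_scores = [0.2, 0.35, 0.5, 0.9]
    test_events = [0, 1, 0, 1]

    threshold = youden_threshold(train_scores, train_events)
    print(threshold, sens_spec(test_scores, test_events, threshold))
    print(auc(test_scores, test_events))
```

Fixing the threshold on the training set before looking at the test set keeps the reported sensitivity and specificity free of test-set tuning. The helpers assume that both event and non-event cases are present; otherwise the ratios are undefined.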