Traffic violations have been increasing each year. According to Padang City Police data, 128,913 traffic violation cases were recorded from 2018 to 2023. Given this volume, it is time for the police to use machine learning (ML) to evaluate traffic violation cases, since ML can identify hidden patterns that conventional statistics or traffic officers cannot detect manually. This research aims to classify traffic violations at the Padang City Police using the Naïve Bayes (NB) algorithm, evaluating and comparing models built with different training-to-testing dataset ratios. The best model from the comparison is then analyzed, and the findings are expected to serve as a reference for the relevant authorities. The research is quantitative and uses an experimental method. Data were obtained from traffic ticket documentation at the Padang City Police and from questionnaires distributed to its traffic officers. The results show that the NB algorithm can be used to classify traffic violations at the Padang City Police. In performance testing, the NB algorithm achieved 100% accuracy at every training-to-testing ratio compared. Under cross-validation, however, it achieved the highest accuracy only at the 80%:20% and 90%:10% ratios. This is due to the large dataset used in this research, which contains more than 100,000 entries. The evaluation identifies the NB model built with the 80% training and 20% testing split as the best. Although performance is similarly high at the 90%:10% ratio, the researcher chose the 80%:20% ratio for reasons of training efficiency: a model trained on 80% of the data can predict or classify the remaining 20%, which is more efficient than training on 90% to predict or classify 10%. A further finding is that a large dataset of 100,000 entries or more yields high and stable performance, so this research also suggests that traffic violation classification should use a dataset of more than 100,000 entries to achieve good results.
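The split-ratio comparison described above can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, the feature names (`vehicle`, `time_of_day`), the violation labels, and all function names are hypothetical, and the classifier is a plain categorical Naïve Bayes with Laplace smoothing written against the Python standard library only. It covers the hold-out comparison at the 80%:20% and 90%:10% ratios and leaves cross-validation out.

```python
import math
import random
from collections import Counter, defaultdict


def make_synthetic_tickets(n, seed=0):
    """Generate hypothetical (features, label) rows; NOT the Padang City Police data."""
    rng = random.Random(seed)
    vehicles = ["motorcycle", "car", "truck"]
    times = ["morning", "afternoon", "night"]
    rows = []
    for _ in range(n):
        vehicle = rng.choice(vehicles)
        time_of_day = rng.choice(times)
        # Assumed rule for illustration: motorcycles mostly get helmet tickets,
        # other vehicles mostly get license tickets, with 10% label noise.
        likely = "no_helmet" if vehicle == "motorcycle" else "no_license"
        other = "no_license" if likely == "no_helmet" else "no_helmet"
        label = likely if rng.random() < 0.9 else other
        rows.append(((vehicle, time_of_day), label))
    return rows


def train_nb(rows):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(label for _, label in rows)
    feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for features, label in rows:
        for i, value in enumerate(features):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def predict_nb(model, features):
    """Return the label with the highest log-posterior under Laplace smoothing."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(features):
            numerator = feature_counts[(i, label)][value] + 1
            denominator = count + len(values[i])
            score += math.log(numerator / denominator)  # smoothed log likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def holdout_accuracy(rows, train_fraction, seed=0):
    """Shuffle, split by train_fraction, train on the first part, score the rest."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    train, test = shuffled[:cut], shuffled[cut:]
    model = train_nb(train)
    correct = sum(predict_nb(model, features) == label for features, label in test)
    return correct / len(test)


if __name__ == "__main__":
    data = make_synthetic_tickets(5000)
    for fraction in (0.8, 0.9):  # the two ratios highlighted in the abstract
        accuracy = holdout_accuracy(data, fraction)
        print(f"{fraction:.0%} training: accuracy {accuracy:.3f}")
```

Because the synthetic labels contain deliberate noise, the accuracies printed here will not match the figures reported in the study; the sketch only shows how the same model can be scored under different training-to-testing ratios.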