Urban traffic accidents pose significant challenges to the sustainability of transportation infrastructure, not only in Vietnam but worldwide. To reduce the frequency of accidents, it is crucial to analyze accident data to determine the relationship between accidents and their causes, especially for serious accidents. This study proposes an integrated approach that combines Geographic Information System (GIS) and data mining methods to investigate the features of urban traffic accidents in Hanoi, Vietnam, with the aim of addressing these challenges and enhancing the safety and efficiency of urban transportation. First, the dataset was segmented into homogeneous clusters using the two-step cluster method. Second, the associations between causes and traffic accidents were examined on the overall dataset and on each cluster using the association rule mining (ARM) technique. Finally, the locations of accident groups and high-frequency accident sites (hotspots) were determined using GIS techniques. The result was a five-cluster model corresponding to five common accident groups in Hanoi. The study also identified the accident types, main causes, times, and surrounding areas associated with each group. Cluster 5, the largest group at 29.2% of accidents, comprised accidents caused by motorbikes on streets and on provincial and national roads. Speeding and driving in the wrong lane in the afternoon and at night were the main causes in this cluster (Cf ≥ 0.9 and Lt ≥ 1.22). Cluster 2, the second-largest group at 27.8%, comprised accidents between a truck or car and a motorbike on national and provincial roads. Cluster 1 comprised accidents between a truck or car and a motorbike on local streets, accounting for 22%. Cluster 3 comprised accidents between two motorbikes on country lanes, accounting for 12.3%. Finally, cluster 4 comprised single-vehicle motorbike crashes and had the smallest share, at 8.8%. More importantly, this study recommends using repeatability criteria for the same accident type or cause to determine the locations of hotspots. Suggestions for improving the sustainability of traffic infrastructure are also proposed. To our knowledge, this is the first time these three methods have been applied together to analyze traffic accidents.
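The thresholds reported for the ARM step can be illustrated with a minimal sketch, assuming that Cf and Lt denote the standard rule confidence and lift. The records, item names, and helper functions below are hypothetical toy examples, not the study's data or code; they only show how the two measures are computed for a rule of the form {causes} -> {accident type}.

```python
from typing import FrozenSet, Iterable, List, Tuple

Record = FrozenSet[str]


def support(records: List[Record], items: Iterable[str]) -> float:
    """Fraction of records that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for r in records if wanted <= r) / len(records)


def rule_metrics(
    records: List[Record],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    c = frozenset(consequent)
    s_a = support(records, a)
    s_c = support(records, c)
    if s_a == 0 or s_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    s_ac = support(records, a | c)
    confidence = s_ac / s_a   # estimate of P(consequent | antecedent)
    lift = confidence / s_c   # values above 1 indicate a positive association
    return s_ac, confidence, lift


# Hypothetical accident records (assumption: one set of attributes per accident).
toy_records: List[Record] = [
    frozenset({"speeding", "night", "motorbike"}),
    frozenset({"speeding", "night", "motorbike"}),
    frozenset({"speeding", "afternoon", "motorbike"}),
    frozenset({"wrong_lane", "night", "car"}),
    frozenset({"speeding", "night", "car"}),
]

toy_support, toy_confidence, toy_lift = rule_metrics(
    toy_records, {"speeding", "night"}, {"motorbike"}
)
print(f"support={toy_support:.2f} confidence={toy_confidence:.2f} lift={toy_lift:.2f}")
```

Under this reading, a rule would be retained for a cluster only if its confidence and lift both meet the chosen thresholds, such as the Cf ≥ 0.9 and Lt ≥ 1.22 reported for cluster 5.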