Abstract

Cluster analysis is often the first step in organizing heterogeneous data into homogeneous groups. Choosing an effective clustering approach and an ideal number of clusters for a traffic accident dataset can be challenging. This study evaluates the effectiveness of the k-means and two-step cluster methods, then applies the two-step cluster method together with GIS to traffic accident datasets from 2015 to 2017 in Hanoi, Vietnam. First, the two-step cluster method achieved a Silhouette score of 0.563, compared with 0.341 for the k-means method; a higher Silhouette score indicates better-defined clusters. Second, the study proposes combining the two-step cluster method with GIS for analyzing traffic accident datasets. This analysis identifies five typical types of accidents in Hanoi. The locations of the accident types were also illustrated on a map, enabling traffic officials to recommend precise and urgent countermeasures. Importantly, the clustering results show that the two-step cluster method yields a significantly higher rate of homogeneous data within clusters than the k-means method. The study thus demonstrates that the two-step cluster method is more effective than the k-means method not only in clustering ability but also in data pre-processing. The results give authorities a more detailed understanding of typical traffic accident patterns in Hanoi. In addition, the methods employed could potentially be applied to other regions, providing an additional avenue for analysis.
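
To make the Silhouette comparison concrete, the following is a minimal sketch of how a mean Silhouette score can be computed for a given cluster assignment. It is not the study's implementation: the function name, the toy coordinates, and the two labelings are assumptions chosen for illustration, and neither labeling is the output of the k-means or two-step cluster method. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the point's score is (b - a) / max(a, b).

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    # Group the points by their cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so divide by len(own) - 1).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(p, q) for q in other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Hypothetical toy data: two visually separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
separated = [0, 0, 0, 1, 1, 1]  # labels that follow the two groups
mixed = [0, 0, 1, 1, 1, 0]      # labels that mix the two groups

separated_score = silhouette_score(points, separated)
mixed_score = silhouette_score(points, mixed)
print(f"separated: {separated_score:.3f}, mixed: {mixed_score:.3f}")
```

The score lies between -1 and 1, and the labeling that follows the two groups scores higher than the mixed one, which is the sense in which a higher Silhouette score indicates better-defined clusters.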
