Abstract
As traffic accidents continue to occur, traffic safety has become a major social concern. Many factors contribute to traffic accidents, including location, time period, the driver's emotional state, weather, and other uncertain and complex factors. As a result, the occurrence of traffic accidents is nonlinear, so the correlations in the data must be explored from many different aspects in order to avoid risk. We use the R language to analyze traffic data and graphics and to show how the data are related. After preprocessing and selecting the data, we use the remapB and remapH functions of the R Remap package to plot the accident locations and an accident heat map, from which high-frequency accident locations can be identified. We then model the data with decision tree, linear regression, and random forest algorithms. By comparing the predictions with the actual results, we verify the correctness of each model and identify the most accurate one, which can then be used to make predictions on similar data in the future. The ultimate goal of the analysis is to choose the most accurate model after validating the models and analyzing both the characteristics of the data and the relationship between each model and the data.
Highlights
At present, China's national economy is developing rapidly
The text-type accident location in the original data is merged with the site's geographical coordinates (latitude and longitude) to form a new data frame, which is used as the database
A random forest consists of many decision trees, and the individual trees in the forest are not associated with one another
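The merge of text locations with coordinates described in the highlights can be sketched as follows. This is a minimal illustration in plain Python rather than the paper's R code; the records, site names, coordinates, and the helper `merge_with_coordinates` are all hypothetical and invented for the example.

```python
from collections import Counter

# Hypothetical accident records with a text location field (invented data).
accidents = [
    {"id": 1, "location": "Site A", "severity": "minor"},
    {"id": 2, "location": "Site B", "severity": "serious"},
    {"id": 3, "location": "Site A", "severity": "minor"},
    {"id": 4, "location": "Site C", "severity": "minor"},
]

# Hypothetical lookup table: location name -> (longitude, latitude).
coordinates = {
    "Site A": (116.40, 39.90),
    "Site B": (121.47, 31.23),
}


def merge_with_coordinates(records, coords):
    """Left-join records with coordinates; unmatched locations get None."""
    merged = []
    for rec in records:
        lon, lat = coords.get(rec["location"], (None, None))
        merged.append({**rec, "lon": lon, "lat": lat})
    return merged


merged = merge_with_coordinates(accidents, coordinates)

# Keep only rows that could be geocoded; these are the rows a map can plot.
located = [row for row in merged if row["lon"] is not None]

# Accident counts per location, the kind of weight a heat map would use.
counts = Counter(row["location"] for row in located)
```

Keeping unmatched rows with empty coordinates, instead of dropping them during the merge, makes it easy to see which location strings still need to be geocoded.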
Summary
China's national economy is developing rapidly, and motor vehicle ownership, the number of drivers, and road traffic flow continue to rise. The development of road traffic has brought great convenience, economic benefits, and social prosperity to society. At the same time, traffic safety has become a key factor affecting the safety of people's lives and property and constraining the benefits of social and economic development. In 2011, after the ban on drunk driving, national car ownership in China was 78 million, there were 210,812 road traffic accidents, and the death toll was as high as 62,387. By comparison, Japan's car ownership was more than 7,000 and its number of traffic accidents was up to three times that of China, while its death toll was only 4,611. There is still a huge gap between China and developed countries in traffic safety. Because traffic accident data must be processed quickly, kept timely, and made more accurate, the processing itself needs to be time-efficient. The predictions of each model are compared with the actual results, and a confusion matrix is given to compare the accuracy of each model together with its kappa value (a measure of agreement between observers, here between predicted and actual classes).
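As an illustration of the evaluation step, the sketch below builds a confusion matrix and computes accuracy and Cohen's kappa from it. It is a plain-Python sketch under stated assumptions, not the paper's R code: the class labels and the actual and predicted values are invented, and the function names are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    Kappa is undefined when chance agreement equals 1 (a single class only);
    this sketch does not handle that case.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Invented example: 10 accidents classified as "minor" or "serious".
labels = ["minor", "serious"]
actual = ["minor"] * 6 + ["serious"] * 4
predicted = ["minor"] * 5 + ["serious"] + ["minor"] + ["serious"] * 3

matrix = confusion_matrix(actual, predicted, labels)
accuracy, kappa = accuracy_and_kappa(matrix)
```

For this invented data the matrix is [[5, 1], [1, 3]], giving an accuracy of 0.8 and a kappa of about 0.58. Kappa is lower than accuracy because it discounts the agreement that would be expected by chance given the class frequencies, which is why it is useful alongside accuracy when comparing models.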