Abstract
Background: Internet-based big data may offer important and timely information on road traffic injuries, supplementing official government statistics. We developed computer-based approaches to define, extract and automatically collect internet-based Chinese-language big data on road traffic injuries.

Methods: Based on injury prevention matrices and ICD-10, we established a thesaurus set and an analysis framework for data extraction. A dilated convolutional neural network classifier was developed to filter eligible news stories, based on 10,000 researcher-annotated news sources, and algorithms were built to extract information on the relevant variables. Word frequencies were computed using Jieba, a Python Chinese word segmentation module. Pearson correlation coefficients were used to examine the relationship between internet-based big data and official statistics.

Results: A total of 650,140 media reports were captured from 27 Chinese news websites, of which 92,813 were filtered as eligible reports (accuracy = 86%). Searches captured information about 71,829 traffic crashes between January 2013 and September 2019. The words ‘crash’, ‘vehicle’ and ‘scene’ were the most frequently used in the stories. Our results revealed characteristics that official statistics did not cover, such as changes in travel patterns among the elderly. The number of media-reported crashes was highly correlated with official statistics (r = 0.84, p = 0.035).

Conclusion: Internet-based big data offers information about traffic crashes that can supplement official government statistics and aid road traffic injury prevention strategies. Extending the approach to countries where government data and statistics are unreliable, but news reporting is reliable, is particularly appealing.

Learning Outcomes: Internet-based big data can supplement existing road traffic injury data sources and guide prevention efforts.
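The comparison between media-reported crash counts and official statistics rests on the Pearson correlation coefficient. The following is a minimal sketch of that calculation, not the authors' code: the function name `pearson_r` is hypothetical, and the yearly counts are invented purely for illustration and are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each series
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical yearly counts (made up for this example, NOT study data)
media_counts = [10, 12, 15, 14, 18, 20]
official_counts = [100, 115, 140, 150, 170, 210]

r = pearson_r(media_counts, official_counts)
print(f"r = {r:.2f}")
```

With only a handful of yearly data points, as in this sketch, the coefficient is sensitive to individual years, so the accompanying p-value matters as much as r itself.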