Abstract
With globalization and the surge in the movement of people, diseases are spreading more rapidly than ever, and the risk of sporadic contamination is higher than before. Disease warnings continue to rely on censored data, but these warning systems have failed to keep pace with the speed of disease proliferation. Because of these risks, there have been many studies on disease outbreak surveillance systems, but existing systems are limited in their monitoring of disease-related topics and in their internationalization. With the advent of online news, social media, and search engines, social and web data contain rich, unexplored information that can be leveraged to provide accurate, timely information on disease activity and risk. In this study, we develop an infectious disease surveillance system that extracts information related to emerging diseases from a variety of Internet-sourced data. We also propose an effective deep learning-based data filtering and ranking algorithm. The system provides nation-specific disease outbreak information, disease-related topic rankings, and the number of reports per district and disease through various visualization techniques such as maps, graphs, charts, correlation coefficients, and word clouds. Our system is an automated web-based service that is free for all users and currently in live operation.
Highlights
Humans have suffered from various infectious diseases such as COVID-19, SARS, Ebola, and MERS
We propose and implement an effective deep-learning-based ranking algorithm referred to as WBiLSTM-Term Frequency-Inverse Document Frequency (TF-IDF), which extracts a list of important words from Internet-sourced data
The first interface presents data based on Bing news, the second interface displays data based on tweets, the third interface provides data based on Google search, and the fourth interface compares the three types of data
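The TF-IDF component named in the highlights can be illustrated with a minimal sketch. This is plain TF-IDF only, not the WBiLSTM-TF-IDF algorithm proposed in the paper, whose details are not given here; the function name `tfidf_rank`, the `top_k` parameter, the tie-breaking rule, and the sample documents are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tfidf_rank(documents, top_k=5):
    """Rank words by their TF-IDF score summed over all documents.

    `documents` is a list of token lists. This is an illustrative
    baseline, not the paper's WBiLSTM-TF-IDF method.
    """
    n_docs = len(documents)

    # Document frequency: in how many documents each word appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in documents:
        if not tokens:
            continue  # skip empty documents to avoid division by zero
        counts = Counter(tokens)
        for word, count in counts.items():
            tf = count / len(tokens)                 # term frequency
            idf = math.log(n_docs / doc_freq[word])  # inverse document frequency
            scores[word] += tf * idf

    # Highest score first; ties are broken alphabetically (an assumption).
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical sample data, already tokenized.
sample = [
    ["ebola", "outbreak", "reported"],
    ["mers", "outbreak", "reported"],
    ["ebola", "cases", "rise"],
]
for word, score in tfidf_rank(sample, top_k=3):
    print(word, round(score, 3))
```

Words that occur in every document receive a score of zero under this weighting, so common terms drop to the bottom of the ranking while terms concentrated in a few reports rise to the top.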
Summary
Humans have suffered from various infectious diseases such as COVID-19, SARS, Ebola, and MERS. Even with the remarkable development of medicine and significant leaps in vaccine development, it is becoming almost impossible to treat all newly detected diseases, and the intuitive solution seems to be the prevention and containment of epidemic diseases through effective early warnings. Such prevention can be achieved by leveraging all available data, both structured, formal data and unstructured, noisy data, to build robust systems that help prevent the spread of diseases through accurate monitoring and public awareness. CDCs surveil disease outbreaks, but they require a lead time of more than one week to collect and produce disease outbreak statistics, which makes it difficult to respond instantly to new disease outbreaks.