Abstract

With the increase in traffic accident rates, traffic risk detection is becoming increasingly important. Moreover, it is necessary to provide appropriate traffic information that considers user locations and routes, and to design an analysis method accordingly. This paper proposes a word-embedding-based traffic document classification model for detecting emerging risks using a quantity termed sentiment similarity weight (SSW). The proposed method detects emerging risks by considering the importance and polarity of keywords in traffic documents and classifying the documents accordingly. Conventional sentiment analysis methods fail to utilize semantically significant keywords unless they are included in a sentiment dictionary. In this study, the proposed method overcomes this disadvantage of sentiment dictionaries through word imputation using an established similarity dictionary, which widens the limited utilization range. The proposed method is evaluated through three tests. In the first test, the similarity between keywords is measured to evaluate model accuracy. In the second test, three classifiers for emerging risk classification are compared. In the last test, emerging risk detection is assessed with and without the proposed SSW to verify its effectiveness. The evaluation results demonstrate that the proposed traffic-related document classification model using the SSW has an F-measure of 0.907, indicating satisfactory performance. Therefore, the proposed SSW can be effectively used as a parameter in traffic-related document classification and enables the detection of emerging risks.

Highlights

  • The development of transportation means positively influences everyday life in several respects, such as shortening travel time and overcoming the limitations of distance travel

  • To obtain the proposed sentiment similarity weight (SSW) based on a sentiment dictionary, the model uses the polarity, term frequency–inverse document frequency (TF–IDF), and similarity values of words

  • If a word is not found in the sentiment dictionary, it is replaced by a word with the highest similarity using the similarity dictionary
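The two highlights above on the SSW and on word imputation can be illustrated with a minimal Python sketch. It is not the authors' implementation: this summary does not state how polarity, TF–IDF, and similarity are combined, so the product used in `ssw` is an assumption made only for illustration. The function names (`tf_idf`, `impute`, `ssw`), the dictionary layouts, and the toy values are likewise hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses relative term frequency and idf = ln(N / df); other TF-IDF
    variants exist, and the paper's exact variant is not given here.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def impute(word, sentiment_dict, similarity_dict):
    """Return (polarity, similarity) for a word.

    If the word is missing from the sentiment dictionary, it is replaced
    by the most similar word that the sentiment dictionary does contain.
    """
    if word in sentiment_dict:
        return sentiment_dict[word], 1.0
    candidates = [
        (sim, other)
        for other, sim in similarity_dict.get(word, {}).items()
        if other in sentiment_dict
    ]
    if not candidates:
        return 0.0, 0.0  # no usable substitute: the word contributes nothing
    sim, best = max(candidates)
    return sentiment_dict[best], sim


def ssw(word, tfidf_weight, sentiment_dict, similarity_dict):
    """ASSUMPTION: combine the three quantities as a simple product."""
    polarity, similarity = impute(word, sentiment_dict, similarity_dict)
    return polarity * tfidf_weight * similarity


# Hypothetical toy data, not taken from the paper.
sentiment_dict = {"accident": -1.0, "safe": 1.0}
similarity_dict = {"collision": {"accident": 0.8, "safe": 0.1}}
docs = [["collision", "highway"], ["safe", "highway"]]

weights = tf_idf(docs)
score = ssw("collision", weights[0]["collision"], sentiment_dict, similarity_dict)
print(score)  # negative: "collision" borrows the polarity of "accident"
```

In this sketch, "collision" is absent from the sentiment dictionary, so it inherits the polarity of "accident", scaled by their similarity. In practice the similarity dictionary would presumably be built from word embeddings rather than written by hand.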

Summary

INTRODUCTION

The development of transportation means positively influences everyday life in several respects, such as shortening travel time and overcoming the limitations of distance travel.

Kim et al.: Word-Embedding-Based Traffic Document Classification Model for Detecting Emerging Risks Using SSW

[...] a text into quantifiable and objective information that can be analyzed. This paper proposes a word-embedding-based traffic-related document classification model for detecting emerging risks using a quantity termed sentiment similarity weight (SSW). Seyed Mahdi Rezaeinia [24] proposed an improved word embedding method based on a pre-trained sentiment dictionary by applying sentiment analysis. It extracts vectors from a text corpus according to Word2vec/GloVe, a word position algorithm, a vocabulary-based approach, and morphological analysis.

WORD-EMBEDDING-BASED TRAFFIC-RELATED DOCUMENT CLASSIFICATION MODEL FOR DETECTING EMERGING RISKS USING SENTIMENT SIMILARITY WEIGHT

Even though it is possible to obtain information from text data, it is difficult to recognize and predict risks based on these data. A classification model using the SSW is applied to a user interface system.

DATA COLLECTION AND KEYWORD EXTRACTION USING TF–IDF
RESULTS AND PERFORMANCE
CONCLUSION
