Abstract

Affective analysis of social media text is in great demand. Online text written in Chinese communities often contains mixed scripts: the major text is written in Chinese, an ideograph-based writing system, and the minor text uses Latin letters, an alphabet-based writing system. This phenomenon is referred to as writing systems changes (WSCs). Past studies have shown that WSCs often reflect unfiltered, immediate affect. However, WSCs pose additional challenges for Natural Language Processing tasks because they can break the syntax of the major text. In this work, we use WSCs as an effective feature in a hybrid deep learning model with an attention network. The WSC scripts are first identified by their encoding range. Then, the document representation of the text is learned through a Long Short-Term Memory model, and the minor text is learned by a separate Convolutional Neural Network model. To further highlight the WSC components, an attention mechanism is adopted to re-weight the feature vector before the classification layer. Experiments show that the proposed hybrid deep learning method, which better incorporates WSC features, can further improve performance compared to state-of-the-art classification models. The experimental results indicate that WSCs can serve as effective information in affective analysis of social media text.

Highlights

  • In social media, text is becoming increasingly important due to its effectiveness in disseminating information in highly individualized and opinionated contexts

  • This paper presents a hybrid deep learning model with attention network for affective analysis in the context of writing system changes

  • We argue that Writing Systems Changes (WSCs) text is potentially informative and that a learning model needs to be designed to capture this additional information in deep-learning-based models for emotion classification

Summary

Introduction

Text is becoming increasingly important due to its effectiveness in disseminating information in highly individualized and opinionated contexts. The minor text can be written in English (as shown in E1), Pinyin (a phonetic notation for Chinese, shown in short form in E2), or other new Internet notations that use Roman characters from a Latin-based writing system, as well as other symbolic expressions such as emoji (as shown in E3). This phenomenon of using mixed scripts from different writing systems is known as Writing Systems Changes (WSCs). The alternation between different writing systems is relatively common on real-time platforms such as microblogs in China. This feature offers reliable clues for affective analysis.

Definition of WSCs
Types of WSCs in Chinese
Our approach
Related work
Hybrid neural model with attention network
Task definition
WSC identification
The hybrid neural network structure
Objective functions
Performance evaluation
Datasets
Baseline systems and performance measures
Affective analysis
Writing system investigation
Parameter tuning
Visualization and case study
Wuli super junior
Findings
Conclusion and future work