Abstract

Sensitive word recognition is important for protecting private enterprise data. In electric power customer service systems, the dialogue texts that record conversations between electric power customers and customer service staff contain sensitive customer information. However, the colloquialisms and synonyms in these texts often make sensitive information harder to recognize. In this paper, we propose an out-of-vocabulary (OOV) approach for recognizing sensitive words in the dialogue texts of electric power customer services. To recognize sensitive OOV words, we combine semantic similarity based on word embeddings with structural semantic similarity based on HowNet. Experimental results show that our method achieves higher recognition accuracy than popular approaches.
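To make the idea of combining the two similarity measures concrete, the following is a minimal sketch and not the paper's implementation. It assumes that each word comes with an embedding vector and a set of HowNet-style sememes, that the structural similarity can be approximated by the Jaccard overlap of sememe sets (a stand-in for a real HowNet similarity), and that the two scores are mixed linearly. The function names, the weight `alpha`, and the `threshold` are all hypothetical; the abstract does not specify them.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def sememe_overlap(sememes_a, sememes_b):
    """Jaccard overlap of two sememe sets.

    Assumption: a simple stand-in for a HowNet-based structural
    similarity, which the abstract does not define.
    """
    a, b = set(sememes_a), set(sememes_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined_similarity(emb_a, emb_b, sem_a, sem_b, alpha=0.5):
    """Linear mix of the two scores; alpha is a hypothetical weight."""
    return (alpha * cosine_similarity(emb_a, emb_b)
            + (1.0 - alpha) * sememe_overlap(sem_a, sem_b))


def is_sensitive(candidate, lexicon, alpha=0.5, threshold=0.7):
    """Flag an OOV candidate if it is close enough to any known sensitive word.

    Each word is a dict with keys "vec" and "sememes" (hypothetical layout).
    """
    return any(
        combined_similarity(candidate["vec"], entry["vec"],
                            candidate["sememes"], entry["sememes"],
                            alpha) >= threshold
        for entry in lexicon
    )


# Toy data, invented purely for illustration.
lexicon = [{"vec": [1.0, 0.0], "sememes": {"location", "house"}}]
similar_word = {"vec": [0.9, 0.1], "sememes": {"location", "house"}}
unrelated_word = {"vec": [0.0, 1.0], "sememes": {"weather"}}

similar_flag = is_sensitive(similar_word, lexicon)
unrelated_flag = is_sensitive(unrelated_word, lexicon)
```

In this sketch an OOV word is flagged when its combined score against any entry of a known sensitive-word lexicon reaches the threshold. In practice the weight and threshold would need tuning on labeled dialogue data.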
