Abstract

In customer service, the speech and text records generated by users often contain a wealth of product information. Identifying user intent, mining frequently raised issues, and analyzing the relationships among user needs in massive data are basic tasks for making customer service operation and maintenance more intelligent, and an annotated question corpus is a prerequisite for training machines to understand users' information needs. Taking the offline bidding tool service item on the E-commerce platform of the State Grid ICT system as an example, this paper develops a multi-level, multi-label question category annotation strategy based on the functional modules of the ICT system, in contrast to annotation with a single label, and builds a corresponding annotated corpus. Using this strategy, 700 customer service speech records about the offline bidding tool were annotated, yielding 911 questions that cover 68 question types. The annotation achieved acceptable inter-annotator agreement, which helps ensure corpus quality. Furthermore, the distribution of the annotated labels and the relationships among them are examined using descriptive statistics and a social network map.
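
As an illustration of how label distribution and label relationships might be derived from a multi-label corpus, the following minimal sketch counts label frequencies and pairwise label co-occurrences, which could serve as node and edge weights for a network map. The record structure, the `module/issue` label names, and the function names are all hypothetical; they are not taken from the paper's corpus or its annotation scheme.

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotated records: each speech record carries one or more
# multi-level labels written as "module/issue". These names are invented
# for illustration and do not come from the paper's corpus.
records = [
    {"id": 1, "labels": ["install/download", "login/certificate"]},
    {"id": 2, "labels": ["login/certificate"]},
    {"id": 3, "labels": ["install/download", "login/certificate", "upload/file-format"]},
    {"id": 4, "labels": ["upload/file-format"]},
]


def label_frequencies(records):
    """Count how many records carry each label (descriptive statistics)."""
    counts = Counter()
    for record in records:
        # set() prevents a label repeated within one record from counting twice.
        counts.update(set(record["labels"]))
    return counts


def cooccurrence_edges(records):
    """Count how often each unordered pair of labels appears in the same record."""
    edges = Counter()
    for record in records:
        # Sorting gives each pair a canonical order, so (a, b) and (b, a) merge.
        for a, b in combinations(sorted(set(record["labels"])), 2):
            edges[(a, b)] += 1
    return edges


frequencies = label_frequencies(records)
edges = cooccurrence_edges(records)
```

Here `frequencies` gives a per-label count, and `edges` gives weighted pairs that a graph library could render as a co-occurrence network. Whether the paper builds its network this way is an assumption.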
