User portraits have become a research hotspot in the knowledge graph field in recent years, and the quality of a user portrait depends directly on how reasonably its tags are extracted. However, most current tag extraction methods for portraits rely only on word-frequency statistics and semantic clustering. These methods have several drawbacks: they cannot effectively discover the preferred themes of an enterprise, dynamically update the portrait tags, or adapt to the needs of the enterprise. In this paper, we propose an enterprise adaptive tag extraction method based on multi-feature dynamic portrait (ATEMDP). In preference division, ATEMDP first uses K-means to measure the similarity between enterprise texts and recasts the clustering of similar enterprise texts as the clustering of tag features, yielding a point-cluster structure that contains the distribution of tag preference topics. Next, in multi-feature selection, a professional domain thesaurus is introduced for feature expansion, and the topic texts are fed into a BERT model as a sample set to discover the latent features of the enterprise text. Finally, in dynamic tag extraction, BiLSTM and CNN are used to extract features, and dynamic preference tags are obtained as the enterprise text is updated. The THUCNews and Ente-pku data sets are used for simulation, and seven other methods are included for comparison. The experimental results indicate that ATEMDP not only outperforms the other conventional methods in accuracy and F1-score, but also effectively solves the dynamic tagging problem of enterprise portraits.
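The preference-division step can be illustrated with a minimal sketch. This is not the authors' implementation: the term-frequency representation, the fixed seed documents, the toy texts, and the helper names (`tf_vectors`, `kmeans`, `top_terms`) are all assumptions made for illustration. It only shows the general idea of clustering texts with K-means and then reading the heaviest terms of each centroid as a candidate tag cluster.

```python
import math
from collections import Counter


def tf_vectors(docs):
    """Turn whitespace-tokenized texts into L2-normalized term-frequency vectors."""
    vocab = sorted({tok for doc in docs for tok in doc.lower().split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(doc.lower().split()).items():
            vec[index[tok]] = float(count)
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, seed_indices, max_iter=20):
    """Plain Lloyd's algorithm. Seeds are chosen by the caller (an assumption
    that keeps this sketch deterministic)."""
    centroids = [list(vectors[i]) for i in seed_indices]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(vocab, centroid, n=2):
    """Highest-weight terms of a centroid; ties are broken alphabetically."""
    ranked = sorted(range(len(vocab)), key=lambda i: (-centroid[i], vocab[i]))
    return [vocab[i] for i in ranked[:n]]


# Hypothetical toy "enterprise texts", invented for this example.
docs = [
    "cloud storage server cloud",
    "server cloud network",
    "loan bank credit loan",
    "bank credit interest",
]
vocab, vectors = tf_vectors(docs)
labels, centroids = kmeans(vectors, seed_indices=[0, 2])
tag_clusters = [top_terms(vocab, c) for c in centroids]
print(labels)
print(tag_clusters)
```

In a realistic setting the term-frequency vectors would likely be replaced by richer features, such as the thesaurus-expanded and BERT-derived features the abstract mentions, and the number of clusters would need to be chosen from the data.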