Abstract
Purpose To address annotation noise, ambiguity recognition and nested entity recognition in the Chinese furniture domain, this paper designs a new recognition model, ALE-BiLSTM-CRF.
Design/methodology/approach In the named entity recognition (NER) task for the Chinese furniture domain, text characters are relatively independent and individually provide limited information. To address this, a model named ALE-BiLSTM-CRF is proposed. First, the ERNIE pre-trained model transforms the text into dynamic vectors that integrate contextual information, and adversarial learning is used to generate adversarial samples that enhance the robustness of the model. Next, the BiLSTM module captures the temporal information of the context, and the multi-head attention mechanism integrates long-distance semantic features into the character vectors. Finally, a CRF layer learns the constraints between labels, enabling the model to generate more reasonable and semantically consistent label sequences. The model is compared with mainstream models on the Weibo data set, and comparative and ablation experiments are conducted on a self-constructed data set in the Chinese furniture field.
Findings In comparative experiments with mainstream models on the Weibo data set, the model achieves an F1 score of 75.52%, demonstrating its generality and robustness. In comparative and ablation experiments on the self-constructed furniture data set, it achieves an F1 score of 89.62%, verifying its superiority and effectiveness.
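The role of the CRF layer in the design described above can be illustrated with a small decoding sketch. It is not the paper's implementation: the label set, the emission scores and the hard-coded transition penalties are all assumptions chosen for illustration, and a trained CRF would learn its transition scores from data instead of using fixed rules. The sketch only shows how scoring label transitions lets decoding reject sequences that per-character prediction would accept.

```python
# Illustrative only: labels, scores and the hard-constraint transition table
# are assumptions, not values or code from the paper.
LABELS = ["O", "B-FUR", "I-FUR"]
NEG = -1e4  # large penalty standing in for a learned "forbidden" transition


def transition(prev: str, curr: str) -> float:
    """Score of moving from label `prev` to label `curr` under BIO rules."""
    # An I- tag may only follow a B- or I- tag of the same entity type.
    if curr.startswith("I-") and prev not in ("B-" + curr[2:], curr):
        return NEG
    return 0.0


def viterbi(emissions: list[dict[str, float]]) -> list[str]:
    """Return the highest-scoring label path for per-character emission scores."""
    # An I- tag cannot start a sequence, so it is penalised at position 0.
    best = {
        l: emissions[0][l] + (NEG if l.startswith("I-") else 0.0) for l in LABELS
    }
    backptrs = []
    for scores in emissions[1:]:
        new_best, ptr = {}, {}
        for curr in LABELS:
            prev = max(LABELS, key=lambda p: best[p] + transition(p, curr))
            new_best[curr] = best[prev] + transition(prev, curr) + scores[curr]
            ptr[curr] = prev
        best = new_best
        backptrs.append(ptr)
    # Trace back from the best final label.
    label = max(LABELS, key=lambda l: best[l])
    path = [label]
    for ptr in reversed(backptrs):
        label = ptr[label]
        path.append(label)
    return path[::-1]


# Hypothetical emission scores for a three-character input.
emissions = [
    {"O": 2.0, "B-FUR": 0.5, "I-FUR": 1.0},
    {"O": 0.2, "B-FUR": 0.4, "I-FUR": 1.5},
    {"O": 0.1, "B-FUR": 0.3, "I-FUR": 1.2},
]
greedy = [max(LABELS, key=lambda l: e[l]) for e in emissions]
decoded = viterbi(emissions)
print(greedy)   # per-character argmax, ignores label constraints
print(decoded)  # constrained decoding
```

With these made-up scores, the per-character argmax yields `O, I-FUR, I-FUR`, where an inside tag follows `O`, while the constrained decoding yields `O, B-FUR, I-FUR`.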
Research limitations/implications The universality and generalization of the model are demonstrated through comparative experiments with mainstream models on the Weibo data set. Comparative experiments with representative pre-trained models on the furniture data set, together with ablation experiments on the model itself, further demonstrate its superiority and effectiveness.
Practical implications In the furniture domain, NER aims to use various methods, including rule templates, machine learning and deep learning techniques, to extract structured information related to furniture from unstructured text. This information may include the name, material, brand, style and function of the furniture. By extracting and identifying these named entities, this work can provide more accurate data support for furniture design, manufacturing and marketing, thereby promoting further development and innovation in the furniture industry.
Social implications In the furniture field, NER faces special challenges that differ from those of entity recognition in general fields. Furniture terminology is often highly specialized and complex in structure. Furniture text may also contain a large number of nested entities. For example, the furniture name “sofa bed” contains two entities, “sofa” and “bed.” Current sequence labeling methods often struggle to recognize such nested entity structures simultaneously. Additionally, because furniture terminology and descriptions may change with trends and design styles, the model also needs a degree of adaptability and the ability to be updated. These factors make information extraction in the furniture field more difficult and pose substantial challenges for NER.
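The “sofa bed” example can be made concrete with a short sketch of why flat sequence labeling struggles with nested entities. The entity type `NAME`, the word-level tokenization and the helper `to_bio` are hypothetical and are not taken from the paper. The point is only that a BIO scheme assigns exactly one tag per token, so overlapping spans cannot all be encoded.

```python
# Illustrative only: the entity type NAME, the tokenization and the helper
# to_bio are assumptions made for this example.
def to_bio(tokens, spans):
    """Encode (start, end, type) spans as flat BIO tags.

    Returns the tag list and the spans that could not be encoded because
    their token slots were already occupied by another entity.
    """
    tags = ["O"] * len(tokens)
    dropped = []
    # Place longer spans first so the outermost entity wins.
    for start, end, etype in sorted(spans, key=lambda s: s[0] - s[1]):
        if any(tags[i] != "O" for i in range(start, end)):
            dropped.append((start, end, etype))  # slot already taken
            continue
        tags[start] = "B-" + etype
        for i in range(start + 1, end):
            tags[i] = "I-" + etype
    return tags, dropped


tokens = ["sofa", "bed"]
# The outer entity "sofa bed" and the two inner entities "sofa" and "bed".
spans = [(0, 2, "NAME"), (0, 1, "NAME"), (1, 2, "NAME")]
tags, dropped = to_bio(tokens, spans)
print(tags)     # one tag per token: only the outer entity survives
print(dropped)  # the inner entities that the flat scheme cannot represent
```

Only the outer span fits into the single tag sequence. Both inner spans are dropped, which is the limitation the passage attributes to current sequence labeling methods.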
Originality/value The proposed ALE-BiLSTM-CRF model achieves an F1 score of 75.52% in comparative experiments with mainstream models on the Weibo data set, demonstrating its generality and robustness. In comparative and ablation experiments on a self-constructed data set in the Chinese furniture field, it achieves an F1 score of 89.62%, verifying its superiority and effectiveness.