Abstract

As the number of books grows, traditional library data management methods for book verification and classification become increasingly inefficient and costly. To address the book classification problem, this paper proposes ERBERT-HMATT, an automatic classification model for book data features. First, BERT pre-training is improved by adding masking at the word and entity levels. Second, the network structure of the model is designed based on HMCN. Third, a multi-label attention mechanism is introduced in the initial feature extraction module; it assigns different weights to the words of the input text so that the model attends more closely to the text features. The model is also subjected to recurrent learning, which enhances its robustness by adding fine-grained knowledge. Finally, the classification performance of three algorithms, KNN, SVM, and ERBERT-HMATT, is tested on the same dataset. According to the results, the accuracy of ERBERT-HMATT is 0.1% higher than that of KNN. Classifying 300 book records takes less than 100 ms, which is significantly less than the processing time of SVM and KNN. The paper also finds that the subject word field in the book information has a large positive effect on classification: it improves classification accuracy by 0.09 compared with a model that uses only the title field. The test results indicate that the method improves the classification of book data to a certain extent.
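The abstract does not spell out how the multi-label attention mechanism is computed. As a rough illustration of the general idea only (one attention distribution over the input words per label), the following sketch uses plain dot-product scoring. The function names, the toy vectors, and the scoring function are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(token_vecs, label_queries):
    """Hypothetical label-wise attention (not the paper's exact formulation).

    token_vecs:    one feature vector per input word
    label_queries: one query vector per label (assumed to be learned)
    Returns the per-label word weights and the per-label weighted sums.
    """
    dim = len(token_vecs[0])
    weights, reprs = [], []
    for q in label_queries:
        # Dot-product score between this label's query and every word.
        scores = [sum(qi * ti for qi, ti in zip(q, t)) for t in token_vecs]
        w = softmax(scores)
        # Label-specific text representation: attention-weighted sum of words.
        rep = [sum(w[j] * token_vecs[j][k] for j in range(len(token_vecs)))
               for k in range(dim)]
        weights.append(w)
        reprs.append(rep)
    return weights, reprs


# Toy input: three "words" with 2-dimensional features and two labels.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
queries = [[2.0, 0.0], [0.0, 2.0]]
weights, reprs = label_attention(tokens, queries)
```

In this toy setup each label places the most weight on the word whose features align with its query, which is the sense in which different words receive different weights per label.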
