Abstract
No domain dictionary currently exists for electric vehicle disassembly, and dictionary construction algorithms from other domains do not accurately extract terminology from disassembly text because the terminology is complex and variable. Constructing a domain dictionary for electric vehicle disassembly is therefore of considerable research significance. Extracting high-quality keywords from text and categorizing them is widely used in information mining and is the basis of named entity recognition, relation extraction, knowledge question answering, and other information recognition and extraction tasks in the disassembly domain. In this paper, we propose a supervised learning dictionary construction algorithm based on multi-dimensional features, which combines different features to extract candidate keywords from the text of each scientific study. Keyword recognition is treated as a binary classification problem in which the LightGBM model filters each candidate keyword; the domain dictionary is then expanded based on the pointwise mutual information (PMI) value between keywords and their category. We use Chinese disassembly manuals, patents, and papers to build a general corpus of disassembly information, and then use our model to mine keywords for disassembly parts, disassembly tools, disassembly methods, disassembly processes, and other categories. The experiments show that our algorithm significantly improves extraction and categorization performance over traditional algorithms in the disassembly domain. We also investigate the performance of the algorithms and attempt to explain it. Our work sets a benchmark for domain dictionary construction in the field of electric vehicle disassembly, based on a newly developed dataset and multi-class terminology classification.
Highlights
The rapid growth of the electric vehicle market worldwide makes efficient management of obsolete lithium-ion battery packs essential once they reach the end of their service life
Candidate keywords are extracted based on multi-dimensional features and classified with LightGBM, and the dictionary is then extended automatically using pointwise mutual information (PMI)
The extraction accuracy of PMI conforms to the distribution of keywords in the text, which means that our model will find more keywords when it is applied to more text
Summary
The rapid growth of the electric vehicle market worldwide makes efficient management of obsolete lithium-ion battery packs essential once they reach the end of their service life. Robot learning from disassembly text is an important research topic that can help robots take over the dismantling of electric vehicles, which is currently done manually. To extract keywords from text effectively and classify them into a domain dictionary, we adopt a supervised learning algorithm for keyword extraction that is based on multi-dimensional features of constructed candidate keywords. The extraction of disassembly domain keywords is transformed into a machine learning binary classification problem that uses disassembly domain keywords and the multi-dimensional features of the constructed candidate keywords. The method is based on the LightGBM classification model, which determines whether a candidate keyword is a keyword. The correlation between the keywords in each dictionary is measured by calculating the PMI, and keywords with high correlation are added to the domain dictionary.
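The PMI-based expansion step can be illustrated with a minimal sketch. It is not the authors' implementation: the sentence-level co-occurrence counting, the toy English corpus, the function names `pmi` and `expand_dictionary`, and the threshold value are all assumptions made for illustration. It uses the standard definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))).

```python
import math

def pmi(sentences, term_a, term_b):
    """Base-2 PMI of two terms from sentence-level co-occurrence.

    Assumption: each sentence is a set of tokens, and probabilities are
    estimated as the fraction of sentences containing the term(s).
    Returns None when the terms never co-occur (PMI is undefined).
    """
    n = len(sentences)
    count_a = sum(1 for s in sentences if term_a in s)
    count_b = sum(1 for s in sentences if term_b in s)
    count_ab = sum(1 for s in sentences if term_a in s and term_b in s)
    if n == 0 or count_ab == 0:
        return None
    return math.log2((count_ab / n) / ((count_a / n) * (count_b / n)))

def expand_dictionary(sentences, seeds, candidates, threshold=0.5):
    """Add a candidate when its best PMI with any seed keyword exceeds
    the threshold (hypothetical acceptance rule and threshold)."""
    expanded = set(seeds)
    for cand in candidates:
        scores = [pmi(sentences, cand, seed) for seed in seeds]
        scores = [s for s in scores if s is not None]
        if scores and max(scores) > threshold:
            expanded.add(cand)
    return expanded

# Toy corpus standing in for tokenized disassembly text (illustrative only).
sentences = [
    {"wrench", "bolt", "remove"},
    {"wrench", "bolt"},
    {"screwdriver", "screw", "remove"},
    {"battery", "pack"},
]

tools = expand_dictionary(sentences, {"wrench"}, ["bolt", "battery", "remove"])
```

In this toy corpus only "bolt" co-occurs with the seed "wrench" more often than chance would predict, so it is the only candidate added. A real corpus would need smoothing or a minimum-frequency cutoff, since PMI is known to overrate rare terms.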