Assessing Scientific Text Similarity: A Novel Approach Utilizing Non-Negative Matrix Factorization and Bidirectional Encoder Representations from Transformer

Zhixuan Jia,Wenfang Tian,Wang Li,Kai Song,Fuxin Wang,Congjing Ran

doi:10.3390/math12213328

Abstract

The patent serves as a vital component of scientific text, and over time, escalating competition has generated a substantial demand for patent analysis encompassing areas such as company strategy and legal services, necessitating fast, accurate, and easily applicable similarity estimators. At present, conducting natural language processing(NLP) on patent content, including titles, abstracts, etc., can serve as an effective method for estimating similarity. However, the traditional NLP approach has some disadvantages, such as the requirement for a huge amount of labeled data and poor explanation of deep-learning-based model internals, exacerbated by the high compression of patent content. On the other hand, most knowledge-based deep learning models require a vast amount of additional analysis results as training variables in similarity estimation, which are limited due to human participation in the analysis part. Thus, in this research, addressing these challenges, we introduce a novel estimator to enhance the transparency of similarity estimation. This approach integrates a patent’s content with international patent classification (IPC), leveraging bidirectional encoder representations from transformers (BERT), and non-negative matrix factorization (NMF). By integrating these techniques, we aim to improve knowledge discovery transparency in NLP across various IPC dimensions and incorporate more background knowledge into context similarity estimation. The experimental results demonstrate that our model is reliable, explainable, highly accurate, and practically usable.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

R Discovery Prime

R Discovery Prime

Assessing Scientific Text Similarity: A Novel Approach Utilizing Non-Negative Matrix Factorization and Bidirectional Encoder Representations from Transformer

Abstract

Talk to us

Similar Papers

More From: Mathematics

Lead the way for us

Journal: Mathematics	Publication Date: Oct 23, 2024
License type: CC BY 4.0

Similar Papers

Bidirectional encoders to state-of-the-art: a review of BERT and its transformative impact on natural language processing
Rajesh Gupta
Информатика. Экономика. Управление - Informatics. Economics. Management | VOL. 3
Rajesh GuptaRajesh Gupta
02 Mar 2024
Информатика. Экономика. Управление - Informatics. Economics. Management | VOL. 3

Bert model fine-tuning for text classification in knee OA radiology reports
L Chen ... V Pedoia
Osteoarthritis and Cartilage | VOL. 28
L Chen, et. al.L Chen ... V Pedoia
01 Apr 2020
Osteoarthritis and Cartilage | VOL. 28

Classification of Fire Related Tweets on Twitter Using Bidirectional Encoder Representations from Transformers (BERT)
Jairus Mingua ... Dionis Padilla
-
Jairus Mingua, et. al.Jairus Mingua ... Dionis Padilla
28 Nov 2021
28 Nov 2021

Natural Language Processing with “More Than Words – BERT”
Saranlita Chotirat ... Phayung Meesad
-
Saranlita Chotirat, et. al.Saranlita Chotirat ... Phayung Meesad
01 Jan 2020
01 Jan 2020

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

Assessing Scientific Text Similarity: A Novel Approach Utilizing Non-Negative Matrix Factorization and Bidirectional Encoder Representations from Transformer

Abstract

Talk to us

Similar Papers

More From: Mathematics