Literature Classification and its Applications in Condensed Matter Physics and Materials Science by Natural Language Processing

Quansheng Wu (吴泉生), Sijia Tu (涂思佳), Siyuan Wu (吴思远), Hong Li (李泓), Ruijuan Xiao, Tiannian Zhu, Jie Yuan, Hongming Weng

doi:10.1088/1674-1056/ad3c30

Abstract

The exponential growth of literature is constraining researchers’ access to comprehensive information in related fields. While natural language processing (NLP) may offer an effective solution to literature classification, it remains hindered by the lack of labelled datasets. In this article, we introduce a novel method for generating literature classification models through semi-supervised learning, which can generate a labelled dataset iteratively with limited human input. We apply this method to train NLP models for classifying literature related to several research directions, namely battery, superconductor, topological material, and artificial intelligence (AI) in materials science. The trained NLP ‘battery’ model, applied to a larger dataset distinct from the training and testing datasets, achieves an F1 score of 0.738, which indicates the accuracy and reliability of this scheme. Furthermore, our approach demonstrates that even with insufficient data, the under-trained models from the first few cycles can identify relationships among different research fields and facilitate the discovery and understanding of interdisciplinary directions.
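The iterative labelling idea and the F1 metric mentioned in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the word-count classifier, the confidence `threshold`, and the function names `train`, `score`, `self_train`, and `f1_score` are all assumptions made for illustration, and the confidence threshold stands in for the limited human input the abstract refers to.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labelled):
    """Count, per word, how many on-topic and off-topic texts contain it."""
    pos, neg = Counter(), Counter()
    for text, is_on_topic in labelled:
        (pos if is_on_topic else neg).update(set(tokenize(text)))
    return pos, neg


def score(model, text):
    """Score in [-1, 1]; positive means on topic, 0 means no evidence."""
    pos, neg = model
    words = set(tokenize(text))
    total = sum(pos[w] - neg[w] for w in words)
    norm = sum(pos[w] + neg[w] for w in words)
    return total / norm if norm else 0.0


def self_train(seed, unlabelled, threshold=0.6, max_cycles=5):
    """Grow the labelled set by adding confident predictions each cycle."""
    labelled, pool = list(seed), list(unlabelled)
    for _ in range(max_cycles):
        model = train(labelled)
        confident, rest = [], []
        for text in pool:
            s = score(model, text)
            if abs(s) >= threshold:
                confident.append((text, s > 0))  # pseudo-label (assumption)
            else:
                rest.append(text)
        if not confident:
            break  # nothing new to learn from; stop iterating
        labelled.extend(confident)
        pool = rest
    return train(labelled), labelled, pool


def f1_score(y_true, y_pred):
    """F1 is the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for this example only.
seed = [
    ("lithium battery cathode", True),
    ("anode electrolyte battery", True),
    ("galaxy star formation", False),
    ("star cluster telescope", False),
]
unlabelled = [
    "lithium electrolyte interface",
    "telescope galaxy survey",
    "solid interface conductivity",
    "protein folding",
]
model, labelled, remaining = self_train(seed, unlabelled)
```

In this toy run, "solid interface conductivity" shares no words with the seed set and is only labelled in the second cycle, after "interface" has been learned from a text labelled in the first cycle. "protein folding" never gathers enough evidence and stays in the unlabelled pool.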


