Abstract
Owing to their unique combination of metallic and ceramic properties, MAX phases have attracted considerable attention. Selectively etching the A-site atoms can yield MXenes, which have a distinctive two-dimensional structure. Their extraordinary properties have brought MXenes to the forefront of research areas including electronics, photonics and catalysis. Developing novel synthesis strategies is therefore key to further progress on MAX phases and MXenes. Distilling insights from the scientific literature could accelerate the exploration of novel synthesis recipes; however, manually extracting scattered information from thousands of journal articles is laborious. In this study, we present an annotated corpus that incorporates domain knowledge about MAX/MXene synthesis processes, derived from the experimental sections of 110 papers on MAX/MXene research. Based on this corpus, we propose a baseline model, comprising named entity recognition (NER) and relation extraction (RE) components, that distills information about MAX/MXene synthesis conditions from the literature using pre-trained natural language processing (NLP) models. We also demonstrate the efficacy of the proposed pipeline, which stems from combining domain knowledge about MAX/MXene with machine learning: the entity recognition model with optimized settings detects entities with an F1 score of 0.8452, and the relation extraction model achieves an F1 score of 0.8476. We hope that this work will support future research and development of novel MAX phases and MXenes. In addition, the developed model could serve as a pre-trained model for MAX/MXene synthesis route extraction in future data augmentation.