Abstract
Language models are transforming materials-aware natural-language processing by enabling the extraction of dynamic, context-rich information from unstructured text, thus moving beyond the limitations of traditional information-extraction methods. Moreover, small language models are on the rise because some of them can outperform large language models (LLMs) on domain-specific question-answering tasks, especially in application areas that rely on a highly specialized vernacular, such as materials science. We therefore present MechBERT, a new class of language models for understanding mechanical stress and strain in materials, built on Bidirectional Encoder Representations from Transformers (BERT) architectures. We showcase four MechBERT models, all of which were pretrained on a corpus of documents that are textually rich in chemicals and their stress-strain properties and then fine-tuned for question answering. We evaluated the performance of our models on domain-specific as well as general English-language question-answering tasks and explored the influence of the size and type of BERT architecture on model performance. We find that our MechBERT models outperform BERT-based models of the same size and maintain relevance better than much larger BERT-based models on domain-specific question answering within the stress-strain engineering sector. These small language models also enable much faster processing and require only a small fraction of the pretraining data needed by LLMs, affording them greater operational efficiency and energy sustainability.
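As an illustration of the kind of downstream use the abstract describes, the following is a minimal sketch of extractive question answering with a BERT-style checkpoint via the Hugging Face `transformers` pipeline API. The model identifier is a placeholder, not the authors' released checkpoint name, and the example question and context are invented for demonstration.

```python
# Minimal sketch: domain-specific extractive question answering with a
# BERT-style model, using the Hugging Face transformers pipeline API.
from transformers import pipeline

# "path/to/mechbert-qa" is a placeholder; substitute the actual fine-tuned
# question-answering checkpoint you want to evaluate.
qa = pipeline("question-answering", model="path/to/mechbert-qa")

context = (
    "The alloy exhibited a yield strength of 250 MPa and an ultimate "
    "tensile strength of 400 MPa before fracture."
)
result = qa(
    question="What is the yield strength of the alloy?",
    context=context,
)
print(result["answer"], result["score"])
```

Because the pipeline performs extractive QA, the returned answer is a span copied from the context (here, ideally "250 MPa") together with a confidence score, which is the setting in which small domain-adapted BERT models are compared against larger general-purpose models.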