The article examines the extraction of information from texts using a subject-area ontology combined with neural network-based text analysis methods, including large language models. It discusses the expert's role in developing and maintaining such systems, illustrated by the task of extracting information from analytical articles and by the construction of ontologies in computational linguistics that represent the key concepts relevant to the system's user or customer. Ontology creation is accompanied by the development of a dictionary that forms the ontology's terminological core, followed by methods for extracting new terms within the specified subject area. This task is treated as a named entity recognition problem, traditionally addressed by training a neural network model on a representative dataset. The study compares this approach with a methodology that leverages large language models. For this purpose, lexical and syntactic patterns were developed, along with instruction patterns for testing hypotheses about new term phrases and for verifying results. The instructions developed for relation extraction also include the automated generation of natural language competency assessment questions for each ontology relation. The novelty of the proposed approach lies in the integration of ontological, linguistic, and neural network approaches to extract information from texts. The study demonstrates that text analysis and information extraction tasks can be solved through a chain of large language models, with instructions generated dynamically from the outcomes of prior analysis stages. The experiments achieved the following F1 scores: F1 = 0.8 for term extraction and classification and F1 = 0.87 for relation extraction.