Abstract

Processing search queries in big data systems is an active area of research, and its results find applications in many fields, including research, development and technological work (R&D). One of the main tasks in the R&D field is recording and analyzing the work done and supporting its participants on a competitive basis. To this end, information and analytical scientometric systems are developed to aggregate published R&D results. This article discusses a specific task that arises in such systems: determining which authors were involved in writing a scientific publication. Information and analytical systems store records of publications and of their authors, but they often lack mechanisms for determining the relationship between a publication and its authors with high accuracy. The goal of the task presented in the article is to restore these missing relationships. The proposed algorithm rests on the assumption that R&D work is carried out by teams of authors, so that identifying these teams is sufficient to determine the authors of a publication. The material will be of value to researchers and practitioners who automate processes in large information and analytical systems for scientometrics and bibliometrics. Applying the authorship-team heuristic can significantly improve the accuracy and performance of systems with a similar purpose, particularly those that require real-time query processing.
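
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how an authorship-team heuristic might work, not the method from the article. It assumes that teams are the exact author sets seen in already linked records and that a candidate team is scored by its Jaccard overlap with the known authors, weighted by how often the team has published together. The function names `build_teams` and `restore_authors`, the scoring rule, and the sample data are all hypothetical.

```python
from collections import Counter


def build_teams(publications):
    """Count how often each exact author set appears in already linked records.

    Assumption: a "team" is simply a set of authors who have published together.
    """
    return Counter(frozenset(authors) for authors in publications if authors)


def restore_authors(known_authors, teams):
    """Suggest missing authors for a publication whose author links are incomplete."""
    known = set(known_authors)
    best_team, best_score = None, 0.0
    for team, count in teams.items():
        overlap = len(known & team)
        if overlap == 0:
            continue
        # Hypothetical score: Jaccard similarity weighted by team frequency.
        score = count * overlap / len(known | team)
        if score > best_score:
            best_team, best_score = team, score
    # Members of the best-matching team that are not yet linked to the publication.
    return sorted(best_team - known) if best_team else []


# Illustrative data only; these names are not taken from the article.
known_publications = [
    ["ivanov", "petrov", "sidorov"],
    ["ivanov", "petrov", "sidorov"],
    ["petrov", "kuznetsov"],
]
teams = build_teams(known_publications)
print(restore_authors(["ivanov", "petrov"], teams))
```

In this sketch, a publication linked only to `ivanov` and `petrov` is matched to the team that has published together twice, and the remaining member of that team is proposed as the missing author. A real system would also need author-name disambiguation and a score threshold, neither of which is covered here.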
