Purpose
This paper presents pyBibX, a Python library for conducting comprehensive bibliometric and scientometric analyses on raw data files sourced from Scopus, Web of Science and PubMed, with state-of-the-art artificial intelligence (AI) capabilities integrated into its core functionality.

Design/methodology/approach
The library performs a comprehensive exploratory data analysis (EDA) and presents the outcomes as graphical illustrations. Its network capabilities cover citation, collaboration and similarity analysis. The library also incorporates AI capabilities, including embedding vectors, topic modeling, text summarization and other general natural language processing tasks, using models such as Sentence-BERT, BERTopic, BERT, ChatGPT and PEGASUS.

Findings
As a demonstration, we analyzed 184 documents associated with “multiple-criteria decision analysis” published between 1984 and 2023. The EDA showed a growing interest in decision-making and fuzzy logic methodologies. Network analysis then highlighted the significance of central authors and intra-continental collaboration, identifying Canada and China as crucial collaboration hubs. Finally, AI analysis distinguished two primary topics and indicated ChatGPT’s preeminence in text summarization. ChatGPT also proved to be an indispensable instrument for interpreting results, as the library enables researchers to pose questions to ChatGPT about the bibliometric outcomes. Even so, data homogeneity remains a daunting challenge due to inconsistencies between databases.

Originality/value
pyBibX is the first application to integrate cutting-edge AI capabilities for analyzing scientific publications, enabling researchers to examine and interpret these outcomes more effectively. pyBibX is freely available at https://bit.ly/442wD5z.