Abstract

Predatory publishing venues publish questionable articles and threaten the integrity and quality of the scientific literature worldwide. They represent the dark side of scholarly publishing, and their effects reach into politics, society, the economy, and health. Given their consequences and proliferation, several solutions have been developed to help detect them; however, these solutions are manual and time-consuming. Researchers, students, and readers need a tool that automatically detects predatory venues and their violations. In this study, we propose an intelligent framework that does so using different artificial intelligence techniques. This work makes two contributions: (1) a dataset of 9,866 journals annotated as predatory or legitimate, and (2) an intelligent framework for classifying a venue as legitimate or predatory, with appropriate reasoning. The framework was evaluated using seven machine learning and deep learning models, namely Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Neural Network (NN), Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), Bidirectional Encoder Representations from Transformers (BERT), and A Lite BERT (ALBERT), together with different feature representation techniques. The CNN model outperformed the other models in the journal classification task, with an F1 score of 0.96. For the task of providing appropriate reasoning, the SVM model achieved the best micro F1 score, 0.67.

