Abstract

• This study predicted the clinical citation count of biomedical papers using a multilayer perceptron neural network model, which outperformed five baseline models (i.e., linear regression, support vector machine, random forest, KNN, and XGBoost).
• The most important features for predicting the clinical citation count of biomedical papers are those in the reference dimension, which have been ignored in previous research. Meanwhile, clinical translation-related features are important for predicting the clinical citation count of basic papers but not of papers closer to clinical research.
• Features that have previously been demonstrated to be highly related to the citation count of academic papers are not important for predicting the clinical citation count of biomedical papers.
• This study could help policymakers and pharmaceutical companies assess the translational progress of biomedical research early and monitor, in real time, biomedical research with high potential for clinical translation.

The number of clinical citations received from clinical guidelines or clinical trials has been considered one of the most appropriate indicators for quantifying the clinical impact of biomedical papers. Therefore, the early prediction of the clinical citation count of biomedical papers is critical to scientific activities in biomedicine, such as research evaluation, resource allocation, and clinical translation. In this study, we designed a four-layer multilayer perceptron neural network (MPNN) model to predict the future clinical citation count of biomedical papers, using 9,822,620 biomedical papers published from 1985 to 2005. We extracted ninety-one paper features from three dimensions as the model input: twenty-one features in the paper dimension, thirty-five in the reference dimension, and thirty-five in the citing paper dimension.
In each dimension, the features can be classified into three categories: citation-related features, clinical translation-related features, and topic-related features. In the paper dimension, we also considered features that have previously been demonstrated to be related to the citation counts of research papers. The results showed that the proposed MPNN model outperformed the five baseline models and that the features in the reference dimension were the most important. In all three dimensions, the citation-related and topic-related features were more important than the clinical translation-related features for the prediction. We also found that the features helpful in predicting the citation count of papers are not important for predicting the clinical citation count of biomedical papers. Furthermore, we explored the MPNN model on different categories of biomedical papers. The results showed that the clinical translation-related features were more important for predicting the clinical citation count of basic papers than for papers closer to clinical science. This study provides a novel dimension (i.e., the reference dimension) for the research community and could be applied to other related research tasks, such as research assessment for translational programs. In addition, the findings of this study could help biomedical authors (especially those in basic science) attract more attention from clinical research.
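
The model input described above can be illustrated with a small sketch. It only shows how the ninety-one features from the three dimensions might be concatenated and passed through a four-layer perceptron. Several details are assumptions rather than facts from the study: "four-layer" is read here as an input layer, two hidden layers, and an output layer; the hidden widths (64 and 32), the ReLU activation, the weight initialization, and all function names are hypothetical. The weights are random and untrained, so the output is not a meaningful prediction.

```python
import random

# Feature counts per dimension, as stated in the abstract (21 + 35 + 35 = 91).
FEATURE_DIMS = {"paper": 21, "reference": 35, "citing_paper": 35}
N_FEATURES = sum(FEATURE_DIMS.values())

# Assumption: "four-layer" means input, two hidden layers, and output.
# The hidden widths (64, 32) are hypothetical, not taken from the study.
LAYER_SIZES = [N_FEATURES, 64, 32, 1]


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with random Gaussian weights and zero biases."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_model(seed=0):
    """Create one (weights, biases) pair per connection between layers."""
    rng = random.Random(seed)
    return [init_layer(n_in, n_out, rng)
            for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]


def concat_features(paper, reference, citing_paper):
    """Concatenate the three per-dimension vectors into one input vector."""
    parts = {"paper": paper, "reference": reference, "citing_paper": citing_paper}
    for name, vec in parts.items():
        if len(vec) != FEATURE_DIMS[name]:
            raise ValueError(
                f"{name}: expected {FEATURE_DIMS[name]} features, got {len(vec)}"
            )
    return list(paper) + list(reference) + list(citing_paper)


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def predict(model, features):
    """Forward pass: ReLU on hidden layers, linear scalar output."""
    activations = features
    for i, (weights, biases) in enumerate(model):
        activations = dense(activations, weights, biases)
        if i < len(model) - 1:  # no activation on the output layer
            activations = [max(0.0, v) for v in activations]
    return activations[0]
```

A call such as `predict(build_model(), concat_features(p, r, c))`, where `p`, `r`, and `c` are placeholder vectors of lengths 21, 35, and 35, returns a single float. In practice the weights would be fitted to observed clinical citation counts before the output could be interpreted.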

