With advances in artificial intelligence, the use of machine learning algorithms for clinical prediction has increased tremendously. Logistic regression is a widely used machine learning algorithm for predicting the probability of a categorical outcome, and it is very popular among medical researchers owing to its simplicity, interpretability, and solid statistical foundation. This study investigates the research productivity of heart disease classification using logistic regression models, analyzing current patterns and potential future trends through bibliometric analysis. It also aims to highlight the impact and quality of research in the area and to identify prominent research groups and the countries actively contributing to the field, which will help researchers and healthcare professionals pinpoint research gaps and influential authors, make informed decisions, and invest resources accordingly. The data were collected from the Scopus database and span 2019 to 2023. We used two bibliometric software tools, Biblioshiny (Aria and Cuccurullo, 2017) and VOSviewer (Centre for Science and Technology Studies (CWTS), Leiden University, the Netherlands), to analyze the bibliographic data with respect to citation counts, publication counts, and the contributions of authors and institutions, among other measures. A total of 2331 documents were fed into both tools for analysis. With 700 documents, China topped the list of the most productive countries, followed by India and the United States. Harvard Medical School, Boston, MA, United States, made the greatest institutional contribution, with six papers. The most productive author is Wang Y, with 73 documents. Analysis of trending topics reveals that the field is progressing towards the use of support vector machines (SVM), k-nearest neighbours (KNN), and naïve Bayes algorithms.
The article considers only data from Scopus, excluding literature indexed in other databases, which limits the potential coverage of the data. The work also focuses on recent developments and excludes literature published before 2019, which could be a limitation. Furthermore, since the study is a bibliometric analysis targeting the use of logistic regression for heart disease prediction, powerful techniques such as SVM, decision trees, random forests, neural networks, and deep learning have not been included, which could be another limitation.
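
For readers less familiar with the model at the centre of this analysis, the following is a minimal sketch of how a fitted logistic regression turns patient features into a predicted probability. The function name, the coefficients, and the two features (age and resting blood pressure) are hypothetical values chosen purely for illustration; they are not taken from this study or from any of the documents it analyzes.

```python
import math


def predict_probability(features, weights, bias):
    """Return P(y = 1 | features) under a logistic regression model."""
    # Linear predictor: bias plus the weighted sum of the features
    z = bias + sum(w * x for w, x in zip(weights, features))
    # The logistic (sigmoid) function maps z onto the interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and patient features, for illustration only
weights = [0.04, 0.02]  # per year of age, per mmHg of resting blood pressure
bias = -5.0
patient = [60, 140]     # age in years, resting blood pressure in mmHg

p = predict_probability(patient, weights, bias)
label = int(p >= 0.5)   # classify as "disease" when the probability reaches 0.5
```

Each weight in such a model can be read as the change in the log-odds of the outcome per unit change in the corresponding feature, which is one reason for the interpretability noted above. The 0.5 threshold is an assumption; a clinical application would likely tune it to the relative costs of false positives and false negatives.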