This study explores the basic framework and applicability of decision tree techniques by examining domestic research trends in the field. We analyzed 1,075 domestic academic papers that used decision tree analysis, published from 2001 to 2023, and classified them by year, journal, topic field, author, paper frequency, and frequency of algorithms used. The number of papers per year rose from one in 2001 to 87 in 2023. By field, social science accounted for the largest share (377 papers, 35.07%), followed by natural science (233 papers, 21.67%) and engineering (230 papers, 21.40%). Among the algorithms, CHAID was used most frequently (315 times, 36.42%), followed by CART (310 times, 35.84%), ensemble techniques such as random forest (124 times, 14.34%), and C5.0 (94 times, 10.87%). The QUEST algorithm was rarely used (10 times, 1.16%). Annual usage trends indicate a growing preference for ensemble techniques as a way to improve the prediction rates of decision trees. The most frequent keywords across the papers were “decision trees” (687), “data mining” (239), “machine learning” (105), “logistic” (98), and “neural networks” (68). Notably, health-related keywords such as “suicide,” “depression,” “high blood pressure,” and “health” appeared prominently, indicating extensive use of decision trees in medical research. Based on these findings, we propose implications and directions for follow-up studies.