Abstract

As part of our ongoing systematic review of complex interventions for the primary prevention of cardiovascular diseases, we have developed and evaluated automated machine-learning classifiers for title and abstract screening. The aim was to develop a high-performing algorithm comparable to human screening. We followed a three-phase process to develop and test an automated machine learning-based classifier for screening potential studies on interventions for primary prevention of cardiovascular disease. We labelled a total of 16,611 articles during the first phase of the project. In the second phase, we used the labelled articles to develop a machine learning-based classifier. After that, we examined the performance of the classifiers in correctly labelling the papers. We evaluated the performance of the five deep-learning models [i.e. parallel convolutional neural network ( CNN ), stacked CNN , parallel-stacked CNN , recurrent neural network ( RNN ) and CNN-RNN]. The models were evaluated using recall, precision and work saved over sampling at no less than 95% recall. We labelled a total of 16,611 articles, of which 676 (4.0%) were tagged as 'relevant' and 15,935 (96%) were tagged as 'irrelevant'. The recall ranged from 51.9% to 96.6%. The precision ranged from 64.6% to 99.1%. The work saved over sampling ranged from 8.9% to as high as 92.1%. The best-performing model was parallel CNN , yielding a 96.4% recall, as well as 99.1% precision, and a potential workload reduction of 89.9%. We used words from the title and the abstract only. More work needs to be done to look into possible changes in performance, such as adding features such as full document text. The approach might also not be able to be used for other complex systematic reviews on different topics. Our study shows that machine learning has the potential to significantly aid the labour-intensive screening of abstracts in systematic reviews of complex interventions. Future research should concentrate on enhancing the classifier system and determining how it can be integrated into the systematic review workflow. This project was funded by the National Institute for Health and Care Research (NIHR) Health Technology Assessment programme and will be published in Health Technology Assessment. See the NIHR Journals Library website for further project information.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call