Abstract

Background: Automated supervised text classification methods require preclassified training data, which makes them difficult to apply where a large amount of preclassified data is not available. Classification of the neurosurgical literature into subspecialties is one such case. We introduce an automated similarity-based text classification method, evaluate it alongside two other automated methods, and apply it to the classification of the neurosurgical literature.

Methods: The performance of the introduced similarity-based text classification method and of two other automated methods (Lbl2Vec and a keyword counting-based method) was compared with that of two senior neurosurgery registrars in classifying neurosurgical literature into 5 subspecialties. We calculated the kappa statistic of interrater agreement, overall marginal homogeneity using the Stuart-Maxwell test, marginal homogeneity for individual categories using McNemar tests, and the sensitivity and specificity of each of the three methods. The introduced method was then used to classify 211617 neurosurgical publications indexed in PubMed into subspecialties, based on keywords extracted from the subspecialty sections of a neurosurgery textbook.

Results: The introduced similarity-based method showed the highest agreement with the registrars (raw agreement and kappa value), followed by Lbl2Vec and then the counting-based method. Classifications of the English-language neurosurgical publications indexed in PubMed into the Oncology, Vascular, Spine and Functional categories using the introduced method were more reliable (closer to the registrars' classifications) than classifications into Cranial trauma. The classifications and the forecast showed the largest number of publications in Oncology, followed by Cranial trauma, Vascular, Spine and Functional neurosurgery.

Conclusion: The classification of English-language neurosurgical publications indexed in PubMed into subspecialties using the introduced method shows that oncology and tumour have been the main battleground for neurosurgeons over the years and will probably remain so in the near future. The performance of the introduced method relative to human performance shows its potential for use where insufficient preclassified data are available for automated text classification.
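
The agreement measures named in the Methods can be illustrated with a minimal Python sketch. It assumes the two-rater (Cohen's) form of the kappa statistic and a one-versus-rest definition of sensitivity and specificity; the abstract does not state which variants were used. The function names and the six example labels are hypothetical and are not taken from the study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length label sequences (assumed variant)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(rater_a)
    # Observed (raw) agreement: share of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def sensitivity_specificity(reference, predicted, category):
    """One-versus-rest sensitivity and specificity for a single category."""
    tp = fn = fp = tn = 0
    for ref, pred in zip(reference, predicted):
        if ref == category:
            if pred == category:
                tp += 1
            else:
                fn += 1
        else:
            if pred == category:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels: a registrar's classification and a method's output.
registrar = ["Oncology", "Oncology", "Vascular", "Spine", "Spine", "Functional"]
method = ["Oncology", "Vascular", "Vascular", "Spine", "Oncology", "Functional"]

print(cohen_kappa(registrar, method))
print(sensitivity_specificity(registrar, method, "Oncology"))
```

Treating the registrar's labels as the reference, a higher kappa and higher per-category sensitivity and specificity correspond to what the Results describe as classifications "closer to the registrars' classifications".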
