A large number and diversity of techniques have been proposed in recent years for solving multi-label classification tasks, including classifier chains, in which predictions are cascaded to other models as additional features. Chaining methods have often provided state-of-the-art results, and the idea of extending chaining to multi-output regression has already been trialled. However, these ‘regressor chains’ have seen limited applicability, as they offer relatively little gain in predictive performance over independent regression models, and their interpretability is limited. In this work we identify and discuss the main limitations of regressor chains, including an analysis of different base models, loss functions, explainability, and other desiderata of real-world applications. We develop and examine techniques to overcome these limitations. In particular, we present Monte Carlo schemes in the framework of probabilistic chains, and show that they can be effective, flexible, and useful across different application areas. Finally, we place regressor chains in the broader context of multi-output learning with continuous outputs, and in doing so shed additional light on the applicability of chaining to machine learning tasks.
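As a rough illustration of the cascading idea described in the abstract (not the authors' specific method), the sketch below builds a minimal regressor chain in which each output's values are appended to the feature set of the next regressor in the chain; the base model (linear regression) and the output ordering are arbitrary choices made only for this example.

```python
# Minimal sketch of a regressor chain: each target is cascaded to
# later models in the chain as an extra input feature.
# Base model and output order here are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

class SimpleRegressorChain:
    def __init__(self, base_model=LinearRegression):
        self.base_model = base_model
        self.models_ = []

    def fit(self, X, Y):
        X_aug = X.copy()
        for j in range(Y.shape[1]):
            model = self.base_model().fit(X_aug, Y[:, j])
            self.models_.append(model)
            # Append the true values of this output as a feature
            # for the next model in the chain (teacher forcing).
            X_aug = np.hstack([X_aug, Y[:, [j]]])
        return self

    def predict(self, X):
        X_aug = X.copy()
        preds = []
        for model in self.models_:
            y_j = model.predict(X_aug)
            preds.append(y_j)
            # Cascade the prediction forward as an additional feature.
            X_aug = np.hstack([X_aug, y_j.reshape(-1, 1)])
        return np.column_stack(preds)

# Usage on toy data with three correlated continuous outputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = np.column_stack([X @ rng.normal(size=5) + rng.normal(scale=0.1, size=100)
                     for _ in range(3)])
chain = SimpleRegressorChain().fit(X, Y)
print(chain.predict(X[:2]))
```

At prediction time the chain must reuse its own (possibly erroneous) outputs as features, which is one source of the error propagation and limited gains over independent regressors discussed in the paper.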