Single-label Classification Research Articles

Every year, around 28,100 journals publish 2.5 million research publications. Search engines, digital libraries, and citation indexes are used extensively to search these publications. A user query typically returns a large number of documents, of which only a few are relevant. Because of inadequate indexing, the returned documents are largely unstructured. Publicly known systems mostly index research papers by keywords rather than by subject hierarchy. Numerous methods reported for single-label classification (SLC) or multi-label classification (MLC) are based on content and metadata features. Content-based techniques yield better results because of the richness of their features, but their drawback is that the full text is unavailable in most cases. Metadata-based parameters, such as title, keywords, and general terms, serve as an alternative to content. However, existing metadata-based techniques show low accuracy because they rely on traditional statistical measures, such as BOW, TF, and TFIDF, to express textual properties in quantitative form. These measures may not capture the semantic context of words. Existing MLC techniques also require a specified threshold value to map articles into predetermined categories, and choosing it requires domain knowledge. The objective of this paper is to overcome these limitations of SLC and MLC techniques. To capture the semantic and contextual information of words, the suggested approach uses the Word2Vec paradigm for textual representation. The suggested model determines threshold values through rigorous data analysis, removing the need for domain expertise. Experiments are carried out on two datasets from the field of computer science (JUCS and ACM). Compared with current state-of-the-art methodologies, the proposed model performed well. Experiments yielded average accuracy of 0.86 and 0.84 for JUCS and ACM for SLC, and 0.81 and 0.80 for JUCS and ACM for MLC. On both datasets, the proposed SLC model improved accuracy by up to 4%, while the proposed MLC model improved accuracy by up to 3%.
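The idea of mapping an article to categories by embedding similarity and a threshold can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors in `TOY_VECTORS`, the category term lists, and the threshold values are all hypothetical stand-ins, and a real system would use trained Word2Vec embeddings and a threshold derived from data analysis, as the abstract describes.

```python
import math

# Hypothetical 3-dimensional vectors standing in for trained Word2Vec embeddings.
TOY_VECTORS = {
    "neural": [0.9, 0.1, 0.0],
    "network": [0.8, 0.2, 0.1],
    "learning": [0.85, 0.15, 0.05],
    "database": [0.1, 0.9, 0.0],
    "query": [0.2, 0.8, 0.1],
}

# Hypothetical categories, each described by a few representative terms.
CATEGORIES = {
    "Machine Learning": ["neural", "learning"],
    "Information Systems": ["database", "query"],
}


def mean_vector(words):
    """Average the vectors of known words; unknown words are skipped."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        raise ValueError("none of the words has a vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(metadata_words, categories, threshold):
    """Return (single_label, multi_labels, scores) for one article.

    SLC picks the most similar category; MLC keeps every category whose
    similarity reaches the threshold.
    """
    doc = mean_vector(metadata_words)
    scores = {name: cosine(doc, mean_vector(terms))
              for name, terms in categories.items()}
    single = max(scores, key=scores.get)
    multi = sorted(name for name, s in scores.items() if s >= threshold)
    return single, multi, scores


if __name__ == "__main__":
    title = ["neural", "network", "query"]
    print(classify(title, CATEGORIES, threshold=0.6))
    print(classify(title, CATEGORIES, threshold=0.8))
```

With these toy values, a lower threshold assigns the article to both categories and a higher one to only the closest, which shows why the choice of threshold matters for MLC.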


Deep Neural Network (DNN) models are widely used for image classification. While they offer high accuracy, researchers are concerned about whether these models make inferences using features irrelevant to the target object in a given image. To address this concern, we propose a metamorphic testing approach that assesses whether a given inference is based on irrelevant features. Specifically, we propose two metamorphic relations (MRs) to detect such unreliable inferences. These relations expect (a) classification results with different labels, or with the same labels but less certainty, after the relevant features of an image are corrupted, and (b) classification results with the same labels after irrelevant features are corrupted. Inferences that violate the metamorphic relations are regarded as unreliable. Our evaluation demonstrated that our approach can effectively identify unreliable inferences for single-label classification models, with an average precision of 64.1% and 96.4% for the two MRs, respectively. For multi-label classification models, the corresponding precision for MR-1 and MR-2 is 78.2% and 86.5%, respectively. Further, we conducted an empirical study to understand the problem of unreliable inferences in practice. We applied our approach to 18 pre-trained single-label image classification models and 3 multi-label classification models, and then examined their inferences on the ImageNet and COCO datasets. We found that unreliable inferences are pervasive: for each model, thousands of correct classifications are actually made using irrelevant features. Next, we investigated the effect of such pervasive unreliable inferences and found that they can cause significant degradation of a model’s overall accuracy. When these unreliable inferences from the test set are taken into account, the model’s accuracy can change significantly. Therefore, we recommend that developers pay more attention to these unreliable inferences during model evaluation. We also explored the correlation between model accuracy and the size of unreliable inferences, and found that inferences on inputs with smaller objects are more likely to be unreliable. Lastly, we found that current model training methodologies can guide models to learn object-relevant features to a certain extent, but may not necessarily prevent them from making unreliable inferences. We encourage the community to propose more effective training methodologies to address this issue.
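The two metamorphic relations can be sketched as simple checks on a model's predictions before and after corruption. This is an illustrative sketch only, not the authors' tool: the "images" are flat lists whose first half is assumed to be the object and second half the background, and `object_model`, `background_model`, `corrupt_object`, and `corrupt_background` are hypothetical helpers invented for the example.

```python
from typing import Callable, List, Tuple

Image = List[float]
Prediction = Tuple[str, float]  # (label, confidence)


def violates_mr1(base: Prediction, after: Prediction) -> bool:
    """MR-1: corrupting relevant features should change the label
    or lower the confidence. Same label with no drop is a violation."""
    return base[0] == after[0] and after[1] >= base[1]


def violates_mr2(base: Prediction, after: Prediction) -> bool:
    """MR-2: corrupting irrelevant features should keep the label."""
    return base[0] != after[0]


def is_unreliable(model: Callable[[Image], Prediction], image: Image,
                  corrupt_relevant: Callable[[Image], Image],
                  corrupt_irrelevant: Callable[[Image], Image]) -> bool:
    """An inference is flagged as unreliable if it violates either relation."""
    base = model(image)
    return (violates_mr1(base, model(corrupt_relevant(image)))
            or violates_mr2(base, model(corrupt_irrelevant(image))))


# Hypothetical toy setup: first half of the list is the object,
# second half is the background.
def corrupt_object(image: Image) -> Image:
    half = len(image) // 2
    return [0.0] * half + image[half:]


def corrupt_background(image: Image) -> Image:
    half = len(image) // 2
    return image[:half] + [0.0] * (len(image) - half)


def _score_to_prediction(score: float) -> Prediction:
    return ("cat", score) if score >= 0.5 else ("dog", 1.0 - score)


def object_model(image: Image) -> Prediction:
    """Toy model that looks only at the object region."""
    obj = image[: len(image) // 2]
    return _score_to_prediction(sum(obj) / len(obj))


def background_model(image: Image) -> Prediction:
    """Toy model that looks only at the background region."""
    bg = image[len(image) // 2:]
    return _score_to_prediction(sum(bg) / len(bg))


if __name__ == "__main__":
    img = [0.9, 0.9, 0.8, 0.8]
    print(is_unreliable(object_model, img, corrupt_object, corrupt_background))
    print(is_unreliable(background_model, img, corrupt_object, corrupt_background))
```

In this toy setup the background-driven model is flagged because its prediction does not react when the object region is wiped, which is the kind of behaviour MR-1 is meant to expose.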


Articles published on Single-label Classification

Multi-Self-Attention for Aspect Category Detection and Biomedical Multilabel Text Classification with BERT

Multi-label classification of research articles using Word2Vec and identification of similarity threshold

Multilabel Classification with Partial Abstention: Bayes-Optimal Prediction under Label Independence

Bioinspired Visual-Integrated Model for Multilabel Classification of Textile Defect Images

Harnessing Multi-label Classification Approaches for Economic Phenomena Categorization

Detecting Multi-label emotions from code-mixed Facebook Status Updates

Multi-Label Classification of Research Papers Using Multi-Label K-Nearest Neighbour Algorithm

CNN-based multi-class multi-label classification of sound scenes in the context of wind turbine sound emission measurements

An Information-Explainable Random Walk Based Unsupervised Network Representation Learning Framework on Node Classification Tasks

An experimental approach for prediction of multi-classification using SVM

To what extent do DNN-based image classification models make unreliable inferences?

A Real-Time Image Semantic Segmentation Method Based on Multilabel Classification

Condition-CNN: A hierarchical multi-label fashion image classification model

Efficient set-valued prediction in multi-class classification

A Multilabel Classifier for Text Classification and Enhanced BERT System

Rotation Forest for multi-target regression

Research on Multiple Classification Based on Improved SVM Algorithm for Balanced Binary Decision Tree

Sentiment analysis of GO-JEK services quality using Multi-Label Classification

A new fusion framework for motion segmentation in dynamic scenes

An intelligent work order classification model for government service based on multi-label neural network
