Learnability for Binary Classification
This chapter contains the proof of the fundamental theorem of PAC-learning for binary classification. In particular, the so-called uniform convergence property is introduced and used to show that finite VC-dimension implies PAC-learnability. Conversely, the so-called no-free-lunch theorem is used to show that PAC-learnability implies finite VC-dimension.
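As a compact illustration of the first direction, the sketch below states the uniform convergence property and the standard chain of inequalities showing why it makes empirical risk minimization (ERM) succeed. The notation ($\mathcal{H}$, $\mathcal{D}$, $L_S$, $L_{\mathcal{D}}$, $h_S$) is an assumption borrowed from common textbook usage and is not taken from the chapter itself.

```latex
% Assumed notation: \mathcal{H} is the hypothesis class, \mathcal{D} the data
% distribution, S an i.i.d. sample, L_S the empirical risk and L_{\mathcal{D}}
% the true risk under the 0-1 loss.
% A sample S is called \epsilon-representative if
\[
  \forall h \in \mathcal{H}: \quad
  \bigl| L_S(h) - L_{\mathcal{D}}(h) \bigr| \le \epsilon .
\]
% If S is \epsilon-representative and h_S \in \arg\min_{h \in \mathcal{H}} L_S(h)
% is any ERM output, then for every h \in \mathcal{H}
\[
  L_{\mathcal{D}}(h_S)
  \le L_S(h_S) + \epsilon
  \le L_S(h) + \epsilon
  \le L_{\mathcal{D}}(h) + 2\epsilon .
\]
```

The first and last steps use representativeness and the middle step uses that $h_S$ minimizes the empirical risk. ERM is therefore within $2\epsilon$ of the best hypothesis in the class whenever the sample is $\epsilon$-representative. A class has the uniform convergence property when sufficiently large samples are $\epsilon$-representative with high probability.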
- Video Transcripts
- 10.48448/s6c9-5208
- Dec 29, 2020
- Underline Science Inc.
The softmax and binary classifier are commonly preferred for image classification applications. However, as softmax is specifically designed for categorical classification, it assumes each image has just one class label. This limits its applicability for problems where the number of labels does not equal one, most notably zero- and multi-label problems. In these challenging settings, binary classifiers are, in theory, better suited. However, as they ignore the correlation between classes, they are not as accurate and scalable in practice. In this paper, we start from the observation that the only difference between binary and softmax classifiers is their normalization function. Specifically, while the binary classifier self-normalizes its score, the softmax classifier combines the scores from all classes before normalisation. On the basis of this observation we introduce a normalization function that is learnable, constant, and shared between classes and data points. By doing so, we arrive at a new type of binary classifier that we coin quasibinary classifier. We show in a variety of image classification settings, and on several datasets, that quasibinary classifiers are considerably better in classification settings where regular binary and softmax classifiers suffer, including zero-label and multi-label classification. What is more, we show that quasibinary classifiers yield well-calibrated probabilities allowing for direct and reliable comparisons, not only between classes but also between data points.
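The difference in normalization described above can be sketched in a few lines. This is only an illustration of the two standard normalizers (sigmoid and softmax). It does not reproduce the paper's quasibinary normalization, and the function names and example scores are hypothetical.

```python
import math

def sigmoid_probs(scores):
    # Binary (sigmoid) classifier: each score is normalized using itself only,
    # p_i = exp(s_i) / (exp(s_i) + 1), so classes are scored independently.
    return [math.exp(s) / (math.exp(s) + 1.0) for s in scores]

def softmax_probs(scores):
    # Softmax classifier: each score is normalized by the sum over all classes,
    # p_i = exp(s_i) / sum_j exp(s_j). Subtracting the maximum is the usual
    # numerical-stability trick and does not change the result.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three classes.
zero_label = [-4.0, -5.0, -3.5]   # no class looks present
multi_label = [3.0, 2.5, -4.0]    # two classes look present

for name, scores in [("zero-label", zero_label), ("multi-label", multi_label)]:
    print(name, "sigmoid:", [round(p, 3) for p in sigmoid_probs(scores)])
    print(name, "softmax:", [round(p, 3) for p in softmax_probs(scores)])
```

Because softmax outputs always sum to one, it cannot express "no label" or "two labels" directly, whereas the sigmoid outputs can all be low or several can be high at once.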
- Conference Article
- 10.1109/icpr48806.2021.9412933
- Jan 10, 2021
The softmax and binary classifier are commonly preferred for image classification applications. However, as softmax is specifically designed for categorical classification, it assumes each image has just one class label. This limits its applicability for problems where the number of labels does not equal one, most notably zero- and multi-label problems. In these challenging settings, binary classifiers are, in theory, better suited. However, as they ignore the correlation between classes, they are not as accurate and scalable in practice. In this paper, we start from the observation that the only difference between binary and softmax classifiers is their normalization function. Specifically, while the binary classifier self-normalizes its score, the softmax classifier combines the scores from all classes before normalisation. On the basis of this observation we introduce a normalization function that is learnable, constant, and shared between classes and data points. By doing so, we arrive at a new type of binary classifier that we coin quasibinary classifier. We show in a variety of image classification settings, and on several datasets, that quasibinary classifiers are considerably better in classification settings where regular binary and softmax classifiers suffer, including zero-label and multi-label classification. What is more, we show that quasibinary classifiers yield well-calibrated probabilities allowing for direct and reliable comparisons, not only between classes but also between data points.
- Research Article
39
- 10.1259/bjr.20211253
- Jun 9, 2022
- The British journal of radiology
This study aimed to employ different automated convolutional neural network (CNN)-based transfer learning (TL) methods for both binary and multiclass classification of Alzheimer's disease (AD) using brain MRI. We applied three popular pre-trained CNN models (ResNet101, Xception, and InceptionV3) using a fine-tuning approach to TL on 3D T1-weighted brain MRI from a subset of the ADNI dataset (n = 305 subjects). To evaluate the power of TL, the same networks were also trained from scratch for performance comparison. Initially, a U-Net network segmented the MRI scans into characteristic components of gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). The proposed networks were trained and tested on the pre-processed and augmented segmented and whole images for both binary (NC/AD + progressive mild cognitive impairment (pMCI) + stable MCI (sMCI)) and 4-class (AD/pMCI/sMCI/NC) classification. In addition, two independent test sets from the OASIS (n = 30) and AIBL (n = 60) datasets were used to externally assess the performance of the proposed algorithms. The proposed TL-based CNN models achieved better performance than the CNN models trained from scratch. On the ADNI test set, InceptionV3-TL achieved the highest accuracy of 93.75% and AUC of 92.0% for binary classification, as well as the highest accuracy of 93.75% and AUC of 96.0% for multiclass classification of AD on the whole images. On the OASIS test set, InceptionV3-TL outperformed the two other models by achieving 93.33% accuracy with 93.0% AUC in binary classification of AD on the whole images. On the AIBL test set, InceptionV3-TL also outperformed the two other models in both binary and multiclass classification tasks on the whole MR images and achieved accuracy/AUC of 93.33%/95.0% and 90.0%/93.0%, respectively. The GM segment as input provided the highest performance in both binary and multiclass classification of AD, as compared to the WM and CSF segments.
This study demonstrates the potential of applying a deep TL approach for automated detection and classification of AD using brain MRI, with high accuracy and robustness across internal and external test data, suggesting that these models could be used as a supportive tool to assist clinicians in forming an objective opinion and reaching a correct diagnosis. We used CNN-based TL approaches and augmentation techniques to overcome the insufficient data problem. Our study provides evidence that deep TL algorithms can be used for both binary and multiclass classification of AD with high accuracy.
- Research Article
42
- 10.1080/01431161.2017.1416697
- Jan 3, 2018
- International Journal of Remote Sensing
Many applications of remote sensing only require the classification of a single land type. This is known as the one-class classification problem and it can be performed using either binary classifiers, by treating all other classes as the negative class, or one-class classifiers, which only consider the class of interest. The key difference between these two approaches is in their training data and the amount of effort needed to produce it. Binary classifiers require an exhaustively labelled training data set, while one-class classifiers are trained using samples of just the class of interest. Given ample and complete training data, binary classifiers generally outperform one-class classifiers. However, what is not clear is which approach is more accurate when given the same amount of labelled training data. That is, for a fixed labelling effort, is it better to use a binary or a one-class classifier? This is the question we consider in this article. We compare several binary classifiers, including backpropagation neural networks, support vector machines, and maximum likelihood classifiers, with two one-class classifiers, one-class SVM and presence and background learning (PBL), on the problem of one-class classification in high-resolution remote sensing imagery. We show that, given a fixed labelling budget, PBL consistently outperforms the other methods. This advantage stems from the fact that PBL is a positive-unlabelled method in which large amounts of readily available unlabelled data are incorporated into the training phase, allowing the classifier to model the negative class more effectively.
- Research Article
- 10.1186/s12903-025-07171-z
- Dec 1, 2025
- BMC Oral Health
Artificial intelligence (AI) has shown promise for diagnosing periodontal disease from dental radiographs. However, diagnostic performance across classification types (binary classification vs. staging classification) and imaging modalities remains unclear. This meta-analysis evaluates the accuracy of AI diagnostics for periodontitis, comparing binary and staging classifications across various imaging modalities. A systematic meta-analysis reviewed AI-based periodontal diagnostic studies using periapical, panoramic, bitewing, or cone-beam computed tomographic radiographs. Random-effects models calculated pooled sensitivity, specificity, accuracy, F1-score, and area under the curve. Subgroup analyses were performed by imaging modality and heterogeneity (I²). In binary classification, periapical imaging showed a sensitivity of 87.2% and a specificity of 81.5%, while panoramic radiographs had an accuracy of 88.2%. In staging classification, panoramic images achieved the highest accuracy (88.9%) and specificity (85.4%), whereas periapical images showed higher sensitivity (76.4%). Diagnostic accuracy varied significantly across imaging modalities, contributing to heterogeneity among studies. This first meta-analysis comparing binary and staging AI classification emphasizes modality-specific approaches: panoramic imaging is suitable for screening and staging, whereas periapical radiographs support early detection, providing essential insights for clinical AI integration.
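For readers less familiar with the metrics pooled in this meta-analysis, the following minimal sketch shows how sensitivity, specificity, accuracy, and F1-score are derived from a single binary confusion matrix. The function name and the example counts are hypothetical and are not data from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    # tp/fn: diseased cases correctly/incorrectly classified.
    # tn/fp: healthy cases correctly/incorrectly classified.
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)        # harmonic mean of precision and recall
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts: 10 diseased and 10 healthy radiographs.
metrics = binary_metrics(tp=8, fp=1, tn=9, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are computed on different subsets of cases (diseased versus healthy), which is why a modality can lead on one and trail on the other.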
- Conference Article
2
- 10.15405/epsbs.2021.12.96
- Dec 2, 2021
- The European Proceedings of Social & Behavioural Sciences
A well-organized binary class at an agricultural university provides students with a good opportunity for professionally oriented language learning, increasing their motivation and demonstrating the possibilities of English in the profession. A binary class combines the possibilities of practicing skills and abilities of using special vocabulary, building an independent statement and listening to a foreign speech. The study involved the integration of educational material of two disciplines: English and machine parts and the basics of design in binary classes in the technopark of the road department of Ryazan State Agrotechnological University. The binary class included planning, creating a group of teachers, designing and conducting the class, and its analysis. The key to the success of the class was a good team of a foreign language teacher and a teacher who supervised a professional course of study and understood English quite well. The structure of the class was a sequential alternation of stages, including receiving a portion of theoretical material and practical exercises in English, aimed at training the use of professional terminology, dialogues and minimal monologues. The analysis of the binary class was carried out even during its course, making notes concerning errors in the use of lexical and grammatical material in order to focus the students’ attention in the next English class. The questionnaire survey of the students demonstrated the assessment of advantages of such an English class, confirmed its consistency and practical significance.
- Conference Article
95
- 10.1109/icmla.2012.212
- Dec 1, 2012
Binary classifiers have typically been the norm for building classification models in the Machine Learning community. However, an alternative to binary classification is one-class classification, which aims to build models using only a single class of data. This is particularly useful when there is an over-abundance of data of a particular class. In such imbalanced cases, binary classifiers may not perform very well, and one-class classifiers then become the viable option. In this paper, we investigate the performance of binary and one-class classifiers as the level of imbalance, and thus the uncertainty in the second class, increases. Our objective is to gain insight into which classification paradigm becomes more suitable as imbalance and uncertainty increase. To this end, we conduct experiments on various datasets, both artificial and from the UCI repository, and monitor the performance of the binary and one-class classifiers as the size of the second class gradually decreases, thus increasing the level of imbalance. The results show that as the level of imbalance increases, the performance of binary classifiers decreases, whereas one-class classifiers stay relatively stable.
- Research Article
13
- 10.1186/s13635-024-00184-1
- Dec 20, 2024
- EURASIP Journal on Information Security
Network security has become imperative in the context of our interconnected networks and everyday communications. Recently, many deep learning models have been proposed to tackle the problem of predicting intrusions and malicious activities in interconnected systems. However, they focus solely on binary classification and do not report individual class performance in the case of multi-class classification. Moreover, many of them are trained and tested using outdated datasets, which ultimately impacts the overall performance. Therefore, there is a need for an efficient and accurate network intrusion detection system. In this paper, we propose a novel intelligent detection system based on a convolutional neural network, namely DCNN. The proposed model can be utilized to efficiently analyze and detect attacks and intrusions in intelligent network systems (e.g., suspicious network traffic activities and policy violations). The DCNN model is evaluated on three benchmark datasets and compared with state-of-the-art models. Experimental results show that the proposed model improved resilience to intrusions and malicious activities for binary as well as multi-class classification, expanding its applicability across different intrusion detection scenarios. Furthermore, our DCNN model outperforms similar intrusion detection systems in terms of positive predictive value, true positive rate, F1 measure, and accuracy. The scores obtained for binary and multi-class classifications on the CICIoT2023 dataset are 99.50% and 99.25%, respectively. Additionally, on the CICIDS-2017 dataset, DCNN attains a score of 99.96% for both binary and multi-class classifications, while on the CICIoMT2024 dataset it attains scores of 99.98% and 99.86% for binary and multi-class classifications, respectively.
- Research Article
2
- 10.26483/ijarcs.v9i2.5866
- Feb 20, 2018
- International Journal of Advanced Research in Computer Science
Sentiment analysis is currently an active research topic. Most of the research has been done on data acquired from social networking sites, mostly Twitter, which is then classified using binary classification (“positive” and “negative”) or ternary classification (“positive”, “negative”, and “neutral”). Binary and ternary classification alone do not serve the full purpose of sentiment analysis. Multi-class classification can help in capturing the essence and core message of the data. Whether the classification is binary, ternary, or multi-class, the main objective remains accuracy in identifying the actual sentiments. Ample work has been done on binary and ternary classification, and good accuracy has been achieved, but accuracy remains a challenge for multi-class classification. In this paper, we analyze different machine learning algorithms and techniques that have been used in sentiment analysis and the accuracy achieved using them.
- Research Article
12
- 10.3390/cancers14205003
- Oct 13, 2022
- Cancers
Simple Summary: The DL model predictions in automated breast density assessment were independent of the imaging technologies, moderately or substantially agreed with the clinical reader density values, and had improved performance as compared to inclusion of commercial software values. Recently, convolutional neural network (CNN) models have been proposed to automate the assessment of breast density, breast cancer detection, or risk stratification using a single image modality. However, analysis of breast density using multiple mammographic types and clinical data has not been reported in the literature. In this study, we investigate pre-trained EfficientNetB0 deep learning (DL) models for automated assessment of breast density using multiple mammographic types, with and without clinical information, to improve reliability and versatility of reporting. A total of 120,000 for-processing and for-presentation full-field digital mammograms (FFDM), digital breast tomosynthesis (DBT), and synthesized 2D images from 5032 women were retrospectively analyzed. Each participant underwent up to 3 screening examinations and completed a questionnaire at each screening encounter. Pre-trained EfficientNetB0 DL models with or without clinical history were optimized. The DL models were evaluated using BI-RADS (fatty, scattered fibroglandular densities, heterogeneously dense, or extremely dense) versus binary (non-dense or dense) density classification. Pre-trained EfficientNetB0 model performances were compared using inter-observer and commercial software (Volpara) variabilities. Results show that the average Fleiss’ Kappa score between observers ranged from 0.31–0.50 and 0.55–0.69 for the BI-RADS and binary classifications, respectively, showing higher uncertainty among experts. Volpara-observer agreement was 0.33 and 0.54 for BI-RADS and binary classifications, respectively, showing fair to moderate agreement.
However, the agreement between our proposed pre-trained EfficientNetB0 DL models and the observers was 0.61–0.66 and 0.70–0.75 for BI-RADS and binary classifications, respectively, showing moderate to substantial agreement. Overall results show that the best breast density estimation was achieved using for-presentation FFDM and DBT images without added clinical information. The pre-trained EfficientNetB0 model can automatically assess breast density from any image modality type, with the best results obtained from for-presentation FFDM and DBT, which are the most common images archived in clinical practice.
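The agreement figures above are Fleiss' kappa scores. As a minimal sketch of how that statistic is computed from a table of rating counts, assuming every case is rated by the same number of raters, the following uses a hypothetical `fleiss_kappa` function and toy data that do not come from the study.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters who assigned case i to category j."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per case: fraction of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_cases          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    # Note: undefined (division by zero) if all ratings fall in one category.
    return (p_bar - p_e) / (1.0 - p_e)

# Toy example: 2 cases, 3 raters, 2 categories (non-dense, dense).
perfect = [[3, 0], [0, 3]]   # raters always agree
split = [[2, 1], [1, 2]]     # raters disagree on every case
print(fleiss_kappa(perfect))
print(fleiss_kappa(split))
```

A value of 1 means complete agreement and values at or below 0 mean agreement no better than chance, which is the scale on which the reported 0.31–0.75 ranges should be read.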
- Conference Article
10
- 10.1109/icceet.2012.6203750
- Mar 1, 2012
The analysis of security attacks on data packets travelling over networks is used at the detection layer of low-level intrusion detection systems such as firewalls, where intrusion detection relies on signature-based detection to detect attacks. For the analysis of the packets, the data mining techniques of binary classifiers and multi-boosting are used simultaneously. Using a binary classifier for each type of attack can be more accurate and improve the detection of attacks. Accurate binary classifiers are used to aggregate alerts over a single data packet and identify the attacks. Multi-boosting is applied to counter the potential bias of certain binary classifiers, reducing both variance and bias. Multi-boosting is run simultaneously with the binary tree classifier to detect attacks and protect the system from malicious intrusion. A data set is checked with the binary tree to detect the correlation of data attacks. This is used to reduce the abstraction and complexity of attacks.
- Research Article
15
- 10.1038/s41598-024-71860-y
- Sep 5, 2024
- Scientific Reports
This study aimed to develop and validate a multi-modality radiomics approach using T1-weighted and diffusion tensor imaging (DTI) to differentiate Parkinson's disease (PD) motor subtypes, specifically tremor-dominant (TD) and postural instability gait difficulty (PIGD), in early disease stages. We analyzed T1-weighted and DTI scans from 140 early-stage PD patients (70 TD, 70 PIGD) and 70 healthy controls from the Parkinson's Progression Markers Initiative database. Radiomics features were extracted from 16 brain regions of interest. After harmonization and feature selection, four machine learning classifiers were trained and evaluated for both three-class (HC vs TD vs PIGD) and binary (TD vs PIGD) classification tasks. The light gradient boosting machine (LGBM) classifier demonstrated the best overall performance. For the three-class classification, LGBM achieved an accuracy of 85% and an area under the receiver operating characteristic curve (AUC) of 0.94 using combined T1 and DTI features. In the binary classification task, LGBM reached an accuracy of 95% and AUC of 0.95. Key discriminative features were identified in the Thalamus, Amygdala, Hippocampus, and Substantia Nigra for the three-group classification, and in the Pallidum, Amygdala, Hippocampus, and Accumbens for binary classification. The combined T1 + DTI approach consistently outperformed single-modality classifications, with DTI alone showing particularly low performance (AUC 0.55–0.62) in binary classification. The high accuracy and AUC values suggest that this approach could significantly improve early diagnosis and subtyping of PD. These findings have important implications for clinical management, potentially enabling more personalized treatment strategies based on early, accurate subtype identification.
- Research Article
92
- 10.1016/j.bspc.2014.02.001
- Mar 13, 2014
- Biomedical Signal Processing and Control
Pathological voice detection and binary classification using MPEG-7 audio features
- Research Article
41
- 10.1155/2021/6129210
- Dec 24, 2021
- Security and Communication Networks
Digital systems are increasingly becoming security-focused systems, and they now need sufficient security to defend against threats and attacks. An intrusion detection system can identify an anomaly from an external or internal source in the network system. Many kinds of threats are present, both active and passive. These dangers could lead to anomalies in the system through which data can be attacked and stolen by attackers on the way from source to destination. Machine learning is a developing topic with wide applications. With machine learning we can make forecasts and assign inputs to the right class. In this paper, we employed a new binary and multiclass classification model based on Convolutional Neural Networks (CNNs) to identify anomalies in the network system. In this respect, we used the NSLKDD dataset. Our model uses a CNN to conduct binary and multiclass classification. In both datasets, we build a DL-based DoS detection model. We focus on the DoS category in the most extensively used IDS dataset, KDD. CNN is the most extensively used DL model for image recognition. Adding a pooling layer to the convolution layer minimizes the size of the feature data extracted from the image while maintaining I/O and spatial information. The CNN model has shown promising results for multiclass and binary classification, with a validation loss of 0.0012 at the 11th epoch and validation accuracy of 98% and 99%, respectively.
- Book Chapter
- 10.1007/978-3-031-25194-8_3
- Jan 1, 2023
Highly over-parameterized neural networks have the ability to completely shatter the entire dataset. Despite their huge capacity, over-parameterized networks show strong algorithmic stability. In cases where semantic and localized information is completely lost, over-parameterized networks show learning. In this paper, the learning and memorizing abilities of over-parameterized neural networks on CIFAR data have been tested. Four architectures are used in this proposed work: VGG (modified), tNet, smallNet, and convNet. On binary classification, all models achieved approx. 100% training accuracy and approx. 96% test accuracy or higher. On a ten-class classification, all models achieved approximately 100% training accuracy except smallNet (80% training accuracy). VGG (modified) achieved 75% test accuracy, smallNet 78%, tNet 69%, and convNet 85%. When semantic and localized meanings are distorted by shuffling the bytes, all models achieve approximately 100% training accuracy, except smallNet (93% and 28% training accuracy), on binary and ten-class classification. On binary classification, all models achieved test accuracy of approximately 74–75% and on ten-class classification, 18–21% test accuracy, which is better than random guessing.