Number Of Training Examples Research Articles

Many recent developments in machine learning have come from the field of “deep learning,” or the use of advanced neural network architectures and techniques. While these methods have produced state-of-the-art results and dominated research focus in many fields, such as image classification and natural language processing, they have not gained as much ground over standard multivariate pattern analysis (MVPA) techniques in the classification of electroencephalography (EEG) or other human neuroscience datasets. The high dimensionality and large amounts of noise present in EEG data, coupled with the relatively low number of examples (trials) that can be reasonably obtained from a sample of human subjects, lead to difficulty training deep learning models. Even when a model successfully converges in training, significant overfitting can occur despite the presence of regularization techniques. To help alleviate these problems, we present a new method of “paired trial classification” that involves classifying pairs of EEG recordings as coming from the same class or different classes. This allows us to drastically increase the number of training examples, in a manner akin to but distinct from traditional data augmentation approaches, through the combinatorics of pairing trials. Moreover, paired trial classification still allows us to determine the true class of a novel example (trial) via a “dictionary” approach: compare the novel example to a group of known examples from each class, and determine the final class via summing the same/different decision values within each class. Since individual trials are noisy, this approach can be further improved by comparing a novel individual example with a “dictionary” in which each entry is an average of several examples (trials). Even further improvements can be realized in situations where multiple samples from a single unknown class can be averaged, thus permitting averaged signals to be compared with averaged signals.
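The pairing and "dictionary" steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `make_pairs` and `dictionary_classify` are hypothetical, and `neg_sq_distance` is only a stand-in for the trained same/different network, whose architecture the abstract does not specify.

```python
from itertools import combinations

import numpy as np


def make_pairs(labels):
    """Build all unordered trial pairs and their same/different targets.

    n trials yield n * (n - 1) / 2 pairs, which is where the increase in
    training examples comes from. Target 1 means "same class", 0 "different".
    """
    index_pairs = list(combinations(range(len(labels)), 2))
    targets = [int(labels[i] == labels[j]) for i, j in index_pairs]
    return index_pairs, targets


def dictionary_classify(novel, dictionary, same_score):
    """Assign a class to `novel` by summing same/different decision values.

    dictionary: maps class label -> list of known trials (or averaged trials).
    same_score: callable(a, b) -> float, higher means "more likely same class".
                In the paper this role is played by a trained model; here it
                is an assumed, pluggable function.
    """
    totals = {
        label: sum(same_score(novel, ref) for ref in refs)
        for label, refs in dictionary.items()
    }
    return max(totals, key=totals.get), totals


def neg_sq_distance(a, b):
    """Hypothetical stand-in scorer: closer signals score higher."""
    return -float(np.sum((np.asarray(a) - np.asarray(b)) ** 2))
```

With 100 trials, `make_pairs` produces 4,950 pairs. Dictionary entries can be single trials or averages of several trials; the voting code is the same either way, since only the contents of `dictionary` change.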


Abstract

Purpose: As more digital collections of various information resources become available, the challenge of assigning subject index terms and classes from quality knowledge organization systems is also growing. While the ultimate purpose is to understand the value of automatically produced Dewey Decimal Classification (DDC) classes for Swedish digital collections, the paper aims to evaluate the performance of six machine learning algorithms as well as a string-matching algorithm based on characteristics of DDC.

Design/methodology/approach: State-of-the-art machine learning algorithms require at least 1,000 training examples per class. The complete data set at the time of research involved 143,838 records, which had to be reduced to the top three hierarchical levels of DDC in order to provide sufficient training data (totaling 802 classes in the training and testing sample, out of 14,413 classes at all levels).

Findings: Evaluation shows that Support Vector Machine with linear kernel outperforms the other machine learning algorithms as well as the string-matching algorithm on average; the string-matching algorithm outperforms machine learning for specific classes when characteristics of DDC are most suitable for the task. Word embeddings combined with different types of neural networks (simple linear network, standard neural network, 1D convolutional neural network, and recurrent neural network) produced worse results than Support Vector Machine, but came close, with the benefit of a smaller representation size. Analysis of the impact of features in machine learning shows that using keywords, or combining titles and keywords, gives better results than using only titles as input. Stemming only marginally improves the results. Removing stop-words reduced accuracy in most cases, while removing less frequent words increased it marginally. The greatest impact is produced by the number of training examples: 81.90% accuracy on the training set is achieved when at least 1,000 records per class are available in the training set, and 66.13% when too few records (often fewer than 100 per class) are available on which to train. These figures hold only for the top three hierarchical levels (803 instead of 14,413 classes).

Research limitations: Having to reduce the number of hierarchical levels to the top three levels of DDC, because of the lack of training data for all classes, skews the results so that they work in experimental conditions but barely for end users in operational retrieval systems.

Practical implications: For operative information retrieval systems, applying purely automatic DDC classification does not work, either using machine learning (because of the lack of training data for the large number of DDC classes) or using the string-matching algorithm (because DDC characteristics perform well for automatic classification only in a small number of classes). Over time, more training examples may become available, and DDC may be enriched with synonyms in order to enhance the accuracy of automatic classification, which may also benefit information retrieval performance based on DDC. In order for quality information services to reach the objective of the highest possible precision and recall, automatic classification should never be implemented on its own; instead, machine-aided indexing that combines the efficiency of automatic suggestions with the quality of human decisions at the final stage should be the way for the future.

Originality/value: The study explored machine learning on a large classification system of over 14,000 classes which is used in operational information retrieval systems. Due to the lack of sufficient training data across the entire set of classes, a complementary approach, string matching, was applied. This combination should be explored further since it provides the potential for real-life applications with large target classification systems.
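The data-reduction step in the methodology (collapsing classes to the top three DDC levels, then checking how many records each class has) can be illustrated as follows. This is a rough sketch, not the study's pipeline: the helper names `truncate_ddc` and `filter_by_support` are hypothetical, and it assumes that the top three hierarchical levels correspond to the first three digits of a DDC notation.

```python
from collections import Counter


def truncate_ddc(notation, levels=3):
    """Keep the first `levels` digits of a DDC notation.

    Assumption: the top three hierarchical levels are the three digits
    before the decimal point, e.g. "839.738" -> "839".
    """
    integer_part = notation.split(".")[0]
    return integer_part[:levels]


def filter_by_support(records, min_per_class):
    """Keep only records whose (truncated) class has enough examples.

    records: list of (text, ddc_notation) tuples.
    Returns the kept records and the per-class counts after truncation.
    """
    truncated = [(text, truncate_ddc(ddc)) for text, ddc in records]
    counts = Counter(cls for _, cls in truncated)
    kept = [(text, cls) for text, cls in truncated if counts[cls] >= min_per_class]
    return kept, counts
```

Setting `min_per_class=1000` mirrors the threshold the abstract cites for state-of-the-art algorithms; lowering it keeps more classes at the cost of fewer examples per class.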


Articles published on Number Of Training Examples

Efficiently approximating selectivity functions using low overhead regression models

AMP0: Species-Specific Prediction of Anti-microbial Peptides Using Zero and Few Shot Learning.

Deep learning for symbols detection and classification in engineering drawings

Analysis of the Impact of Removal of Aftershocks from Catalogs on the Effectiveness of Systematic Earthquake Prediction

Learning transferable features in meta-learning for few-shot text classification

Detection of liner surface defects in solid rocket motors using multilayer perceptron neural networks

Paired Trial Classification: A Novel Deep Learning Technique for MVPA.

Task-Agnostic Object Recognition for Mobile Robots through Few-Shot Image Matching

Small Clues Tell: a Collaborative Expansion Approach for Effective Content-Based Recommendations

Feed-forward versus recurrent architecture and local versus cellular automata distributed representation in reservoir computing for sequence memory learning

Automatic Classification of Swedish Metadata Using Dewey Decimal Classification: A Comparison of Approaches

Sentiment analysis of Twitter data during critical events through Bayesian networks classifiers

On the Importance of Visual Context for Data Augmentation in Scene Understanding.

Deterministic dropout for deep neural networks using composite random forest

High-Dimensional Nonconvex Stochastic Optimization by Doubly Stochastic Successive Convex Approximation

Data Augmentation with Suboptimal Warping for Time-Series Classification.

Dealing with class imbalance in classifier chains via random undersampling

BCT Boost Segmentation with U-net in TensorFlow

A machine-learning based ensemble method for anti-patterns detection

Loss-Sensitive Generative Adversarial Networks on Lipschitz Densities

