Number Of Training Instances Research Articles

Quantitative attributes are usually discretized in naive-Bayes learning. We establish simple conditions under which discretization is equivalent to using the true probability density function during naive-Bayes learning. The use of different discretization techniques can be expected to affect the classification bias and variance of the generated naive-Bayes classifiers, effects we name discretization bias and variance. We argue that by properly managing discretization bias and variance, we can effectively reduce naive-Bayes classification error. In particular, we supply insights into managing discretization bias and variance by adjusting the number of intervals and the number of training instances contained in each interval. We accordingly propose proportional discretization and fixed frequency discretization, two efficient unsupervised discretization methods that are able to effectively manage discretization bias and variance. We evaluate our new techniques against four key discretization methods for naive-Bayes classifiers. The experimental results support our theoretical analyses by showing that, with statistically significant frequency, naive-Bayes classifiers trained on data discretized by our new methods achieve lower classification error than those trained on data discretized by established discretization methods.
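
The abstract above names the two methods but does not give their formulas. As a rough illustration only, the following Python sketch assumes the commonly cited definitions: proportional discretization sets both the number of intervals and the number of instances per interval to about the square root of the number of training instances, and fixed frequency discretization puts a fixed number m of instances in each interval. The function names, the default m=30, and the handling of ties and of the last interval are assumptions made for this sketch, not details taken from the abstract.

```python
import math
from bisect import bisect_right


def equal_frequency_cut_points(values, interval_size):
    """Return cut points so that each interval holds about `interval_size` values.

    The last interval absorbs any remainder, so it may hold more values.
    """
    ordered = sorted(values)
    cuts = []
    for i in range(interval_size, len(ordered) - interval_size + 1, interval_size):
        # Cut midway between neighbouring values; skip ties so that
        # identical values never fall on both sides of a boundary.
        if ordered[i - 1] != ordered[i]:
            cuts.append((ordered[i - 1] + ordered[i]) / 2.0)
    return cuts


def proportional_cut_points(values):
    """Assumed rule: interval size and interval count both grow as sqrt(n)."""
    size = max(1, math.isqrt(len(values)))
    return equal_frequency_cut_points(values, size)


def fixed_frequency_cut_points(values, m=30):
    """Assumed rule: every interval holds about m instances, whatever n is."""
    return equal_frequency_cut_points(values, m)


def interval_index(value, cuts):
    """Map a quantitative value to the index of the interval that contains it."""
    return bisect_right(cuts, value)
```

With 100 training values, the proportional variant yields 10 intervals of 10 instances each, while the fixed frequency variant with m=30 yields three intervals. As more training data arrives, the first grows both quantities together and the second grows only the number of intervals, which is one way to read the bias and variance trade-off the abstract describes.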

In pattern recognition, instance-based learning (also known as the nearest neighbor rule) has become increasingly popular and can yield excellent performance. In instance-based learning, however, the storage required for the training set grows with the number of training instances. Moreover, classifying a new, unseen instance takes a long time because all training instances have to be considered when determining the ‘nearness’ or ‘similarity’ among instances. This study presents a novel reduced classification method for instance-based learning based on the gray relational structure, in which only some of the training instances in the original training set are used for pattern classification. The relationships among instances are first determined according to the gray relational structure. From this structure, the inward edges of each training instance can be obtained; they indicate how many times the instance is considered as the nearest neighbor or neighbors in determining the class labels of other instances. The method excludes training instances with no or few inward edges from pattern classification. With the proposed instance pruning approach, new instances can be classified using only a few training instances. Nine data sets are used to demonstrate the performance of the proposed learning approach. Experimental results indicate that classification accuracy can be maintained when most of the training instances are pruned before learning. Additionally, the number of training instances remaining under the proposed method is comparable to that of other existing instance pruning techniques.
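
The inward-edge idea in the abstract above can be sketched in a few lines of Python. The abstract does not define the gray relational structure, so this sketch swaps in plain Euclidean k-nearest neighbors as the similarity measure; that substitution, the function names, and the `k` and `min_inward` parameters are assumptions made for illustration and are not the authors' method.

```python
import math


def inward_edge_counts(points, k=1):
    """Count how often each point is among the k nearest neighbours of another point."""
    counts = [0] * len(points)
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p
        # (a stand-in for the gray relational grade, which is not given here).
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            counts[j] += 1  # point j receives an inward edge from point i
    return counts


def prune_instances(points, labels, k=1, min_inward=1):
    """Keep only the instances that receive at least `min_inward` inward edges."""
    counts = inward_edge_counts(points, k)
    kept = [i for i, c in enumerate(counts) if c >= min_inward]
    return [points[i] for i in kept], [labels[i] for i in kept]
```

For example, with the points (0, 0), (1, 0), (10, 0), (11, 0), and (30, 0) and k=1, the isolated point (30, 0) is nobody's nearest neighbor, so it receives no inward edges and is pruned, while the other four are kept.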

Articles published on Number Of Training Instances

Improving English verb sense disambiguation performance with linguistically motivated features and clear sense distinction boundaries

Discretization for naive-Bayes learning: managing discretization bias and variance

A novel gray-based reduced NN classification method

PARALLEL NEURAL LEARNING FOR CONTROL PROBLEMS ON A BUS-BASED ARCHITECTURE

A Nearly Optimal Back-Propagation Learning Algorithm on a Bus-Based Architecture

IGTree: using trees for compression and classification in lazy learning algorithms

An efficient inductive learning method for object-oriented database using attribute entropy

Learning capacity and sample complexity on expert networks

Learning concepts in parallel based upon the strategy of version space

Parallel perceptron learning on a single-channel broadcast communication model
