Abstract

Learning from streaming data with limited memory storage has become an interesting problem. Several learning methods based on the concept of discard-after-learn have recently been proposed, but their learning speed, number of redundant neurons, and classification accuracy can be further improved. This paper proposes the following new concepts and approaches: (1) a more generic hyper-ellipsoidal structure called the Scalable Hyper-Ellipsoidal Function (SHEF), which handles the curse of dimensionality by introducing a regularization parameter into its covariance matrix; (2) a new recursive function that updates the covariance matrix of a SHEF based on only the incoming data chunk; (3) fast and simple conditions for testing whether two SHEFs overlap, lie one inside the other, or touch; (4) a new distance measure, namely the Projection Ratio, for determining the class of a queried datum based on the projected distance on only one discriminant vector. The experimental results show significant improvement over VLLDA, ILDA, LOL, VEBF, and CIL in terms of classification accuracy, the number of generated neurons, and computational time.
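To make the discard-after-learn idea in item (2) concrete, the following is a minimal sketch of how a mean and covariance can be kept up to date from chunk summaries alone, so that raw data can be dropped after each chunk. It uses the standard pairwise merge formula for means and scatter matrices, not the paper's own recursive function, which is not given in this summary. The names `ChunkStats`, `stats_of`, `merge`, and `covariance_of` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]
Matrix = List[List[float]]


@dataclass
class ChunkStats:
    """Summary kept after a chunk is discarded (hypothetical structure)."""
    n: int            # number of samples seen so far
    mean: Vector      # running mean
    scatter: Matrix   # sum of outer products of deviations from the mean


def stats_of(chunk: List[Vector]) -> ChunkStats:
    """Summarize one chunk; the raw chunk can be discarded afterwards."""
    n, d = len(chunk), len(chunk[0])
    mean = [sum(x[j] for x in chunk) / n for j in range(d)]
    scatter = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in chunk)
                for j in range(d)] for i in range(d)]
    return ChunkStats(n, mean, scatter)


def merge(a: ChunkStats, b: ChunkStats) -> ChunkStats:
    """Combine two summaries without revisiting any raw data."""
    n = a.n + b.n
    d = len(a.mean)
    delta = [b.mean[j] - a.mean[j] for j in range(d)]
    mean = [a.mean[j] + delta[j] * b.n / n for j in range(d)]
    # The cross term accounts for the shift between the two chunk means.
    w = a.n * b.n / n
    scatter = [[a.scatter[i][j] + b.scatter[i][j] + w * delta[i] * delta[j]
                for j in range(d)] for i in range(d)]
    return ChunkStats(n, mean, scatter)


def covariance_of(s: ChunkStats) -> Matrix:
    """Population covariance (divides by n) recovered from the summary."""
    return [[v / s.n for v in row] for row in s.scatter]


# Usage: learn chunk 1, discard it, then fold in chunk 2.
running = stats_of([[1.0, 2.0], [2.0, 4.0], [3.0, 5.0]])
running = merge(running, stats_of([[4.0, 4.0], [6.0, 9.0]]))
print(running.mean, covariance_of(running))
```

The merged summary matches what a single pass over all five points would give, which is the property that lets a chunk be discarded once it has been learned.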

Highlights

  • Classifying or learning streaming data has been an interesting topic in many fields such as business, academia, and medical information

  • Lursinsap: Scalable Hyper-Ellipsoidal Function (SHEF) With Projection Ratio for Local Distributed Streaming Data Classification. Traditional classification methods such as k-Nearest Neighbors (k-NN), Support Vector Machine (SVM), and Linear Discriminant Analysis (LDA) were built on the condition that the entire data set must be kept in memory during the training process

  • CONSTRAINTS AND STUDIED PROBLEMS: The proposed method in this study is based on a hyper-elliptical structure similar to those used in Versatile Elliptic Basis Function (VEBF) [22], Class-wise Incremental Learning (CIL) [23], Versatile Hyper-Elliptic Clustering (VHEC) [26], and Dynamic multi-Stratum (Dstratum) [27]


Summary

INTRODUCTION

Classifying or learning streaming data has been an interesting topic in many fields such as business (financial data [1], credit card fraud detection [2]), academia, and medical information (health care sensor [3], EEG signal [4], [5]). Traditional classification methods such as k-Nearest Neighbors (k-NN), Support Vector Machine (SVM), and Linear Discriminant Analysis (LDA) were built on the condition that the entire data set must be kept in memory during the training process. Besides the problem of initializing parameters, the procedure for identifying the class of a queried testing datum is mainly based on the distance between the datum and the center of the VEBF. This distance works well for some applications, but it gives inaccurate classification results in most cases because the data density and distribution inside the VEBF are not taken into account. Another problem not studied in those VEBF-based structures is the situation in which the number of feature dimensions is larger than the number of training data.
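The last point can be illustrated with a small sketch. When there are fewer samples than feature dimensions, the sample covariance matrix is singular, so a covariance-based (Mahalanobis-style) distance cannot be evaluated. The sketch below assumes the regularization takes the common form of adding a small constant `lam` to the diagonal; the exact form used by SHEF is not stated in this summary, and the function names are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sample_covariance(data: List[Vector]) -> Matrix:
    """Population covariance (divides by n) of a small data set."""
    n, d = len(data), len(data[0])
    mean = [sum(x[j] for x in data) / n for j in range(d)]
    return [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in data) / n
             for j in range(d)] for i in range(d)]


def regularize(cov: Matrix, lam: float) -> Matrix:
    """Assumed regularization: add lam to every diagonal entry."""
    d = len(cov)
    return [[cov[i][j] + (lam if i == j else 0.0) for j in range(d)]
            for i in range(d)]


def solve(a: Matrix, b: Vector, eps: float = 1e-12) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < eps:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mahalanobis_sq(x: Vector, center: Vector, cov: Matrix) -> float:
    """Squared distance (x - c)^T cov^{-1} (x - c), via a linear solve."""
    diff = [xi - ci for xi, ci in zip(x, center)]
    z = solve(cov, diff)
    return sum(di * zi for di, zi in zip(diff, z))


# Two training points in three dimensions: fewer samples than features.
data = [[1.0, 2.0, 3.0], [3.0, 1.0, 5.0]]
center = [2.0, 1.5, 4.0]
query = [2.0, 1.5, 5.0]
cov = sample_covariance(data)

try:
    mahalanobis_sq(query, center, cov)
except ValueError as err:
    print("unregularized:", err)

print("regularized:", mahalanobis_sq(query, center, regularize(cov, 0.1)))
```

With the diagonal term added, the matrix becomes positive definite and the distance is well defined. The value of `lam` here is arbitrary and only serves the illustration.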

CONSTRAINTS AND STUDIED PROBLEMS
BASIC CONCEPT OF STANDARD HYPER-ELLIPSOID
CONCEPT OF LDA WITH MULTIPLE CLASSES AND
CHECKING OVERLAP OF TWO HYPER-ELLIPSOIDS
UPDATING PARAMETERS OF SHEF
INITIALIZING SHEF WIDTHS AND THRESHOLD DISTANCE FOR INTRODUCING NEW SHEF
CONDITION OF INTERSECTION OF TWO SCALABLE HYPER-ELLIPSOIDS
PROPOSED LEARNING ALGORITHM OF SHEF
IDENTIFYING CLASSES OF TESTING DATA
LIMITATION OF EACH DISTANCE METHOD
NEW DISTANCE MEASURE OF SHEF PROJECTION WIDTH
DETERMINING CLASS OF QUERIED DATUM BASED ON PROJECTION RATIO DISTANCE
Findings
DISCUSSION
CONCLUSION