Abstract
The speed at which modern equipment generates and collects data means that data volumes readily exceed the available memory space, making high learning accuracy difficult to achieve. Several methods based on the discard-after-learn concept have been proposed; some were designed to cope with a single incoming datum, while others were designed for a chunk of incoming data. Although the results of these approaches are rather impressive, most of them learn new incoming data by continually adding neurons without any neuron-merging process, which clearly increases the computational time and space complexities. Only the online versatile elliptic basis function (VEBF) method introduced neuron merging to reduce the space-time complexity of learning, and only for a single incoming datum. This paper proposes a method that further enhances the discard-after-learn concept for a streaming data-chunk environment, with low computational time and neural space complexities. A set of recursive functions, based on statistical confidence intervals, is introduced for computing the relevant parameters of a new neuron. The proposed method, named streaming chunk incremental learning (SCIL), increases the plasticity and adaptability of the network structure according to the distribution of incoming data and their classes. When compared with other incremental-style methods on 11 benchmark data sets of 150 to 581,012 samples, with attributes ranging from 4 to 1,558, presented as streaming data, SCIL gave better accuracy and learning time on most data sets.
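The recursive parameter updates described above can be illustrated with a standard exact chunk-merge of a neuron's sample count, mean, and scatter matrix, which lets each data chunk be discarded after it is learned. This is a minimal sketch of the general technique, not the paper's exact VEBF/SCIL update rule; the function and variable names are our own.

```python
import numpy as np

def merge_chunk(n, mu, S, chunk):
    """Fold a data chunk into a neuron's running statistics:
    n  - number of samples absorbed so far
    mu - running mean vector
    S  - running scatter matrix (sum of outer products of deviations)
    The merge is exact: the result equals the statistics computed
    from all samples seen so far, yet the raw chunk can be discarded."""
    m = chunk.shape[0]
    mu_c = chunk.mean(axis=0)
    dev = chunk - mu_c
    S_c = dev.T @ dev                              # chunk scatter matrix
    delta = mu_c - mu
    n_new = n + m
    mu_new = mu + (m / n_new) * delta              # combined mean
    # Combined scatter: within-chunk scatter plus a between-means term.
    S_new = S + S_c + (n * m / n_new) * np.outer(delta, delta)
    return n_new, mu_new, S_new
```

Calling `merge_chunk` once per arriving chunk (starting from `n = 0`, a zero mean, and a zero matrix) keeps memory usage constant in the number of chunks, which is the property the discard-after-learn concept relies on; a covariance estimate is recovered as `S / (n - 1)` whenever needed.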
Highlights
Some data sets could not be learned by chunk incremental linear discriminant analysis (CILDA) or by the robust incremental learning (RIL) method because of a singularity problem when solving for the weight matrix
The learning time of streaming chunk incremental learning (SCIL) is slightly lower than that of the RIL method
This paper presents the streaming chunk incremental learning (SCIL) algorithm for a versatile elliptic basis function neural network (VEBFNN) to handle a stream of data chunks
Summary
Fast analysis and management of huge amounts of data by the neural computing approach is a challenging problem for current competitiveness in many research fields such as science [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18], engineering [19, 20], medicine [21,22,23], social science [24,25,26], and business [27,28,29,30]. The speed at which data are generated on the internet per unit time is tremendously faster than the number of bits that can be fabricated in a very-large-scale integration (VLSI) memory chip.
Fast incremental learning with low structural complexity for class-wise data stream classification
Data sets: https://archive.ics.uci.edu/ml/datasets/Liver+Disorders 8) Spambase https://archive.ics.uci.edu/ml/datasets/Spambase 9) Internet advertisement https://archive.ics.uci.edu/ml/datasets/Internet+Advertisements 10) MiniBooNE particle https://archive.ics.uci.edu/ml/datasets/MiniBooNE+particle+identification. Additional data are available in the Supporting Information file S1 Dataset