Abstract

The categories and quantity of data are expanding exponentially with the ongoing wave of connectivity. A large number of connected devices and data sources continuously generate huge amounts of data at very high speed. This paper surveys methods that have been used for streaming data processing, such as the Naive Bayes classifier, Very Fast Decision Trees (VFDT), ensemble methods, and clustering-based methods. A recurrent neural network (RNN) is then implemented to predict the next sequence of a data stream. Three types of sequential data streams are considered: uniform rectangular data, uniform sinusoidal data, and non-uniform sinc pulse data. Several RNN architectures, namely simple RNN, RNN with long short-term memory (LSTM), RNN with gated recurrent units (GRU), and RNN optimized with a genetic algorithm (GA), are implemented for various combinations of network hyper-parameters such as the number of hidden layers, the number of neurons per layer, the activation function, and the optimizer. The optimal combination of hyper-parameters is selected using the GA. On the sample data streams, simple RNN shows better prediction accuracy than LSTM and GRU for a single-hidden-layer architecture. As the architectures get deeper, LSTM and GRU outperform simple RNN. The optimized RNN has been experimentally observed to be 78.13% faster than the single-layer LSTM architecture and 82.76% faster than the LSTM model with 4 hidden layers, at the cost of a decline in accuracy of 8.67% and 12.67%, respectively.
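
As an illustration of the prediction task described above, the following minimal sketch generates the three stream types and frames each as a next-value prediction problem. It is not the authors' code: the function names, the period, duty cycle, and pulse width, and the sliding-window framing are all assumptions made for this example, since the abstract does not specify them.

```python
import numpy as np

# All names and parameter values below are hypothetical; the abstract
# gives no signal parameters or windowing scheme.

def rectangular_stream(n, period=20, duty=0.5):
    """Uniform rectangular wave: 1.0 for the first part of each period, else 0.0."""
    t = np.arange(n)
    return ((t % period) < duty * period).astype(float)

def sinusoidal_stream(n, period=20):
    """Uniform sinusoid with the given period, measured in samples."""
    t = np.arange(n)
    return np.sin(2 * np.pi * t / period)

def sinc_stream(n, width=5.0):
    """Sinc pulse centred in the stream; its lobes decay, so amplitude is non-uniform."""
    t = np.arange(n) - n // 2
    return np.sinc(t / width)  # np.sinc(x) computes sin(pi*x) / (pi*x)

def make_windows(stream, window):
    """Slice a 1-D stream into (input window, next value) training pairs."""
    X = np.stack([stream[i:i + window] for i in range(len(stream) - window)])
    y = stream[window:]
    return X, y

# Each row of X holds `window` past samples; the matching entry of y is
# the sample that immediately follows that row in the stream.
X, y = make_windows(sinusoidal_stream(200), window=10)
```

Under these assumptions, pairs such as `X` and `y` (with a trailing feature axis added to `X`) could serve as training data for any of the recurrent architectures compared in the paper.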
