Ensembled Approach to Heterogeneous Data Streams

Lalit Agrawal, Dattatraya Adane

doi:10.47164/ijngc.v13i5.901

Open Access

Abstract

A principal component analysis-based decision tree forest (PDTF) can increase the diversity among base classifiers when generating a forest of decision trees, so that the trees in the forest have very low correlation with one another. In this research work, an algorithm is proposed that selects the important features from the original data by applying the PDTF algorithm to it; the selected features are then used with long short-term memory (LSTM) networks to improve the classification accuracy of heterogeneous data streams. This reduces the load on the active classification system and improves the per-record classification time. In addition to thirty-five different datasets, data feeds from the Indian National Stock Exchange are used for experimentation. This real-time data feed serves as the base for calculating the values of twenty-five technical indicators. Technical indicators statistically forecast the market movement. However, the movement of a stock is not governed by its past values alone, so it cannot be predicted with technical indicators alone. Therefore, heterogeneous data from various domains that could plausibly affect the performance of the market is also considered. The approach is evaluated against benchmark methods on a total of thirty-five datasets and live stock feeds, and the results indicate that it performs better than previously used approaches.
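As an illustration of how technical indicators might be derived incrementally from a price feed, the following is a minimal sketch only. The abstract does not name the twenty-five indicators used, so the two shown here (a simple moving average and a momentum value) are assumed examples, and the function names `sma_stream` and `momentum_stream`, the window sizes, and the sample prices are hypothetical.

```python
from collections import deque
from typing import Iterable, Iterator, Optional


def sma_stream(prices: Iterable[float], window: int) -> Iterator[Optional[float]]:
    """Yield the simple moving average after each tick (None until the window fills)."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque(maxlen=window)
    total = 0.0
    for price in prices:
        if len(buf) == window:
            # The oldest price is about to be evicted, so drop it from the running sum.
            total -= buf[0]
        buf.append(price)
        total += price
        yield total / window if len(buf) == window else None


def momentum_stream(prices: Iterable[float], lag: int) -> Iterator[Optional[float]]:
    """Yield price minus the price `lag` ticks earlier (None until enough history)."""
    if lag <= 0:
        raise ValueError("lag must be positive")
    buf = deque(maxlen=lag + 1)
    for price in prices:
        buf.append(price)
        yield price - buf[0] if len(buf) == lag + 1 else None


# Hypothetical sample ticks, not data from the paper.
ticks = [1.0, 2.0, 3.0, 4.0, 5.0]
print(list(sma_stream(ticks, 3)))       # [None, None, 2.0, 3.0, 4.0]
print(list(momentum_stream(ticks, 2)))  # [None, None, 2.0, 2.0, 2.0]
```

Both generators keep only a bounded window of past prices, which suits a streaming setting where each record is processed once as it arrives. In a pipeline such as the one described above, values like these would be one part of the feature vector passed to the feature selection step.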