Traditional linear dimension reduction methods such as principal component analysis (PCA) and linear discriminant analysis (LDA) have been used in many application areas due to their simplicity and high performance. However, in data streams, where data instances are generated continuously over time, it is difficult to apply traditional PCA or LDA. Moreover, the underlying concepts in a data stream can drift over time. In this paper, we compared several incremental linear dimension reduction algorithms that can be applied to classification in streaming data. We also compared their prediction accuracy and time complexity in various streaming environments, including low-dimensional data streams, high-dimensional data streams, and data streams with concept drift. Experimental results showed that the incremental least squares formulation (ILS) combined with incremental PCA can be used effectively for classification in streaming data.