A Big Data Analytics based on Multi-dimensional Matrix for Large Text Datasets

Fan Linxiu

doi:10.14257/ijca.2016.9.4.23

Abstract

Big Data has grown in significance in recent years as millions of electronic devices generate huge amounts of data in daily life. Big Data is characterized not only by its volume but also by its high complexity. This paper presents a multi-dimensional matrix model for analyzing large text datasets based on attributes derived from the key words of the texts. These key words form an N-dimensional space, so the individual information can be represented by an M×N matrix. The multi-dimensional matrix approach is compared with the genetic algorithm (GA) and particle swarm optimization (PSO) to test the efficiency and effectiveness of the different approaches in analyzing text datasets. The experiments show that the proposed approach outperforms GA and PSO in sufficiency and computational cost. Key findings are as follows. For high-dimensional Big Text Data, PSO has the best sufficiency at the beginning, from 0 to 10; after that, from 10 to 1000, the proposed multi-dimensional matrix approach significantly outperforms GA and PSO. For the Connect-4 data samples, the time cost of the proposed approach is only 352153.6 units of time, while GA takes 613601.4 (about 1.7 times as much) and PSO takes 469464.1.
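The abstract does not say how the matrix entries are computed, so the following is only a minimal sketch of one plausible reading: M texts are scored against N key words, and each entry is a raw occurrence count. The function name build_keyword_matrix, the sample texts, the key word list, and the count-based weighting are all assumptions made for illustration and are not taken from the paper.

```python
import re
from collections import Counter


def build_keyword_matrix(texts, keywords):
    """Return an M x N matrix (list of lists).

    Row i holds, for each of the N key words, how many times that
    key word occurs in texts[i]. Raw counts are an assumption here;
    the paper may weight entries differently.
    """
    matrix = []
    for text in texts:
        # Lowercase and split into alphanumeric tokens.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts = Counter(tokens)
        # Counter returns 0 for key words that never appear.
        matrix.append([counts[k] for k in keywords])
    return matrix


# Hypothetical sample data: M = 3 texts, N = 4 key words (lowercase).
texts = [
    "Big data needs big storage",
    "Text data analysis with a matrix model",
    "Matrix storage for text",
]
keywords = ["big", "data", "matrix", "text"]

matrix = build_keyword_matrix(texts, keywords)
for row in matrix:
    print(row)
```

Each row is one text's position in the N-dimensional key word space, so rows can be compared or clustered with ordinary vector operations. Key words are assumed to be single lowercase tokens; multi-word phrases would need a different matching step.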


Journal: International Journal of Control and Automation
Publication Date: Apr 30, 2016
Citations: 14

Similar Papers

Sparse Kernel Clustering of Massive High-Dimensional Data sets with Large Number of Clusters
Radha Chitta ... Anil K Jain
18 Oct 2015

Generative Maximum Entropy Learning for Multiclass Classification
Ambedkar Dukkipati ... Paramita Koley
01 Dec 2013

A scalable and dynamic self-organizing map for clustering large volumes of text data
Sumith Matharage ... Damminda Alahakoon
01 Aug 2013

Machine learning for Big Data analytics in plants.
Chuang Ma ... Hao Helen Zhang
Trends in Plant Science | VOL. 19
14 Sep 2014
