Abstract

Big Data has become increasingly significant in recent years, as millions of electronic devices generate huge amounts of data in daily life. Big Data is characterized not only by its huge volume but also by its high complexity. This paper presents a multi-dimensional matrix model for analyzing large text datasets based on attributes derived from the key words in the texts. These key words form an N-dimensional space, so the individual information can be represented by an M×N matrix. The multi-dimensional matrix approach is compared with the genetic algorithm (GA) and particle swarm optimization (PSO) to test the efficiency and effectiveness of the different approaches in analyzing text datasets. The experiments show that the proposed approach outperforms GA and PSO in sufficiency and computational cost. Some key findings are as follows. For high-dimensional Big Text Data, PSO has the best sufficiency at the beginning, from 0 to 10. After that, from 10 to 1000, the proposed multi-dimensional matrix approach significantly outperforms GA and PSO. For the Connect-4 data samples, the time cost of the proposed approach is only 352153.6 units of time, while GA takes 613601.4 (about 1.7 times as much) and PSO takes 469464.1.
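
As a rough illustration of the M×N representation mentioned above, the following sketch builds a matrix with one row per text and one column per key word. The abstract does not say how the matrix entries are computed, so the use of raw key-word counts, the whitespace tokenization, the function name keyword_matrix, and the sample texts are all assumptions made for illustration, not the paper's method.

```python
from collections import Counter


def keyword_matrix(texts, keywords):
    """Return an M x N list of lists (hypothetical helper).

    Row i holds, for each of the N key words, how often it occurs in texts[i].
    Raw counts are an assumption; the paper may weight entries differently.
    """
    matrix = []
    for text in texts:
        # Naive tokenization: lowercase and split on whitespace.
        counts = Counter(text.lower().split())
        # Counter returns 0 for key words that do not occur in the text.
        matrix.append([counts[k] for k in keywords])
    return matrix


# Made-up sample data: M = 3 texts, N = 4 key words.
texts = [
    "big data needs big storage",
    "text data analysis",
    "matrix model for text",
]
keywords = ["big", "data", "text", "matrix"]

matrix = keyword_matrix(texts, keywords)
for row in matrix:
    print(row)
```

Each row is then a point in the N-dimensional key-word space, so the whole collection of M texts can be handled as a single matrix.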
