Abstract

This paper proposes a double parallel cloud computing framework based on the GPU (Graphics Processing Unit) and MapReduce. The framework targets the low efficiency of text classification algorithms when they process large data sets on a stand-alone machine. It constructs an adaptive double parallel computation process, combines it with the advantages of an improved TFIDF (term frequency-inverse document frequency) algorithm, and thereby yields a TFIDF text categorization algorithm with double parallel adaptive computing. The efficiency of the improved TFIDF algorithm is compared across different operating environments and different numbers of computing nodes. The results show that the improved TFIDF algorithm with double parallel adaptive computing processes massive data effectively and at high speed, and that its execution efficiency improves as the number of nodes increases.
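
For orientation, the textbook TFIDF weighting can be phrased as a map step followed by a reduce step. The sketch below is a minimal single-process illustration in Python, not the improved algorithm or the GPU/MapReduce framework described above, whose details the abstract does not give. The function names `map_term_counts` and `reduce_tfidf`, the whitespace tokenizer, and the two toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def map_term_counts(doc_id, text):
    """Map step (hypothetical name): emit ((doc_id, term), count) per document."""
    counts = Counter(text.lower().split())  # assumption: whitespace tokenization
    return [((doc_id, term), n) for term, n in counts.items()]

def reduce_tfidf(mapped, num_docs):
    """Reduce step (hypothetical name): turn counts into standard TF-IDF weights."""
    doc_len = defaultdict(int)   # total number of terms in each document
    doc_freq = defaultdict(int)  # number of documents containing each term
    for (doc_id, term), n in mapped:
        doc_len[doc_id] += n
        doc_freq[term] += 1      # each (doc_id, term) pair is emitted once
    weights = {}
    for (doc_id, term), n in mapped:
        tf = n / doc_len[doc_id]
        idf = math.log(num_docs / doc_freq[term])
        weights[(doc_id, term)] = tf * idf
    return weights

# Toy corpus (illustrative data, not from the paper).
docs = {"d1": "gpu cloud gpu", "d2": "cloud text"}
mapped = [pair for doc_id, text in docs.items()
          for pair in map_term_counts(doc_id, text)]
weights = reduce_tfidf(mapped, len(docs))
```

In this toy corpus, "cloud" occurs in both documents and so receives weight zero, while "gpu" and "text" each occur in only one document and receive positive weights. In a real MapReduce deployment the map calls would run on separate nodes and the reduce step would be partitioned by key; this sketch only shows the data flow.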
