Abstract

Hadoop is the dominant platform for big data mining; frameworks such as YARN, Mahout, Storm, and GraphLab can be built on top of it. Spark is a widely used task-scheduling framework that combines powerful processing capability with timely data handling. It integrates machine learning, graph computation, and online learning under a unified processing mechanism, and can run dozens or even hundreds of times faster than traditional data processing methods. This paper builds a Spark framework on the Hadoop platform and verifies that Spark's processing mechanism for big data can meet the speed requirements of increasingly valuable big data applications.
