Abstract

Background: As massive data acquisition and storage become increasingly affordable, a wide variety of enterprises are engaged in sophisticated data analysis. The amount of digital information produced, most of it unstructured, is growing day by day. Method: The MapReduce programming model is readily applicable to many different learning algorithms. Machine learning is at the core of data analysis. Traditional machine learning algorithms that fit the statistical query model can be sped up on multicore computers. Hadoop avoids data sharing between nodes, whereas machine learning algorithms typically need the data to be stored in a single place. The method compares machine learning algorithms on the MapReduce paradigm in order to evaluate their speed. Findings: The MapReduce programming model enables easy development of scalable parallel applications that process large data sets. MapReduce jobs run over the Hadoop Distributed File System, which significantly influences performance when handling huge data sets stored on different nodes of a multi-node cluster. This paper analyses the development of machine learning algorithms on Hadoop to process large data sets. Logistic regression algorithms on MapReduce are analysed, and a cost model is developed to evaluate performance and speed up processing. The attributes of the system are evaluated with a view to improving time efficiency. The objective is to provide ad hoc performance for MapReduce programs that run on large data sets. Improvements: A method for optimizing job assignment for machine learning workloads is implemented in order to minimize the total execution time. This improves the productivity of MapReduce users and helps them optimize performance.
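The idea of fitting logistic regression to the MapReduce paradigm can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: it runs in a single Python process rather than on Hadoop, and the function names (`map_gradient`, `reduce_gradients`, `train`), the learning rate, and the epoch count are all assumptions made for illustration. It only shows why a statistical-query-style algorithm parallelizes well: the gradient is a sum over examples, so each data shard can compute a partial sum independently (map) and the partial sums can then be added (reduce).

```python
import math
from functools import reduce


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def map_gradient(shard, weights):
    """Map step (hypothetical name): partial log-likelihood gradient for one shard.

    Each shard is a list of (features, label) pairs held by one node.
    """
    grad = [0.0] * len(weights)
    for features, label in shard:
        pred = sigmoid(sum(w * x for w, x in zip(weights, features)))
        for j, x in enumerate(features):
            grad[j] += (label - pred) * x
    return grad


def reduce_gradients(a, b):
    """Reduce step (hypothetical name): element-wise sum of two partial gradients."""
    return [x + y for x, y in zip(a, b)]


def train(shards, n_features, lr=0.1, epochs=200):
    """Batch gradient ascent; lr and epochs are illustrative assumptions."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        # On a cluster, each map call would run on the node storing the shard.
        partials = [map_gradient(shard, weights) for shard in shards]
        total = reduce(reduce_gradients, partials)
        weights = [w + lr * g for w, g in zip(weights, total)]
    return weights


def predict(weights, features):
    """Return the predicted class (0 or 1) for one feature vector."""
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if sigmoid(score) >= 0.5 else 0


if __name__ == "__main__":
    # Toy data: first feature is a constant bias term, label is 1 when x > 0.
    shard_a = [([1.0, -2.0], 0), ([1.0, -1.0], 0)]
    shard_b = [([1.0, 1.0], 1), ([1.0, 2.0], 1)]
    w = train([shard_a, shard_b], n_features=2)
    print(predict(w, [1.0, -1.5]), predict(w, [1.0, 1.5]))
```

In this sketch only the weight vector and the per-shard partial gradients move between steps, while the training examples stay where they are stored. On a real cluster, the cost of launching one job per iteration would be a factor that a cost model has to take into account.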
