Abstract

Finding connected components in a graph is a well-known problem in a wide variety of application areas, such as social network analysis, data mining, and image processing. In this paper, we present an efficient and scalable approach in MapReduce to find all the connected components in a given graph. We compare our approach with the state of the art on a real-world graph. We also demonstrate the viability of our approach on a massive graph with ~6B nodes and ~92B edges on an 80-node Hadoop cluster. To the best of our knowledge, this is the largest graph publicly used in such an experiment.
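
To make the problem setting concrete, the following is a minimal sketch of a generic minimum-label propagation scheme written in a map/reduce style and simulated in memory. It is not the approach presented in this paper; the function names (`map_phase`, `reduce_phase`, `connected_components`) and the edge-list input format are assumptions chosen for illustration only.

```python
from collections import defaultdict


def map_phase(labels, edges):
    """Emit (node, candidate_label) pairs: each node's own label
    plus the current labels of its neighbours."""
    for node, label in labels.items():
        yield node, label
    for u, v in edges:
        yield u, labels[v]
        yield v, labels[u]


def reduce_phase(pairs):
    """Group candidates by node and keep the smallest label per node."""
    grouped = defaultdict(list)
    for node, label in pairs:
        grouped[node].append(label)
    return {node: min(candidates) for node, candidates in grouped.items()}


def connected_components(nodes, edges):
    """Repeat map/reduce rounds until no label changes.
    Nodes that end with the same label are in the same component."""
    labels = {n: n for n in nodes}  # every node starts as its own component
    while True:
        new_labels = reduce_phase(map_phase(labels, edges))
        if new_labels == labels:
            return labels
        labels = new_labels


if __name__ == "__main__":
    # Hypothetical toy graph: a path 1-2-3, an edge 4-5, and an isolated node 6.
    print(connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5)]))
```

In this naive scheme the number of rounds grows with the graph diameter, which is one reason scalable MapReduce approaches to this problem aim to reduce the number of iterations.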
