Abstract

Searching for interesting patterns in large data sources plays a vital part in data science, and several techniques can be applied to extract the required patterns. Formal Concept Analysis (FCA) is one such mathematical method for data analysis; it is growing in popularity across distinct domains and is used to find interesting patterns in binary matrices. Several algorithms have been designed for computing patterns in binary matrices, but their computational complexity makes them unsuitable for large datasets. This paper presents an efficient concept analysis algorithm that finds neighbors and constructs a concept lattice using the MapReduce framework, for dense formal contexts of high dimensionality in terms of the number of objects. It does so by modifying Lindig's upper neighbor algorithm for generating a concept lattice. The new MapReduce algorithm, named UNConceptGeneration, is implemented using Map and Reduce processes that generate the formal concepts needed to construct the concept lattice. The Map process distributes the dense context across various map tasks and computes the concepts; the Reduce process aggregates all the generated concepts to produce the final outcome. The paper also addresses how network delays and data transfer rate limitations are overcome during concept lattice generation, which leads to faster concept analysis. It further describes Hadoop's fault tolerance, one of the key features that motivated the choice of MapReduce for implementing this algorithm in a distributed environment. The experimental results demonstrate that, for sufficiently large datasets, the proposed approach performs comparably to or better than the other parallel algorithms that were tested. A detailed comparison with other existing parallel and traditional algorithms is also presented.
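As a rough illustration of the neighbor-finding step that the abstract builds on, the following Python sketch applies Lindig's upper neighbor procedure to a tiny context and grows the lattice in map-like and reduce-like rounds. It is not the UNConceptGeneration algorithm itself and it does not use Hadoop: the toy context and all names (CONTEXT, intent, extent, upper_neighbors, build_lattice) are assumptions made for illustration, and the rounds run locally in a single process.

```python
from itertools import chain

# Hypothetical toy formal context: object -> set of attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"b", "c"},
    "o3": {"a", "b", "c"},
}
OBJECTS = frozenset(CONTEXT)
ATTRIBUTES = frozenset(chain.from_iterable(CONTEXT.values()))


def intent(objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(ATTRIBUTES)
    for g in objs:
        shared &= CONTEXT[g]
    return frozenset(shared)


def extent(attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(g for g in OBJECTS if attrs <= CONTEXT[g])


def upper_neighbors(concept):
    """Lindig-style upper neighbors of a concept given as (extent, intent)."""
    A, _ = concept
    minimal = set(OBJECTS - A)  # objects still considered minimal generators
    neighbors = []
    for g in sorted(OBJECTS - A):
        B1 = intent(A | {g})
        A1 = extent(B1)
        if minimal & (A1 - A - {g}):
            # g generates a concept that is not a direct neighbor
            minimal.discard(g)
        else:
            neighbors.append((A1, B1))
    return neighbors


def build_lattice():
    """Grow the lattice upward from the bottom concept, one round per level."""
    bottom_extent = extent(ATTRIBUTES)
    bottom = (bottom_extent, intent(bottom_extent))
    concepts = {bottom}
    edges = set()
    frontier = [bottom]
    while frontier:
        # Map-like step: each frontier concept is processed independently.
        emitted = [(c, n) for c in frontier for n in upper_neighbors(c)]
        # Reduce-like step: merge the emitted pairs and drop duplicate concepts.
        frontier = []
        for c, n in emitted:
            edges.add((c, n))
            if n not in concepts:
                concepts.add(n)
                frontier.append(n)
    return concepts, edges


if __name__ == "__main__":
    all_concepts, cover_edges = build_lattice()
    for ext, itt in sorted(all_concepts, key=lambda c: len(c[0])):
        print(sorted(ext), sorted(itt))
```

Because each call to upper_neighbors depends only on the context and a single concept, the map-like step is the part that could in principle be spread across independent tasks, with the merging and deduplication left to a reduce-like step.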
