Abstract

Formal Concept Analysis (FCA) is an applied mathematical technique for data analysis that identifies the relations between objects and attributes. It introduces the notion of concepts and their hierarchical structure, from which a set of implications between attributes that characterize a knowledge domain can be obtained. The volume of information to be processed makes FCA difficult to use in domains with a high number of dimensions, creating a demand for new solutions and algorithms for FCA applications. This article explores different approaches to extracting proper implications from high-dimensional contexts, using constraints to obtain the set of implication rules. We propose algorithms that use a data structure called a Binary Decision Diagram (BDD) to represent the formal context compactly, which allows it to be manipulated more efficiently. We also propose a heuristic that avoids generating unnecessary premises when computing proper implications. In addition, we implemented a parallel computing model for generating the implications. To analyze the proposed algorithms, we used synthetic contexts with varying numbers of objects, attributes, and density. The results showed speedups of up to 22 times compared to solutions proposed in the literature, such as Impec and PropIm.

Highlights

  • Extracting knowledge from the large volumes of data currently collected and stored is infeasible without automation and techniques devised for this purpose

  • For a high-dimensional context with a large number of objects, where the main goal is to find proper implications with support greater than 0 for a given subset of desirable conclusions that describe a specific domain, we propose two algorithms based on the original proposal of [18]: I) the first algorithm, ImplicP, contains a heuristic that avoids generating unnecessary premise sets, evaluated based on the notion of monotonic constraints; II) the second algorithm, PImplicPBDD, adds a Binary Decision Diagram (BDD) structure to represent and manipulate the formal context efficiently, and a parallel computing model to process several conclusions simultaneously

  • Formal Concept Analysis (FCA) is a field of mathematics that allows the identification of formal concepts and implications, which are extracted from a formal context [2]
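
To make these terms concrete, the following is a minimal sketch of a formal context and the two derivation operators from which concepts and implications are read off. The toy context and all function names are assumptions introduced here for illustration; they do not come from the paper.

```python
from typing import Dict, Set

# Hypothetical toy formal context: each object maps to the attributes it has.
CONTEXT: Dict[str, Set[str]] = {
    "frog": {"aquatic", "terrestrial", "legs"},
    "fish": {"aquatic"},
    "dog": {"terrestrial", "legs", "fur"},
    "cat": {"terrestrial", "legs", "fur"},
}


def extent(context: Dict[str, Set[str]], attrs: Set[str]) -> Set[str]:
    """Objects that have every attribute in `attrs`."""
    return {obj for obj, has in context.items() if attrs <= has}


def intent(context: Dict[str, Set[str]], objs: Set[str]) -> Set[str]:
    """Attributes shared by every object in `objs`."""
    all_attrs = set().union(*context.values())
    return {a for a in all_attrs if all(a in context[o] for o in objs)}


def is_concept(context, objs: Set[str], attrs: Set[str]) -> bool:
    """A pair (objs, attrs) is a formal concept when each side determines the other."""
    return extent(context, attrs) == objs and intent(context, objs) == attrs


def holds(context, premise: Set[str], conclusion: Set[str]) -> bool:
    """premise -> conclusion holds when every object with the premise also has the conclusion."""
    return extent(context, premise) <= extent(context, conclusion)


def support(context, premise: Set[str], conclusion: Set[str]) -> int:
    """Number of objects that have both the premise and the conclusion."""
    return len(extent(context, premise | conclusion))
```

In this toy context, `holds(CONTEXT, {"fur"}, {"legs"})` is true because both furry objects have legs, while `holds(CONTEXT, {"aquatic"}, {"legs"})` is false because the fish is aquatic but has no legs.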


Summary

INTRODUCTION

Extracting knowledge from the large volumes of data currently collected and stored is infeasible without automation and techniques devised for this purpose. For a high-dimensional context with a large number of objects (up to 100,000), where the main goal is to find proper implications with support greater than 0 for a given subset of desirable conclusions that describe a specific domain, we propose two algorithms based on the original proposal of [18]: I) the first algorithm, ImplicP, contains a heuristic that avoids generating unnecessary premise sets, evaluated based on the notion of monotonic constraints; II) the second algorithm, PImplicPBDD, adds a Binary Decision Diagram (BDD) structure to represent and manipulate the formal context efficiently, and a parallel computing model to process several conclusions simultaneously.
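
The idea of skipping unnecessary premise sets can be illustrated with a small brute-force sketch. This is not ImplicP or PImplicPBDD: it uses a plain dictionary rather than a BDD, runs sequentially, and assumes that a proper implication is one whose premise is minimal for a single-attribute conclusion. The toy context and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy formal context: each object maps to the attributes it has.
CONTEXT: Dict[str, Set[str]] = {
    "frog": {"aquatic", "terrestrial", "legs"},
    "fish": {"aquatic"},
    "dog": {"terrestrial", "legs", "fur"},
    "cat": {"terrestrial", "legs", "fur"},
}


def extent(context: Dict[str, Set[str]], attrs) -> Set[str]:
    """Objects that have every attribute in `attrs`."""
    return {obj for obj, has in context.items() if attrs <= has}


def minimal_premises(context: Dict[str, Set[str]], conclusion: str) -> List[FrozenSet[str]]:
    """Minimal premises P with P -> conclusion and support greater than 0.

    Candidates are generated level by level (smallest premises first).
    A candidate is skipped without touching the context when it contains
    - a premise already found (it could not be minimal), or
    - a premise with empty extent (its support stays 0 for every superset).
    """
    attrs = sorted(set().union(*context.values()) - {conclusion})
    target = extent(context, {conclusion})
    found: List[FrozenSet[str]] = []  # minimal premises found so far
    dead: List[FrozenSet[str]] = []   # premises shared by no object
    for size in range(len(attrs) + 1):
        for combo in combinations(attrs, size):
            premise = frozenset(combo)
            if any(q <= premise for q in found) or any(q <= premise for q in dead):
                continue  # pruned: no extent computation needed
            objs = extent(context, premise)
            if not objs:
                dead.append(premise)
            elif objs <= target:
                found.append(premise)
    return found


RESULT = minimal_premises(CONTEXT, "legs")
```

Because the search for each conclusion is independent of the others, the calls to `minimal_premises` for different conclusions could in principle be distributed across workers, which is the kind of parallelism over conclusions that the text describes.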
