Abstract

This paper addresses the computational challenges of identifying disjunctive Boolean patterns in high-dimensional data. We focus our analysis on microarray gene expression data, one of the most typical examples of high-dimensional data. We devise a novel algorithm that takes advantage of the scarcity of samples in microarray data sets to find disjunctive closed patterns efficiently. Our algorithm, Disclosed, mines disjunctive closed itemsets by exploring the search space in a depth-first, top-down manner. We evaluated its performance on real, publicly available microarray gene expression data sets. Our experiments revealed the situations, in terms of the characteristics of a data set, in which our method achieves good, average, or poor performance. We also compared our method with state-of-the-art algorithms for finding disjunctive closed patterns and disjunctive minimal generators. We observed that our approach is two orders of magnitude more efficient in terms of both time and memory.
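To make the central notion concrete, the following minimal sketch computes the disjunctive support and disjunctive closure of an itemset under the usual definitions: the disjunctive support of an itemset is the number of samples containing at least one of its items, and its disjunctive closure is the largest itemset with the same support. This is not the Disclosed algorithm and performs no search; the function names, the gene labels, and the toy data are assumptions made purely for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Set


def disjunctive_cover(itemset: Iterable[str], tidsets: Dict[str, Set[int]]) -> Set[int]:
    """Return the samples that contain at least one item of the itemset."""
    cover: Set[int] = set()
    for item in itemset:
        cover |= tidsets[item]
    return cover


def disjunctive_support(itemset: Iterable[str], tidsets: Dict[str, Set[int]]) -> int:
    """Number of samples in which at least one item of the itemset occurs."""
    return len(disjunctive_cover(itemset, tidsets))


def disjunctive_closure(itemset: Iterable[str], tidsets: Dict[str, Set[int]]) -> FrozenSet[str]:
    """Largest itemset with the same disjunctive cover as the input.

    An item can be added without changing the cover exactly when every
    sample in which it occurs is already covered.
    """
    cover = disjunctive_cover(itemset, tidsets)
    return frozenset(item for item, tids in tidsets.items() if tids <= cover)


# Hypothetical toy data: each gene maps to the set of sample ids where it is expressed.
toy_tidsets = {
    "g1": {0, 1},
    "g2": {1},
    "g3": {2},
    "g4": {0, 2},
}

closure_g1 = disjunctive_closure({"g1"}, toy_tidsets)
support_g1 = disjunctive_support({"g1"}, toy_tidsets)
closure_g1_g3 = disjunctive_closure({"g1", "g3"}, toy_tidsets)
```

In this toy data, `g2` occurs only in a sample already covered by `g1`, so it belongs to the closure of `{g1}`. An itemset is disjunctive closed when it equals its own closure. With few samples, as in microarray data, there are few distinct covers, which is the kind of scarcity the abstract refers to.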
