Abstract

In this paper, we present a new integrated optimization model and a greedy algorithm for generating patterns in logical analysis of data (LAD) directly from the original data rather than from binarized data. Pattern generation, which follows data discretization (binarization) and support set selection when the data are non-binary, is a building block that strongly influences LAD classification. Each of these stand-alone steps is typically formulated as an optimization problem; these problems are difficult to solve and make the overall LAD procedure tedious. To this end, we propose a new mixed-integer linear program in which data discretization and support set selection are integrated into a single pattern generation model, with the goal of generating multiple logical patterns that maximally cover observations in the original data space. Furthermore, we develop a greedy search algorithm in which the optimization model is reduced and solved iteratively to generate patterns efficiently. We then examine the effectiveness of the generated patterns in both one-class and large-margin LAD classifiers. Computational results on simulated and real datasets demonstrate competitive classification accuracy and relatively short runtimes compared with previously developed pattern generation methods and other state-of-the-art machine learning algorithms.
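
The abstract describes the greedy loop only at a high level; the sketch below is a minimal, assumption-laden illustration of that reduce-and-solve idea, not the paper's method. It assumes patterns are axis-aligned intervals over numeric features and substitutes a simple interval-tightening heuristic (the hypothetical helper grow_pattern) for the reduced mixed-integer linear program solved at each greedy iteration; only NumPy is required.

```python
# Minimal sketch of greedy pattern generation for LAD over numeric data.
# Assumptions (not from the paper): patterns are axis-aligned intervals,
# and a simple interval-tightening heuristic stands in for the reduced
# mixed-integer linear program solved at each greedy iteration.
import numpy as np

def grow_pattern(X_pos, X_neg, seed_idx):
    """Build one interval pattern (per-feature bounds) around a seed
    positive observation so that no negative observation is covered."""
    lo = X_pos.min(axis=0).astype(float)
    hi = X_pos.max(axis=0).astype(float)
    seed = X_pos[seed_idx]
    for neg in X_neg:
        if np.all((lo <= neg) & (neg <= hi)):      # neg still covered
            # Cut on the feature where seed and neg are farthest apart;
            # bounds only shrink, so earlier exclusions are preserved.
            j = int(np.argmax(np.abs(seed - neg)))
            if neg[j] > seed[j]:
                hi[j] = (seed[j] + neg[j]) / 2
            else:
                lo[j] = (seed[j] + neg[j]) / 2
    return lo, hi

def covers(lo, hi, X):
    """Boolean mask of the rows of X that fall inside the pattern."""
    return np.all((lo <= X) & (X <= hi), axis=1)

def greedy_patterns(X_pos, X_neg, max_patterns=10):
    """Generate patterns until all positive observations are covered
    or the pattern budget is exhausted."""
    uncovered = np.ones(len(X_pos), dtype=bool)
    patterns = []
    while uncovered.any() and len(patterns) < max_patterns:
        seed_idx = int(np.flatnonzero(uncovered)[0])
        lo, hi = grow_pattern(X_pos, X_neg, seed_idx)
        newly = covers(lo, hi, X_pos) & uncovered
        if not newly.any():                        # defensive stop
            break
        uncovered &= ~newly
        patterns.append((lo, hi))
    return patterns

# Toy usage on two well-separated Gaussian clouds.
rng = np.random.default_rng(0)
pats = greedy_patterns(rng.normal(0.0, 1.0, (30, 4)),
                       rng.normal(3.0, 1.0, (30, 4)))
print(f"{len(pats)} pattern(s) cover the positive class")
```

Replacing grow_pattern with a call to an optimization solver over candidate cut points would move this sketch closer to the integrated model the paper proposes; as written, it only conveys the structure of the greedy covering loop.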
