Abstract

This paper proposes a regression model based on Logical Analysis of Data (LAD). LAD is a combinatorial, Boolean, supervised data mining technique for pattern generation. It is used mainly for classification problems, where it has demonstrated high accuracy compared with other classification techniques. In this paper, we extend LAD to supervised data with continuous responses and derive a LAD regression model (LADR). Three discretization methods that transform the values of the response into a set of thresholds are tested. At each threshold, LAD analyzes the data as a two-class classification problem and extracts the prescriptive patterns for each class. LADR then uses the patterns generated from the original data by the cbmLAD software to fit a continuous numerical response. The result is a normalized regression model with only binary independent variables. LADR has been applied to six datasets and obtains better results than Linear Regression (LR), Support Vector Regression (SVR), Decision Tree Regression (DTR), Random Forest (RF), and Polynomial Regression (PolyR). Performance is evaluated by the Mean Square Error (MSE), Coefficient of Determination (R²), and Mean Absolute Error (MAE) under 10-fold cross-validation.
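
The thresholding step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the three discretization methods, so the equal-frequency cut points used here are an assumption, as are the function names `equal_frequency_thresholds` and `two_class_problems` and the convention that class 1 means "response above the threshold". Pattern extraction with cbmLAD and the final regression fit are not shown.

```python
from statistics import quantiles


def equal_frequency_thresholds(y, k):
    """Return k-1 cut points splitting y into k roughly equal-sized bins.

    Assumption: equal-frequency binning is only one possible discretization;
    the paper's three methods are not specified in the abstract.
    """
    return quantiles(y, n=k)


def two_class_problems(y, thresholds):
    """Build one binary labelling of the response per threshold.

    Assumed convention: label 1 if the response exceeds the threshold, else 0.
    Each labelling is a separate two-class problem that a classifier such as
    LAD could analyze.
    """
    return {t: [1 if v > t else 0 for v in y] for t in thresholds}


# Toy continuous response (illustrative values, not from the paper).
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
cuts = equal_frequency_thresholds(y, 4)
problems = two_class_problems(y, cuts)

for t, labels in problems.items():
    print(t, labels)
```

Each entry of `problems` pairs a threshold with a 0/1 label vector. In the approach the abstract describes, the patterns extracted at every threshold would then serve as the binary independent variables of the regression.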
