Abstract

With the development of artificial intelligence and big data technologies, machine learning has proven successful in a wide range of applications over the past decade. However, choosing an appropriate machine learning model typically involves a trade-off between interpretability and predictive accuracy. Simple, interpretable models such as decision trees and linear regression usually sacrifice accuracy, whereas complex models such as neural networks and kernel-based support vector machines (SVMs) generally achieve high accuracy but are difficult to interpret and provide little insight into the data. This dilemma motivates the theme of this dissertation: developing critical pattern recognition techniques that improve the performance of interpretable machine learning models via mathematical optimization, and applying them to real-life data-driven applications.

In the first part, we investigate logical analysis of data (LAD), a data mining methodology that combines ideas and concepts from optimization, combinatorics, and Boolean functions. We present a new integrated optimization model and a greedy algorithm for generating LAD patterns directly from the original data rather than from binarized data.

In the second part, we study sequential pattern generation for time-series data. Building on the LAD framework, we propose two optimization models and randomized search algorithms that find a set of logical sequential patterns maximizing the coverage of samples in the target class. In the first scenario, we assume the critical patterns appear synchronously in the time horizon across all observations; in the second scenario, we allow them to appear asynchronously.

In the third part, we focus on pattern generation for time-series data based on the concept of dynamic time warping (DTW). As massive amounts of data are generated over time, various computational methods have been proposed for handling time-series data; among them, DTW is regarded as an effective approach for measuring similarity between time-series (or temporal) data that may vary in speed and sequence length. We present new mathematical programming formulations and solution approaches for DTW and for sequential pattern generation based on the DTW concept.
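The dissertation's own formulations are not reproduced in this abstract; as background, the sketch below only illustrates the standard dynamic-programming recurrence commonly used to compute DTW between two series of different lengths, which is the similarity notion the abstract refers to. The function name `dtw_distance` and the absolute-difference local cost are illustrative assumptions, not the author's mathematical programming approach.

```python
import numpy as np

def dtw_distance(x, y):
    """Textbook dynamic-programming DTW between two 1-D series x and y.

    The series may differ in length and speed; the recurrence finds the
    minimum-cost warping alignment between them.
    """
    n, m = len(x), len(y)
    # D[i, j] = minimal cumulative cost of aligning x[:i] with y[:j]
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])          # local distance (assumed absolute difference)
            D[i, j] = cost + min(D[i - 1, j],        # step in x only
                                 D[i, j - 1],        # step in y only
                                 D[i - 1, j - 1])    # step in both
    return D[n, m]

# Two series tracing the same shape at different speeds and lengths
# still obtain a small DTW distance.
a = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
b = np.array([0.0, 1.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0])
print(dtw_distance(a, b))
```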
