Abstract
Building models that are both interpretable and accurate is an unresolved challenge for many pattern recognition problems. In general, rule-based and linear models lack accuracy, while deep learning interpretability is based on rough approximations of the underlying inference. Recently, however, the rule-based Tsetlin Machines (TMs) have obtained competitive performance in terms of accuracy, memory footprint, and inference speed on diverse benchmarks (image classification, regression, natural language understanding, and game-playing). TMs construct rules using human-interpretable conjunctive clauses in propositional logic. These, in turn, are combined linearly to solve complex pattern recognition tasks. This paper addresses the accuracy-interpretability challenge in machine learning by introducing a TM with integer-weighted clauses, the Integer Weighted TM (IWTM). The intent is to increase TM interpretability by reducing the number of clauses required for competitive performance. The IWTM achieves this by weighting the clauses so that a single clause can replace multiple duplicates. Since each TM clause is formed adaptively by a Tsetlin Automata (TA) team, identifying effective weights becomes a challenging online learning problem. We solve this problem by extending each team of TA with another kind of automaton: the stochastic searching on the line (SSL) automaton. We evaluate the performance of the new scheme empirically using five datasets, along with a study of interpretability. On average, IWTM uses 6.5 times fewer literals than the vanilla TM and 120 times fewer literals than a TM with real-valued weights. Furthermore, in terms of average memory usage and F1-Score, IWTM outperforms simple Multi-Layered Artificial Neural Networks, Decision Trees, Support Vector Machines, K-Nearest Neighbor, Random Forest, Gradient Boosted Trees (XGBoost), Explainable Boosting Machines (EBMs), as well as the standard and real-value weighted TMs. Finally, IWTM outperforms Neural Additive Models on Fraud Detection and StructureBoost on CA-58 in terms of Area Under Curve, while performing competitively on COMPAS.
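The claim that a single weighted clause can replace multiple duplicates can be illustrated with a minimal sketch. The function names, the literal ordering (inputs followed by their negations), and the example clauses below are assumptions made for illustration; they are not taken from the paper's implementation.

```python
from typing import Sequence


def clause_output(literals: Sequence[int], included: Sequence[int]) -> int:
    """Conjunction (AND) of the literals a clause includes.

    `included` holds indices into `literals`. An empty clause evaluates
    to 1 here, which is a simplifying assumption of this sketch.
    """
    return int(all(literals[k] for k in included))


def weighted_vote_sum(
    literals: Sequence[int],
    clauses: Sequence[Sequence[int]],
    weights: Sequence[int],
    polarities: Sequence[int],
) -> int:
    """Linear combination of clause outputs: sum of polarity * weight * output."""
    total = 0
    for included, weight, polarity in zip(clauses, weights, polarities):
        total += polarity * weight * clause_output(literals, included)
    return total


# Hypothetical input with two features; literals = [x1, x2, NOT x1, NOT x2].
x = [1, 0]
literals = x + [1 - v for v in x]

# Unweighted TM: the clause "x1 AND NOT x2" appears three times.
duplicated = weighted_vote_sum(
    literals,
    clauses=[[0, 3], [0, 3], [0, 3], [1]],
    weights=[1, 1, 1, 1],
    polarities=[+1, +1, +1, -1],
)

# Weighted variant: one copy of the clause with integer weight 3.
compact = weighted_vote_sum(
    literals,
    clauses=[[0, 3], [1]],
    weights=[3, 1],
    polarities=[+1, -1],
)
```

Both configurations produce the same vote sum, but the weighted one needs two clauses instead of four, so there are fewer literals to read when inspecting the model.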
Highlights
Interpretable Machine Learning refers to machine learning models that obtain transparency by providing the reasons behind their output
Computational complexity of the Integer Weighted Tsetlin Machine (IWTM): To evaluate computational complexity, we introduce the three constants α, β, and γ, where α represents the computational cost to perform the conjunction of two bits, β is the computational cost of computing the summation of two integers, and γ is the computational cost to update the state of a single automaton (TA or stochastic searching on the line (SSL)) in IWTM
In this paper, we proposed a novel Tsetlin Machine (TM) having integer weights attached to clauses, to address the accuracy-interpretability challenge in machine learning
Summary
Interpretable Machine Learning refers to machine learning models that obtain transparency by providing the reasons behind their output. We cover how the individual clauses are trained to learn sub-patterns and how the weight values are updated using SSL. Type I feedback consists of two sub-feedback schemes: Type Ia and Type Ib. Type Ia feedback reinforces include actions of TAs whose corresponding literal value is 1, but only when the clause output is 1. Type Ib feedback combats over-fitting by reinforcing exclude actions of TAs when the corresponding literal is 0 or when the clause output is 0. The processing of the training example ends with the state matrix $A^{+}$ being updated as follows: $A^{+} \leftarrow (A^{+} \oplus I^{\mathrm{Ia}}) \ominus I^{\mathrm{Ib}}$, where $I^{\mathrm{Ia}}$ and $I^{\mathrm{Ib}}$ mark the TAs selected for Type Ia and Type Ib feedback, respectively. Type II feedback is given to clauses with positive polarity for target output y = 0. Clauses can be turned completely off by setting their weights to 0 if they do not contribute positively to the classification task.
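The Type I feedback rules described above can be sketched for a single clause as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the state convention (states 1 to 2N, with states above N meaning "include"), the clipping at the boundaries, and the selection probabilities `p_ia` and `p_ib` are hypothetical, since the summary does not state them. With the default probabilities of 1.0 the update is deterministic.

```python
import random
from typing import List, Sequence


def type_i_feedback(
    states: Sequence[int],
    literals: Sequence[int],
    clause_out: int,
    n_states: int = 100,  # assumed N: states 1..N exclude, N+1..2N include
    p_ia: float = 1.0,    # hypothetical probability of applying Type Ia
    p_ib: float = 1.0,    # hypothetical probability of applying Type Ib
    rng: random.Random = random.Random(0),
) -> List[int]:
    """Return the updated TA states of one clause after Type I feedback."""
    new_states = list(states)
    for k, literal in enumerate(literals):
        if clause_out == 1 and literal == 1:
            # Type Ia: reinforce the include action (move the state up).
            if rng.random() < p_ia:
                new_states[k] = min(new_states[k] + 1, 2 * n_states)
        else:
            # Type Ib: literal is 0 or clause output is 0, so reinforce
            # the exclude action (move the state down).
            if rng.random() < p_ib:
                new_states[k] = max(new_states[k] - 1, 1)
    return new_states


# Hypothetical clause with four TAs, all near the include/exclude boundary.
states = [100, 101, 100, 100]
literals = [1, 0, 0, 1]

after_match = type_i_feedback(states, literals, clause_out=1)
after_miss = type_i_feedback(states, literals, clause_out=0)
```

When the clause fires, the TAs of true literals move toward include and the rest move toward exclude; when it does not fire, every TA moves toward exclude, which is the over-fitting countermeasure the text attributes to Type Ib.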