We present an interactive spreadsheet that supports teaching essential concepts in classification using the logistic regression (LoR) model for binary classification. The spreadsheet demonstrates the capabilities of LoR by integrating computation with visualization. Students reinforce concepts such as probabilities, maximum likelihood estimation (MLE), and the use of likelihoods to optimize the parameters of the LoR model. We then discuss using LoR for classification while adjusting its decision boundary (DB), demonstrating how to convert assigned likelihoods into classifications using the DB; change the classification outcome by varying the DB; designate predictions as true positive, true negative, false positive, or false negative; and determine the classification accuracy. We use a variety of performance measures, including sensitivity, specificity, precision, negative predictive value, F1 and F2 scores, the receiver operating characteristic curve, and lift/decile charts. These measures are dynamically updated when the DB changes. We also reiterate the use of these measures in the context of cross-validation and imbalanced data sets. We provide a case study that implements LoR and an option for teaching the details behind MLE. We discuss the pedagogical aspects of this spreadsheet based on a survey of the 2022 student cohort in the Master of Management Analytics Program at the Rotman School of Management.
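As a rough illustration of the decision-boundary step described above, the following Python sketch converts predicted probabilities into classes at a chosen DB and recomputes several of the listed measures. It is not the spreadsheet's implementation: the function names (`confusion_counts`, `performance_measures`), the example probabilities and labels, and the convention that a probability at or above the DB is classified as positive are all assumptions made for illustration.

```python
def confusion_counts(probs, labels, db=0.5):
    """Classify each probability at the decision boundary `db` (assumed
    convention: p >= db means positive) and count TP, TN, FP, FN."""
    tp = tn = fp = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= db else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 0 and y == 0:
            tn += 1
        elif pred == 1 and y == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def _ratio(num, den):
    # Return 0.0 when a measure is undefined (empty denominator).
    return num / den if den else 0.0


def performance_measures(tp, tn, fp, fn):
    """Compute common measures from the four confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)     # positive predictive value
    npv = _ratio(tn, tn + fn)           # negative predictive value

    def f_beta(beta):
        b2 = beta * beta
        return _ratio((1 + b2) * precision * sensitivity,
                      b2 * precision + sensitivity)

    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f1": f_beta(1.0),
        "f2": f_beta(2.0),  # weights sensitivity more heavily than precision
    }


if __name__ == "__main__":
    # Hypothetical model outputs and true labels, for illustration only.
    probs = [0.9, 0.8, 0.4, 0.3, 0.2]
    labels = [1, 0, 1, 0, 0]
    for db in (0.5, 0.35):
        counts = confusion_counts(probs, labels, db)
        print(db, counts, performance_measures(*counts))
```

Rerunning the loop with different DB values mimics, in a simplified way, how the measures shift when the boundary is moved.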