Prototype selection for interpretable classification

Jacob Bien, Robert Tibshirani

doi:10.1214/11-aoas495

Abstract

Prototype methods seek a minimal subset of samples that can serve as a distillation or condensed view of a data set. As the size of modern data sets grows, being able to present a domain specialist with a short list of "representative" samples chosen from the data set is of increasing interpretative value. While much recent statistical research has been focused on producing sparse-in-the-variables methods, this paper aims at achieving sparsity in the samples. We discuss a method for selecting prototypes in the classification setting (in which the samples fall into known discrete categories). Our method of focus is derived from three basic properties that we believe a good prototype set should satisfy. This intuition is translated into a set cover optimization problem, which we solve approximately using standard approaches. While prototype selection is usually viewed as purely a means toward building an efficient classifier, in this paper we emphasize the inherent value of having a set of prototypical elements. That said, by using the nearest-neighbor rule on the set of prototypes, we can of course discuss our method as a classifier as well.
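The abstract names two ingredients: an approximate solution to a set cover problem, and a nearest-neighbor rule applied to the selected prototypes. The sketch below illustrates how these could fit together. It is not the authors' algorithm: the abstract does not state the three properties or the exact objective, so the Euclidean distance, the fixed `radius`, the greedy score (newly covered same-class points minus other-class points inside the ball), and the function names `select_prototypes` and `predict` are all assumptions made for illustration.

```python
import math


def select_prototypes(points, labels, radius):
    """Greedy set-cover-style prototype selection (illustrative sketch only).

    Assumption: a candidate prototype "covers" every point within `radius`
    of it. The scoring rule below is a hypothetical choice, not taken from
    the paper.
    """
    n = len(points)
    # ball[j] holds the indices of all points within `radius` of candidate j.
    ball = [
        [i for i in range(n) if math.dist(points[i], points[j]) <= radius]
        for j in range(n)
    ]
    uncovered = set(range(n))
    prototypes = []
    while uncovered:
        best_j, best_score = None, 0
        for j in range(n):
            # Reward: same-class points this candidate would newly cover.
            same = sum(
                1 for i in ball[j] if labels[i] == labels[j] and i in uncovered
            )
            # Penalty: other-class points that fall inside the ball.
            other = sum(1 for i in ball[j] if labels[i] != labels[j])
            score = same - other
            if score > best_score:
                best_j, best_score = j, score
        if best_j is None:
            # No candidate has a positive score; leave the rest uncovered.
            break
        prototypes.append(best_j)
        uncovered -= {i for i in ball[best_j] if labels[i] == labels[best_j]}
    return prototypes


def predict(x, points, labels, prototypes):
    """Classify x by the label of its nearest selected prototype."""
    nearest = min(prototypes, key=lambda j: math.dist(x, points[j]))
    return labels[nearest]


# Toy data: two well-separated clusters.
points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
prototypes = select_prototypes(points, labels, radius=1.0)
```

On this toy data each cluster fits inside one ball, so a single prototype per class suffices. The short list of selected indices is the condensed view of the data set that the abstract emphasizes, and `predict` shows the secondary use as a classifier.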

Journal: The Annals of Applied Statistics
Publication Date: Dec 1, 2011
Citations: 191
License type: implied-oa

Similar Papers

Prototype Selection for Classification in Standard and Generalized Dissimilarity Spaces
24 Sep 2015

Protein Identification False Discovery Rates for Very Large Proteomics Data Sets Generated by Tandem Mass Spectrometry
Lukas Reiter ... Ruedi Aebersold
Molecular & Cellular Proteomics | VOL. 8
01 Nov 2009

Prototype Extraction for k-NN Classifiers using Median Strings
Carlos D Martínez-Hinarejos ... Francisco Casacuberta
01 Jan 2003

Beyond Traditional Kernels: Classification in Two Dissimilarity-Based Representation Spaces
Elzbieta Pekalska ... Robert P W Duin
IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) | VOL. 38
01 Nov 2008
