Abstract

A good description of a class should be (reasonably) accurate and interpretable. Previous works address this class-description problem by either analyzing the correlation of each attribute with the class, or by producing rules as in building a classifier. These solutions suffer from issues in accuracy and interpretability.

A sentence is usually defined as a disjunction or conjunction of several terms, each of which specifies a constraint (range/set of values) on an attribute. From the data analysis point of view, a sentence specifies a subspace in the database. In this paper, we create a richer yet interpretable form of a sentence. Here, a sentence describes an object if any k attributes of that object satisfy the specified constraints, or in other words, the object is partially covered by the subspace. Since this simple enhancement subsumes rules used in previous solutions, descriptions based on such sentences are provably better.

To that end, we design Pub, an algorithm that produces descriptions with our form of sentences. Theoretically, while constructing a sentence (within the description), Pub finds the optimal range/set of values for each attribute in linear time. Empirically, we show that Pub is efficient, and able to produce more accurate, concise and interpretable descriptions than current approaches on various real datasets. We also perform an illustrative case study on the Glass dataset, providing some useful insights.
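To make the partial-coverage idea concrete, the following is a minimal sketch of how such a sentence could be evaluated against a single object. It is not the Pub algorithm and does not search for optimal ranges. It only illustrates the coverage test, reading "any k attributes" as "at least k of the constrained attributes". The function name `covers`, the attribute names, and all threshold values are hypothetical and chosen purely for illustration.

```python
from typing import Any, Callable, Dict

# A constraint maps an attribute value to True/False (a range or a set test).
Constraint = Callable[[Any], bool]


def covers(sentence: Dict[str, Constraint], k: int, obj: Dict[str, Any]) -> bool:
    """Return True if at least k constrained attributes of obj satisfy their constraint.

    Assumption: "any k attributes" is interpreted as "at least k".
    Attributes missing from obj count as not satisfied.
    """
    satisfied = sum(
        1
        for attr, is_ok in sentence.items()
        if attr in obj and is_ok(obj[attr])
    )
    return satisfied >= k


# Hypothetical sentence with two range constraints and one set constraint.
sentence = {
    "refractive_index": lambda v: 1.51 <= v <= 1.53,
    "sodium": lambda v: v >= 13.0,
    "kind": lambda v: v in {"float", "tempered"},
}

# Hypothetical object: satisfies refractive_index and kind, but not sodium.
obj = {"refractive_index": 1.52, "sodium": 12.0, "kind": "float"}

print(covers(sentence, 1, obj))  # k = 1 behaves like a disjunction of the terms
print(covers(sentence, 2, obj))  # partial coverage: two of three terms hold
print(covers(sentence, 3, obj))  # k = 3 behaves like a conjunction of all terms
```

Under this reading, setting k to 1 recovers a disjunction and setting k to the number of terms recovers a conjunction, which is one way to see why this form of sentence subsumes the conventional rules.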
