Abstract

Data analytics often make sense of large data sets by generalization: aggregating from the detailed data to a more general context. Given a dataset, misleading generalizations can sometimes be drawn from a cherry-picked level of aggregation to obscure substantial subgroups that oppose the generalization. Our goal is to detect and explain cherry-picked generalizations by refining the corresponding aggregate queries. We demonstrate OREO, a system to compute a support score of the given statement to quantify the quality of the generalization; that is, whether the aggregated result is an accurate reflection of the data. To better understand the resulting score, our system also identifies significant counterexamples and alternative statements that better represent the data at hand. We will demonstrate the utility of OREO for investigating generalizations, by interacting with the VLDB'22 participants who will use the OREO interface for statement validation and explanation.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.