Abstract

We provide a uniform, general, and complete formal account of evaluation metrics for ranking, classification, clustering, and other information access problems. We leverage concepts from measurement theory, such as scale types and permissible transformation functions, and capture the nature of evaluation metrics in many tasks by two formal definitions, which lead to a distinction between two metric/task families, and provide a comprehensive classification of the tasks that have been proposed so far. We derive theorems to analyze the suitability (or otherwise) of some common metrics. Within our model we can derive and explain the theoretical properties and drawbacks of state-of-the-art metrics for multiple tasks. The main contributions of this paper are that, unlike previous studies, the formalization is well grounded in a solid discipline; it is general, as it can take into account most effectiveness metrics as well as most existing tasks; and it allows important consequences for metrics and their limitations to be derived.
