Abstract

Concise and unambiguous assessment of a machine learning algorithm is key to classifier design and performance improvement. In the multi-class classification task, where each instance can be labeled with only one class, the confusion matrix is a powerful tool for performance assessment because it quantifies the classification overlap. However, in the multi-label classification task, where each instance can be labeled with more than one class, the confusion matrix is undefined. Performance assessment of the multi-label classifier is currently based on calculating performance averages, such as Hamming loss, precision, recall, and F-score. While the current assessment techniques present a reasonable representation of each class and overall performance, their aggregate nature results in ambiguity when identifying false negative (FN) and false positive (FP) results. To address this gap, we define a method of creating the multi-label confusion matrix (MLCM) based on three proposed categories of multi-label problems.
After establishing the shortcomings of current methods for identifying FN and FP, we demonstrate the usage of the MLCM with the classification of two publicly available multi-label data sets: i) a 12-lead ECG data set with nine classes, and ii) a movie poster data set with eighteen classes. A comparison of the MLCM results against statistics from the current techniques is presented to show its effectiveness in providing a concise and unambiguous understanding of a multi-label classifier's behavior.
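
The ambiguity of aggregate statistics described above can be illustrated with a small sketch. It is not the MLCM construction itself, which the abstract does not specify; it only computes per-class TP, FP, and FN counts and the Hamming loss with NumPy. The function name `aggregate_stats` and the toy label matrices are hypothetical and invented purely for illustration.

```python
import numpy as np


def aggregate_stats(y_true, y_pred):
    """Per-class TP/FP/FN counts and Hamming loss for binary indicator matrices.

    Rows are instances, columns are classes. Hypothetical helper for
    illustration only; it is not the MLCM method.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = (y_true & y_pred).sum(axis=0)    # label present and predicted
    fp = (~y_true & y_pred).sum(axis=0)   # label absent but predicted
    fn = (y_true & ~y_pred).sum(axis=0)   # label present but missed
    hamming = float((y_true != y_pred).mean())  # fraction of wrong label slots
    return tp, fp, fn, hamming


# Toy data (assumed): class 0 is always present and always predicted;
# each instance also carries exactly one of classes 1, 2, 3.
y_true = [[1, 1, 0, 0],
          [1, 0, 1, 0],
          [1, 0, 0, 1]]

# Classifier A predicts class 2 instead of 1, 3 instead of 2, 1 instead of 3.
pred_a = [[1, 0, 1, 0],
          [1, 0, 0, 1],
          [1, 1, 0, 0]]

# Classifier B errs in the opposite direction: 3 instead of 1,
# 1 instead of 2, 2 instead of 3.
pred_b = [[1, 0, 0, 1],
          [1, 1, 0, 0],
          [1, 0, 1, 0]]

stats_a = aggregate_stats(y_true, pred_a)
stats_b = aggregate_stats(y_true, pred_b)

for name, (tp, fp, fn, hamming) in (("A", stats_a), ("B", stats_b)):
    print(name, "TP:", tp, "FP:", fp, "FN:", fn, "Hamming loss:", hamming)
```

In this toy case both classifiers yield identical per-class counts and the same Hamming loss, so any precision, recall, or F-score derived from those counts is also identical, even though the two classifiers substitute different classes for one another. The aggregate statistics alone cannot reveal which class absorbed each missed label.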
