Abstract

Label distribution learning (LDL), which focuses on the relative importance of each label to an instance, has been proposed in recent years to address the label ambiguity problem. In real-world label distribution data, however, the annotation information may be incomplete, and methods designed for complete data cannot be applied directly. In addition, with the exponential growth of data volume, data in many fields tend to be high-dimensional, and feature selection serves as an efficient preprocessing technique for reducing dimensionality. To address both incomplete labels and high-dimensional data, an incomplete label distribution feature selection method based on the neighborhood-tolerance discrimination index is proposed. The neighborhood-tolerance discrimination index is used to measure the distinguishing ability of a feature subset, and a novel significance metric that accounts for the correlations between features and labels is then constructed to evaluate the importance of features. Compared with multi-label feature selection algorithms, the proposed algorithm processes label distribution data directly, without discretization, which reduces the information loss that discretization would introduce. Compared with existing label distribution feature selection algorithms, the proposed algorithm can directly process distribution data with missing labels, which avoids the interference of noisy information. Furthermore, comprehensive experiments on eight publicly available label distribution datasets with six widely used metrics demonstrate the superiority of the proposed method over seven other state-of-the-art methods. The experimental results show that the proposed algorithm outperforms the compared algorithms in 91.67% of cases.
