Partial label learning addresses the setting in which each training sample is associated with a candidate label set, of which only one label is valid. Feature selection is an effective pre-processing technique for improving the generalization performance of learning models, but partial label feature selection is challenging because the labeling information is ambiguous. To this end, this paper uses granular ball computing and neighborhood rough sets to propose a disambiguation-based partial label feature selection algorithm built on feature dependency and label consistency. First, the proposed algorithm performs an adaptive neighbor aggregation based on granular balls to disambiguate the candidate labels; adaptive neighbors offer flexibility that makes learning more effective. Then, taking into account the labeling confidence induced by disambiguation, the significance of each feature is evaluated by fusing the feature dependency derived from the neighborhood granularity with the label consistency among the nearest neighbors. Neighborhood rough sets handle continuous features directly, which reduces the adverse impact of data discretization on feature selection. The label consistency among adjacent samples is important for measuring how discriminative a feature is, and fusing feature dependency with label consistency further improves performance. Finally, experiments on eight controlled UCI datasets and five real-world partial label datasets demonstrate that the proposed algorithm improves the generalization performance of partial label learning and outperforms existing partial label feature selection methods.
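The two stages described above can be illustrated with a minimal sketch. It is not the paper's algorithm: plain k-nearest neighbors stand in for the granular-ball adaptive neighbors, hard labels taken from the arg-max confidence stand in for confidence weighting, and the label-consistency term is omitted. The function names `disambiguate` and `neighborhood_dependency`, the parameters `k` and `delta`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def disambiguate(X, candidates, k=2):
    """Estimate labeling confidences by aggregating neighbors' candidate sets.

    X: (n, d) feature matrix; candidates: (n, q) 0/1 candidate-label matrix.
    Assumption: fixed k-NN replaces the paper's granular-ball adaptive neighbors.
    """
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)             # a sample is not its own neighbor
    nbrs = np.argsort(dist, axis=1)[:, :k]     # indices of the k nearest samples
    votes = candidates[nbrs].sum(axis=1)       # (n, q) neighbor votes per label
    conf = (votes + 1e-12) * candidates        # keep only each sample's own candidates
    return conf / conf.sum(axis=1, keepdims=True)

def neighborhood_dependency(X, labels, features, delta=0.2):
    """Classic neighborhood rough set dependency of the labels on a feature subset.

    A sample lies in the positive region when every sample within distance
    delta (measured on the chosen features) carries the same label.
    """
    Xs = X[:, features]
    dist = np.linalg.norm(Xs[:, None, :] - Xs[None, :, :], axis=2)
    within = dist <= delta
    same = labels[:, None] == labels[None, :]
    consistent = np.all(~within | same, axis=1)
    return float(consistent.mean())

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = np.array([[0.00, 0.2], [0.10, 0.4], [0.05, 0.6],
              [1.00, 0.2], [0.90, 0.4], [0.95, 0.6]])
candidates = np.array([[1, 0], [1, 1], [1, 0],
                       [0, 1], [1, 1], [0, 1]])

confidence = disambiguate(X, candidates, k=2)
labels = confidence.argmax(axis=1)             # hard labels, a simplifying assumption
dep_f0 = neighborhood_dependency(X, labels, [0])
dep_f1 = neighborhood_dependency(X, labels, [1])
```

On this toy data the two ambiguous samples are resolved to the label of their nearest neighbors, and the dependency is 1.0 for feature 0 and 0.0 for feature 1, so a dependency-based selector would prefer feature 0. No discretization of the continuous features is needed at any point.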