Abstract
Data for machine learning in in-home health monitoring applications are widely collected via the Internet of Things (IoT), and mislabeled training data make the resulting models unreliable. Researchers have proposed a wide array of algorithms to deal with mislabeled training data; one straightforward and effective solution is to filter noisy instances out of the training data so that the negative effects of mislabeling are minimized. However, noise filtering can be suboptimal because mislabeled data are not completely useless: their features and distributions remain useful for learning, especially when training data are insufficient. In this work, we propose a novel framework for learning from mislabeled training data through ambiguous learning (LeMAL). LeMAL consists of two main parts. First, it converts the original training data to ambiguous data. Second, it applies an ambiguous learning algorithm to the ambiguous data; for this step, we propose a novel distance-based ambiguous learning algorithm that makes better use of the ambiguous data. Finally, we demonstrate that LeMAL can effectively improve learning performance over existing noise filtering methods.
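The two-part pipeline described above can be illustrated with a minimal sketch. The abstract does not specify how labels are converted to ambiguous data or how the distance-based learner works, so both rules below are assumptions chosen for illustration, not the LeMAL algorithms themselves: the conversion keeps each given label and adds the majority label of the k nearest neighbours as a second candidate, and the learner is a distance-weighted nearest-neighbour vote in which each neighbour splits its weight across its candidate labels. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_indices(X, query, k, skip=None):
    """Indices of the k training points closest to `query` (optionally skipping one)."""
    order = sorted(
        (i for i in range(len(X)) if i != skip),
        key=lambda i: euclidean(X[i], query),
    )
    return order[:k]


def to_ambiguous(X, y, k=3):
    """Step 1 (assumed rule): candidate set = given label + neighbours' majority label."""
    candidates = []
    for i, xi in enumerate(X):
        neighbour_labels = [y[j] for j in nearest_indices(X, xi, k, skip=i)]
        majority = Counter(neighbour_labels).most_common(1)[0][0]
        candidates.append({y[i], majority})
    return candidates


def predict(X, candidates, query, k=3):
    """Step 2 (assumed rule): distance-weighted vote over neighbours' candidate sets."""
    votes = Counter()
    for j in nearest_indices(X, query, k):
        weight = 1.0 / (1e-9 + euclidean(X[j], query))
        for label in candidates[j]:
            # A neighbour with several candidate labels shares its weight among them.
            votes[label] += weight / len(candidates[j])
    return votes.most_common(1)[0][0]


# Toy data: two clusters; the point (1, 0) is deliberately mislabeled as "walk".
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
y = ["rest", "rest", "walk", "rest", "walk", "walk", "walk", "walk"]

candidates = to_ambiguous(X, y)
prediction = predict(X, candidates, (0.9, 0.1))
```

In this toy setting the mislabeled point is kept rather than filtered out: it becomes an ambiguous instance with two candidate labels, so its features still contribute to the vote while its suspect label carries only half of its weight.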