Abstract
Multi-instance learning (MIL) plays an important role in many real applications, such as image recognition and text classification. The instance-based approach, which selects instances from each bag for training, has drawn significant attention recently. However, few works take distribution information into account, even though the margin distribution has been proven important to generalization performance. In this paper, we propose an optimal representative distribution margin approach for multi-instance learning (MIORDM). The representative instances are samples from the instance space, and their distribution is important for finding the best separating hyperplane. The representative instances are selected iteratively: in each iteration, the current best hyperplane makes the selected instances more precise, which in turn improves the model in the next iteration. In this way, we derive a method with better generalization performance. Experiments comparing against other types of state-of-the-art approaches on different datasets show that our method outperforms the others and achieves better generalization performance.
Highlights
In generic supervised learning, a class label is annotated to each training instance.
Margin information is taken seriously, but we do not know how much weight it should receive. Compared with the Support Vector Machine (SVM), λ plays a role similar to C in the SVM.
We found that distribution information seldom participates in the classifier training process.
Summary
In generic supervised learning, a class label is annotated to each training instance. We instead study multi-instance learning (MIL), a kind of weakly supervised learning that treats a set of instances as a bag and labels the bag rather than its instances. In this way, we can reduce the annotation cost. If a bag contains at least one positive instance, the bag is labeled positive; otherwise, it is labeled negative [1]. Owing to this reduced annotation cost, multi-instance learning has become one of the most popular research areas [2]–[5], and it has been widely applied in content-based image recognition [6], [7], text classification [8], sign language recognition [9], and so on. A standard multi-instance task can be derived from this setting.
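The bag-labeling rule stated above can be illustrated with a minimal sketch. The function name `bag_label`, the toy bags, and the +1/-1 label encoding are assumptions made for illustration only; they are not taken from the paper, and in a real MIL task the instance labels are hidden from the learner.

```python
from typing import Sequence


def bag_label(instance_labels: Sequence[int]) -> int:
    """Standard MIL assumption: a bag is positive (+1) if it contains
    at least one positive instance, otherwise negative (-1)."""
    return 1 if any(y == 1 for y in instance_labels) else -1


# Hypothetical toy bags; instance labels are shown only to illustrate the rule.
bags = {
    "bag_a": [-1, -1, 1],  # one positive instance, so the bag is positive
    "bag_b": [-1, -1],     # no positive instance, so the bag is negative
}
labels = {name: bag_label(ys) for name, ys in bags.items()}
```

Only the bag-level values in `labels` would be available for training, which is where the saving in annotation cost comes from.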