Abstract
A mixed dish, which combines different types of dishes on one plate, is a popular kind of food in East and Southeast Asia. Identifying the dish types in a mixed dish is essential for dietary tracking, which has recently gained increasing research attention. Nevertheless, mixed-dish detection is a challenging task because of the large visual variance among dishes in different canteens, which is known as the domain shift problem. Since collecting and annotating sufficient training samples in each canteen is difficult, a more practical approach is to develop detection models that can adapt quickly to cross-canteen mixed-dish detection with less supervision. To this end, we propose a novel framework called Weakly-supervised Mean Teacher Network (WMT-Net) that addresses this detection task in a weakly supervised manner, where bounding box annotations are not required in the target domain. The proposed WMT-Net builds on Mean Teacher learning by maintaining image-level consistency between the teacher and student modules. Specifically, the student model of WMT-Net first learns instance-level information from the source dataset in a fully supervised fashion. The whole architecture is then optimized with weakly supervised learning: 1) weakly supervised training of the student model to reduce the domain gap in global semantics between source data and target data, and 2) image-level consistency to align the image-level predictions of the teacher model and the student model. Experimental results on a mixed-dish dataset show that, even though WMT-Net is trained in a weakly supervised fashion on the target domain, its performance is very close to that of a model trained in a fully supervised fashion, which verifies the effectiveness of WMT-Net.
In addition, the proposed WMT-Net achieves 44.6% mAP on Pascal VOC to Clipart cross-domain detection, an improvement of 7.2% mAP over the state-of-the-art method, which further demonstrates its generalization capability.
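
The two ingredients named above, a teacher that follows the student and an image-level consistency term, can be illustrated with a minimal sketch. It is not the WMT-Net implementation: the function names, the decay value, the max-over-regions aggregation, and the squared-error form of the consistency term are all assumptions chosen for illustration, and plain Python lists stand in for network weights and predictions.

```python
from typing import Dict, List


def ema_update(teacher: Dict[str, float], student: Dict[str, float],
               decay: float = 0.99) -> Dict[str, float]:
    """Move each teacher weight toward the student weight.

    This is the usual Mean Teacher update: the teacher is an exponential
    moving average of the student. The decay value is an assumption.
    """
    return {name: decay * teacher[name] + (1.0 - decay) * student[name]
            for name in teacher}


def image_level_scores(region_scores: List[List[float]]) -> List[float]:
    """Collapse per-region class scores into one score per class.

    Taking the maximum over regions is an assumed aggregation, used here
    only to obtain an image-level prediction from instance-level ones.
    """
    num_classes = len(region_scores[0])
    return [max(region[c] for region in region_scores)
            for c in range(num_classes)]


def consistency_loss(student_scores: List[float],
                     teacher_scores: List[float]) -> float:
    """Mean squared difference between two image-level predictions
    (an assumed form of the consistency term)."""
    diffs = [(s - t) ** 2 for s, t in zip(student_scores, teacher_scores)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Hypothetical scores: 2 regions x 2 dish classes per model.
    student_regions = [[0.1, 0.8], [0.6, 0.2]]
    teacher_regions = [[0.2, 0.7], [0.5, 0.3]]
    loss = consistency_loss(image_level_scores(student_regions),
                            image_level_scores(teacher_regions))
    print("image-level consistency loss:", loss)
    print("updated teacher:", ema_update({"w": 1.0}, {"w": 0.0}))
```

Because both terms operate on image-level scores only, a sketch like this needs no bounding boxes for the target-domain image, which is the property the weakly supervised setting relies on.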