Abstract

Artificial intelligence using neural networks has made tremendous progress in the field of computer vision. State-of-the-art models have been developed for various computer vision tasks such as image classification, object detection, image segmentation and keypoint estimation. Many of these models are tuned to achieve the highest benchmark scores on curated datasets for a single specific task. However, deploying them in industrial settings often requires multiple models to be used sequentially or as an ensemble to address different business use cases. Better hardware and model optimisation techniques such as quantisation and pruning have been widely used to improve the performance of individual models. In most cases, however, the features extracted by one model are not effectively reused by subsequent models. Each model also has its own pre- and post-processing, which adds considerable overhead when scaled to industry requirements. We explore the concept of multitasking architectures and propose a joint learning approach to train a multitasking model that performs object detection, keypoint estimation and instance segmentation together in a single forward pass. Learning to predict multiple closely related tasks should help the model learn better representations of the training data and become more robust to overfitting. Our best performing model achieved 32.26 frames per second (fps) with 41.2 AP on object detection, 38.2 AP on instance segmentation and 53.0 AP on keypoint estimation when evaluated on the COCO validation dataset. A lighter version of the model ran at 41.66 fps, enabling real-time computation for most use cases.

Keywords

Joint learning, Multitasking, Object detection, Instance segmentation, Keypoint estimation
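
The reported throughput figures are easier to compare with a deployment budget when expressed as time per frame: 32.26 fps corresponds to roughly 31 ms per frame and 41.66 fps to roughly 24 ms. The short sketch below performs that conversion. The function name `frame_latency_ms` and the 30 fps reference stream are illustrative assumptions, not values or code from the paper.

```python
def frame_latency_ms(fps: float) -> float:
    """Return the average time spent per frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Throughput figures reported in the abstract.
reported_fps = {"best model": 32.26, "lighter model": 41.66}

# Hypothetical reference point (an assumption, not from the paper):
# a video stream delivering 30 frames per second.
ASSUMED_STREAM_FPS = 30.0
budget_ms = frame_latency_ms(ASSUMED_STREAM_FPS)

for name, fps in reported_fps.items():
    latency = frame_latency_ms(fps)
    headroom = budget_ms - latency
    print(f"{name}: {latency:.1f} ms per frame, "
          f"{headroom:.1f} ms headroom against the assumed 30 fps budget")
```

Under that assumed budget, both variants finish a frame before the next one arrives, with the lighter model leaving more room for pre- and post-processing.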
