Abstract
Multiple-object tracking is affected by various sources of distortion, such as occlusion, illumination variations and motion changes. Tracking on RGB frames alone, e.g., by shifting a search window, has limited ability to overcome these distortions because the distortions manifest directly in the RGB frames. To overcome them, we propose a multiple-object fusion tracker (MOFT), which uses a combination of 3D point clouds and the corresponding RGB frames. The MOFT uses a matching function, initialized on large-scale external sequences, to determine which candidates in the current frame match the target object in the previous frame. After tracking over a few frames, the initialized matching function is fine-tuned to the appearance models of the target objects. The fine-tuning process is constructed in a structured form with diverse matching-function branches. In typical multiple-object tracking scenes, scale variations occur depending on the distance between the target objects and the sensors. If target objects at various scales are represented with the same fixed strategy, information is lost for some representations of the target objects. In this paper, the output map of a convolutional layer obtained from a pre-trained convolutional neural network is used to adaptively represent instances without information loss. In addition, rather than fusing modalities by selectively using the features of each modality, MOFT fuses the tracking results obtained from each modality at the decision level using basic belief assignment, compensating for the tracking failures of individual modalities. Experimental results indicate that the proposed tracker provides state-of-the-art performance on the multiple object tracking (MOT) and KITTI benchmarks.
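The decision-level fusion described above can be illustrated with Dempster's rule of combination, the standard way of combining two basic belief assignments (BBAs). The sketch below is not the paper's implementation: the hypothesis names and the mass values assigned to the RGB and point-cloud trackers are hypothetical, chosen only to show how conflicting evidence from two modalities is normalized away.

```python
# Hedged sketch: decision-level fusion of two modality BBAs with
# Dempster's rule of combination. Hypotheses and masses are hypothetical.

def combine_bba(m1, m2):
    """Combine two BBAs (dicts: frozenset of hypotheses -> mass)
    using Dempster's rule of combination."""
    combined = {}
    conflict = 0.0
    for h1, v1 in m1.items():
        for h2, v2 in m2.items():
            inter = h1 & h2
            if inter:  # compatible evidence reinforces the intersection
                combined[inter] = combined.get(inter, 0.0) + v1 * v2
            else:      # incompatible evidence accumulates as conflict
                conflict += v1 * v2
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    norm = 1.0 - conflict  # renormalize over non-conflicting mass
    return {h: v / norm for h, v in combined.items()}

MATCH = frozenset({"match"})
NOMATCH = frozenset({"no_match"})
THETA = MATCH | NOMATCH  # frame of discernment: mass on THETA = ignorance

# Hypothetical per-candidate masses from the RGB and point-cloud trackers.
m_rgb = {MATCH: 0.6, NOMATCH: 0.1, THETA: 0.3}
m_pcd = {MATCH: 0.5, NOMATCH: 0.2, THETA: 0.3}

fused = combine_bba(m_rgb, m_pcd)
```

When both modalities lean toward "match", the fused belief in "match" exceeds either single-modality mass, which is how fusion compensates for a weak or failing modality.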
Highlights
Object tracking is an important task in various research areas, such as surveillance, sports analysis, human-computer interaction and autonomous driving systems
We evaluated the proposed method by comparing it with intra-varied versions of the multiple-object fusion tracker (MOFT) to validate the architecture choices made for MOFT
The matching target experiment was conducted to validate the suitability of the matching between target objects in frame k and candidates in frame k + 1 for multiple objects tracking (MOT) tasks
Summary
Object tracking is an important task in various research areas, such as surveillance, sports analysis, human-computer interaction and autonomous driving systems. Various forms of tracking are being actively researched; these include multiple objects tracking (MOT), tracking using multiple sensors and model-free tracking. Despite much success, MOT techniques still face various challenges caused by illumination and scale changes, occlusion and other disturbance factors. Affine transformation [1], illumination invariance [2] and occlusion detection [3,4] have been widely applied to trackers to deal with such disturbances. While trackers with embedded distortion handlers are able to overcome a specific disturbing factor, tracking may still fail when other distortions are introduced. Another way to maintain tracking performance despite distortions is to adaptively train the appearance (and/or motion) model of the object tracker online.