Abstract

Estimating an algorithm’s performance on a particular input image or video is difficult for a computer but quite easy for a human observer. Humans can assess an algorithm’s ability to produce a positive outcome for a given input from a small number of observations, because they learn high-level cognitive associations from input-output observations. Simulating this process can automate algorithm performance assessment and, eventually, the prediction of algorithm performance for a given input image. In any computer vision system, predicting the performance of a vision algorithm can prove valuable in optimizing the overall success of the application. In this paper, we propose a framework for predicting the performance of a vision algorithm given the input image or video, so as to maximize the algorithm’s ability to provide the desired output. This is achieved by modeling performance prediction as a process that accounts for the algorithm’s behavioral properties as well as the quality of its input. A prototype system is designed to predict a vision algorithm’s performance from the quality of the input image or video, and its application to an optimal algorithm selection process is demonstrated. This system can be considered an intelligent system that combines the quality of the algorithm’s input with knowledge-based prediction of algorithm performance. Performance evaluation is used to obtain knowledge about the algorithm’s behavioral properties [3], and intelligent system design is used to apply that knowledge. The quality of a frame is expressed as a function, or a combination of functions, of image features that represent degradations present in the video frames. An algorithm’s performance is characterized by performance metrics that capture its ability to provide a desired result. The framework shown in Figure 1 contains three main modules: a quality extractor, a performance evaluator, and a predictor.
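As a rough illustration of expressing frame quality as a function of image features, the sketch below computes one plausible feature: the Shannon entropy of a frame's gradient-magnitude histogram. The function name `edge_entropy`, the bin count, and this particular definition are assumptions made for illustration; the abstract does not specify how its quality features are computed.

```python
import numpy as np

def edge_entropy(frame, bins=32):
    """Shannon entropy (in bits) of the gradient-magnitude histogram.

    Hypothetical quality feature for illustration only; `frame` is a
    2-D grayscale array. This is not the paper's stated definition.
    """
    # Finite-difference gradients along rows (gy) and columns (gx).
    gy, gx = np.gradient(frame.astype(float))
    magnitude = np.hypot(gx, gy)

    # Histogram of edge strengths, normalized to a probability distribution.
    hist, _ = np.histogram(magnitude, bins=bins)
    p = hist[hist > 0] / hist.sum()

    # Entropy is 0 for a flat frame and at most log2(bins).
    return float(-(p * np.log2(p)).sum())
```

A flat frame has a single occupied histogram bin and therefore zero entropy, while textured or noisy frames spread edge strengths over more bins and score higher.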
The quality extractor quantifies the degradations in the input image that affect the vision algorithm’s performance. The performance evaluator evaluates the algorithm’s performance on a set of training input data with varying levels of image quality in order to capture its behavioral properties. In essence, the performance evaluator simulates the algorithm’s perception of the input image. Finally, the predictor combines information from the quality extractor and the performance evaluator. It provides a mechanism to automatically acquire, store, and utilize knowledge about the algorithm’s perception of image quality. The framework is prototyped using three object tracking algorithms used in video surveillance applications. During the training phase, the input video is processed by each algorithm, and performance evaluation of the results yields the performance metrics. A performance metric represents the quality of the algorithm’s performance on the associated input. Image quality measures are also calculated for every input frame in the video. Both the image quality measures and the performance measures are used to design the predictor, which allows the needed knowledge to be represented and captured. When the predictor encounters a new input video, the knowledge learned during training and the image quality are used to predict each algorithm’s ability to succeed, without actually executing the algorithm. The algorithm expected to achieve maximum success is selected. The signal activity measure proposed in [4], edge entropy, and the structural similarity index metric (SSIM) proposed in [5] are used to quantify image quality. Multiple-object tracker performance evaluation is used to quantify a tracker’s performance in terms of performance metrics; the systematic and objective evaluation of tracker characteristics proposed in [2] is used for this purpose. The tracking algorithms used are Uniform Motion Connected Component tracking, Mean Shift tracking, and Particle Filter tracking.
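The train-then-select procedure described above can be sketched as follows. The abstract does not say which model the predictor uses, so this sketch substitutes a per-tracker linear least-squares fit from quality features to a scalar performance metric; the function names `fit_predictor` and `select_tracker` and the data layout are hypothetical.

```python
import numpy as np

def fit_predictor(quality, performance):
    """Fit one linear model per tracker: performance ~ quality features.

    quality:     (n_samples, n_features) image quality measures
    performance: dict mapping tracker label -> (n_samples,) metric values
    Linear least squares is an assumed stand-in for the paper's predictor.
    """
    quality = np.asarray(quality, dtype=float)
    # Append a constant column so each model has a bias term.
    X = np.hstack([quality, np.ones((quality.shape[0], 1))])
    return {
        name: np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)[0]
        for name, y in performance.items()
    }

def select_tracker(models, quality_vec):
    """Predict each tracker's performance and return the best label.

    No tracker is executed here; only the learned models are evaluated.
    """
    x = np.append(np.asarray(quality_vec, dtype=float), 1.0)
    predicted = {name: float(x @ w) for name, w in models.items()}
    best = max(predicted, key=predicted.get)
    return best, predicted
```

With training data in which one tracker improves as quality rises and another degrades, `select_tracker` switches its choice as the quality value of a new input changes, which is the selection behavior the framework aims for.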
In order to observe and learn the effect of degradations in the input data on the algorithm’s performance, we use 12 videos from the INRIA data set [1]. Table 1 shows the performance of each individual tracker, and of the prediction-based tracker selection, on the 12 test sequences. Labels CC, MS and PF represent the Connected Component, Mean Shift, and Particle Filter trackers, respectively.

Figure 1: Performance Prediction Learning Framework.
