The performance of a semantic concept detection method depends on the selection of the low-level visual features used to represent the key-frames of a shot and on the selection of the feature-fusion method. This paper proposes a set of low-level visual features of considerably smaller size, together with novel ‘hybrid-fusion’ and ‘mixed-hybrid-fusion’ approaches, which are formulated by combining the early-fusion and late-fusion strategies proposed in the literature. In the initially proposed hybrid-fusion approach, the features from the same feature group are combined using early-fusion before classifier training, and the concept probability scores from multiple classifiers are merged using late-fusion to obtain the final detection scores. A feature group is defined as the set of features from the same feature family, such as color moment. The hybrid-fusion approach is then refined into the ‘mixed-hybrid-fusion’ approach to further improve the detection rate. Using the proposed mixed-hybrid-fusion approach, this paper presents a novel video concept detection system for multi-label data. Support Vector Machine (SVM) is used to build classifiers that produce concept probabilities for a test frame. The proposed approaches are evaluated on the multi-label TRECVID2007 development dataset. Experimental results show that the proposed mixed-hybrid-fusion approach performs better than the proposed hybrid-fusion approach and outperforms all conventional early-fusion and late-fusion approaches by large margins with respect to feature set dimensionality and Mean Average Precision (MAP) values.
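
The hybrid-fusion idea described above (early-fusion within a feature group, then late-fusion across the per-group classifiers) can be illustrated with a minimal sketch. This is not the authors' implementation: the late-fusion rule (a plain average of probabilities), the `texture` group, the feature dimensions, and the stand-in logistic scorers used in place of trained SVM classifiers are all assumptions made for illustration.

```python
import numpy as np


def early_fusion(feature_arrays):
    """Concatenate the features of one group along the feature axis."""
    return np.concatenate(feature_arrays, axis=1)


def late_fusion(score_arrays):
    """Merge per-classifier concept probabilities by averaging (assumed rule)."""
    return np.mean(np.stack(score_arrays, axis=0), axis=0)


def hybrid_fusion(feature_groups, classifiers):
    """Early-fuse each feature group, score it, then late-fuse the scores.

    feature_groups: dict mapping group name -> list of (n_frames, d_i) arrays
    classifiers:    dict mapping group name -> callable that returns
                    (n_frames, n_concepts) concept probabilities
    """
    group_scores = []
    for group_name, feature_arrays in feature_groups.items():
        fused = early_fusion(feature_arrays)
        group_scores.append(classifiers[group_name](fused))
    return late_fusion(group_scores)


# Toy data: 4 key-frames, 3 concepts, two hypothetical feature groups.
rng = np.random.default_rng(0)
n_frames, n_concepts = 4, 3
feature_groups = {
    "color_moment": [rng.random((n_frames, 6)), rng.random((n_frames, 9))],
    "texture": [rng.random((n_frames, 8))],
}


def make_stand_in_classifier(input_dim):
    """Return a random logistic scorer; a placeholder for a trained SVM."""
    weights = rng.normal(size=(input_dim, n_concepts))

    def predict_proba(x):
        return 1.0 / (1.0 + np.exp(-(x @ weights)))

    return predict_proba


classifiers = {
    "color_moment": make_stand_in_classifier(15),  # 6 + 9 fused dimensions
    "texture": make_stand_in_classifier(8),
}

scores = hybrid_fusion(feature_groups, classifiers)
print(scores.shape)  # one probability per key-frame and concept
```

In this sketch each group yields one classifier, so the number of classifiers grows with the number of feature groups rather than with the number of individual features, while each classifier's input dimension stays limited to the size of its own group.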