Abstract

We investigate several problems in annotating video shots with semantic labels that are implicitly embedded in a semantic hierarchy, leading to analyses and novel methods for refining video ontologies and their ground truth. First, in the large LSCOM data set of 449 semantic concepts, we show that within the implicit "use ontology", many concept tags are ambiguous as to purposeful activity, visual scope, or social agency, or are absent altogether, but that better "use senses" can be refined algorithmically. Second, we find that both traditional hard and fuzzy k-medoid clustering techniques are inadequate for hierarchical concepts, but that a novel "firm k-medoid" clustering method both separates clusters and distributes superconcepts equitably. Third, we show how the scores of SVM semantic filters can be converted to probabilities more reliably and quickly by using a closed-form approximation to SVM behavior between its margins. Fourth, we show that the quality of SVM semantic filters for hierarchical concepts can be analyzed by their ability to separate their positive ground truth examples from those of any other concept in the hierarchy; the most discriminating filters are those whose ground truth shows distinctive physical backgrounds.
