Abstract

We study computational models and techniques for merging textual and image features to classify multimedia documents into semantically meaningful groups. A vector-based framework indexes documents on the basis of textual, pictorial, and composite (textual-pictorial) information. The scheme combines weighted document terms with color-invariant image features to obtain a high-dimensional image descriptor in vector form, which serves as the index. A classifier trained by supervised learning then organizes the multimedia documents. Because of space limitations, we focus on one application: classifying and finding pictures of people on the Internet. We report performance evaluations of the classification accuracy obtained by merging textual and pictorial information.
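To make the composite indexing idea concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes TF-IDF as the term weighting, a normalized (r, g) chromaticity histogram as the color-invariant feature, concatenation of the two L2-normalized parts as the merging step, and a 1-nearest-neighbor rule as the supervised classifier. All function names, the vocabulary, the pixel values, and the labels are hypothetical.

```python
import math
from collections import Counter


def tfidf_vector(tokens, vocabulary, doc_freq, n_docs):
    """Weighted term vector: term frequency times a smoothed inverse document frequency."""
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return [
        (counts[t] / total) * math.log((1 + n_docs) / (1 + doc_freq.get(t, 0)))
        for t in vocabulary
    ]


def chromaticity_histogram(pixels, bins=4):
    """Histogram over normalized (r, g) chromaticity.

    Dividing by r + g + b removes uniform intensity scaling, so a brighter or
    darker copy of the same colors yields the same histogram (assumed feature).
    """
    hist = [0.0] * (bins * bins)
    for r, g, b in pixels:
        s = r + g + b
        if s == 0:
            continue  # black pixels carry no chromaticity information
        ri = min(int(r / s * bins), bins - 1)
        gi = min(int(g / s * bins), bins - 1)
        hist[ri * bins + gi] += 1.0
    n = sum(hist)
    return [h / n for h in hist] if n else hist


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def composite_vector(text_vec, image_vec):
    """Concatenate the normalized textual and pictorial parts into one index vector."""
    return l2_normalize(text_vec) + l2_normalize(image_vec)


def nearest_neighbor(query, labeled_vectors):
    """Return the label of the training vector closest to the query (squared Euclidean)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(labeled_vectors, key=lambda item: dist(query, item[0]))[1]


# Hypothetical toy data: two labeled documents and one query document.
vocabulary = ["portrait", "family", "mountain", "lake"]
doc_freq = {"portrait": 1, "family": 1, "mountain": 1, "lake": 1}
n_docs = 2


def index(tokens, pixels):
    return composite_vector(
        tfidf_vector(tokens, vocabulary, doc_freq, n_docs),
        chromaticity_histogram(pixels),
    )


training = [
    (index(["family", "portrait"], [(200, 150, 120)] * 3), "people"),
    (index(["mountain", "lake"], [(60, 120, 200)] * 3), "other"),
]
query = index(["portrait", "smiling"], [(210, 160, 130)])
predicted = nearest_neighbor(query, training)
```

Normalizing each part before concatenation keeps either modality from dominating the distance purely because of its scale; a weighted combination would be an equally plausible design choice.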
