Abstract

This paper proposes a two-level image retrieval method that combines text-based and image-based retrieval to overcome the limitations of using only text or only an image as the query. Text-based retrieval involves three main steps. First, the text and documents are pre-processed to remove affixes, punctuation, and stop words, and the remaining words are used to build a dictionary. Second, each word in the dictionary is weighted by its frequency in the text or document using the Term Frequency-Inverse Document Frequency (TF-IDF) model. Third, the similarity between the text query and each text document is calculated using cosine similarity. Image-based retrieval involves two main steps. First, features are extracted using the Integrated Color Intensity Co-occurrence Matrix, which yields texture and color features at once. Second, the similarity between the query image and each image in the database is calculated using the Manhattan distance. The experiments use social media data from Twitter, consisting of Indonesian tweets and users. They compare image retrieval using a text query, an image query, and a combination of both. The results show that combining text-based and image-based retrieval achieves the highest accuracy compared with text-based or image-based retrieval alone.
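
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the tokens are taken to be already stemmed and stripped of stop words, the image feature vectors are hypothetical stand-ins for Integrated Color Intensity Co-occurrence Matrix features, and the two levels are assumed to be a text-based shortlist followed by image-based re-ranking, which the abstract does not spell out. The function names and the parameter `k` are illustrative only.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in docs
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(u, v))


def two_level_search(query_tokens, query_features, docs, features, k=2):
    """Assumed two-level scheme: text shortlist, then image re-ranking."""
    vectors, idf = tfidf_vectors(docs)
    # Weight the query with the collection's IDF; unseen terms get weight 0.
    query_vec = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query_tokens).items()}
    # Level 1: keep the k documents most similar to the text query.
    shortlist = sorted(
        range(len(docs)),
        key=lambda i: cosine(query_vec, vectors[i]),
        reverse=True,
    )[:k]
    # Level 2: order the shortlist by image feature distance (smaller is closer).
    return sorted(shortlist, key=lambda i: manhattan(query_features, features[i]))


# Toy data: pre-processed Indonesian tokens and made-up 2-D image features.
docs = [["kucing", "lucu"], ["kucing", "tidur"], ["mobil", "baru"]]
features = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
print(two_level_search(["kucing"], [0.1, 0.9], docs, features, k=2))
```

In this toy example, the text level shortlists the two tweets that mention "kucing", and the image level then places the one whose features are closest to the query image first. A weighted fusion of the two scores would be another plausible reading of "combination"; the shortlist-then-re-rank design is chosen here only because it matches the "two-level" wording.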
