What Can Pictures Tell Us About Web Pages? Improving Document Search Using Images.

Sergio Rodriguez-Vaamonde, Lorenzo Torresani, Andrew W Fitzgibbon

doi:10.1109/tpami.2014.2366761

Open Access

https://doi.org/10.1109/tpami.2014.2366761


Abstract

Traditional Web search engines do not use the images in HTML pages to find relevant documents for a given query. Instead, they typically compute a measure of agreement between the user's keywords and only the text portion of each page. In this paper, we study whether the content of the pictures appearing in a Web page can be used to enrich the semantic description of an HTML document and consequently boost the performance of a keyword-based search engine. We present a Web-scalable system that uses a purely text-based search engine to find an initial set of candidate documents for a given query. The candidate set is then reranked using visual information extracted from the images contained in the pages. The resulting system retains the computational efficiency of traditional text-based search engines, with only a small additional storage cost needed to encode the visual information. We test our approach on one of the TREC Million Query Track benchmarks, where we show that exploiting visual content improves accuracy for two distinct text-based search engines, including the system with the best reported performance on this benchmark. We further validate our approach by collecting document relevance judgements on our search results using Amazon Mechanical Turk. The results of this experiment confirm the improvement in accuracy produced by our image-based reranker over a purely text-based system.
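
The two-stage design described in the abstract (text retrieval first, then image-based reranking of the candidates) can be illustrated with a minimal sketch. The abstract does not say how textual and visual evidence are combined, so the linear mix below is an assumption made purely for illustration, not the authors' method. The names `Candidate`, `rerank`, `visual_score`, and `weight`, along with all the numeric scores, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    doc_id: str
    text_score: float    # score returned by the text-only search engine
    visual_score: float  # hypothetical query-to-image relevance in [0, 1]


def rerank(candidates, weight=0.3):
    """Reorder text-retrieved candidates by a weighted mix of both scores.

    The convex combination is an assumed stand-in for whatever reranking
    model a real system would learn; weight=0.0 reproduces the text-only order.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    def combined(c):
        return (1.0 - weight) * c.text_score + weight * c.visual_score

    return sorted(candidates, key=combined, reverse=True)


# Stage 1 (assumed already done): the text engine returned these candidates.
candidates = [
    Candidate("doc_a", text_score=0.80, visual_score=0.10),
    Candidate("doc_b", text_score=0.75, visual_score=0.90),
    Candidate("doc_c", text_score=0.40, visual_score=0.95),
]

# Stage 2: rerank only the candidate set, so the extra cost stays small.
ranked_ids = [c.doc_id for c in rerank(candidates)]
text_only_ids = [c.doc_id for c in rerank(candidates, weight=0.0)]
```

Because only the candidate set is rescored, the visual stage adds work proportional to the number of candidates rather than to the size of the whole collection, which is consistent with the efficiency claim in the abstract.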

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication Date: Jun 1, 2015
Citations: 30
License type: publisher-specific, author manuscript

Similar Papers

What can pictures tell us about web pages?
Sergio Rodriguez-Vaamonde ... Andrew Fitzgibbon
28 Jul 2013

Developing a fuzzy search engine based on fuzzy ontology and semantic search
Lien-Fu Lai ... Pei-Ying Lin
01 Jun 2011

Korean Word Sense Disambiguation Using a Dictionary and a Corpus
Hanjo Jeong ... Byeonghwa Park
Journal of Intelligence and Information Systems | VOL. 21
31 Mar 2015

Fuzzy Semantic Search Engine
Dharmish Shah ... Sindhu Nair
International Journal of Computer Applications | VOL. 107
18 Dec 2014
