Abstract

Because of the huge amount of data published on the Web, web search has become more difficult, and it is sometimes hard to get the expected results, especially when users are uncertain about their information needs. Several approaches have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from external knowledge resources. However, these solutions have not been well explored for general web search in an open-domain setting. In addition, they mostly focus on supporting search over content expressed in English and Latin-based languages. In this research, we propose a fully automated approach that aims to support exploratory search over Arabic web content. It exploits the Arabic version of Wikipedia to extract complementary information that supports visual representation and deeper exploration of the search engine's results. Key Wikipedia entities are extracted from the text snippets produced by the search engine in response to the user's query. The entities are then filtered and ranked using a novel ranking algorithm that extends the conventional PageRank algorithm. Finally, a graph is built and presented to the user to visually represent the highly ranked topics and their relationships. The proposed approach was realized by developing ArabXplore, a system that integrates with the web browser and supports the web search process by executing our approach at query time. It was evaluated on a dataset of 100 Arabic search queries covering different domains, and the results were rated by human subjects. The underlying ranking algorithm was also compared with conventional PageRank.
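For orientation, the conventional PageRank baseline mentioned above can be sketched as follows. This is a minimal illustration only: the abstract does not say how the proposed algorithm extends PageRank, so no extension is shown. The function name `pagerank`, the damping value of 0.85, the iteration count, and the small entity graph are all assumptions made for this example and are not taken from ArabXplore.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Conventional PageRank by power iteration.

    graph maps each node to the list of nodes it links to.
    Every link target must also appear as a key of graph.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph[node]
            if targets:
                # Split this node's damped rank evenly among its out-links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its damped rank over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical entity graph; an edge stands for a link between two entities.
entity_graph = {
    "Cairo": ["Egypt", "Nile"],
    "Egypt": ["Cairo", "Nile"],
    "Nile": ["Egypt"],
    "Giza": ["Cairo", "Egypt"],
}
ranks = pagerank(entity_graph)
top_entities = sorted(ranks, key=ranks.get, reverse=True)
```

In a pipeline like the one described, the nodes would be the Wikipedia entities found in the search snippets, and the highest-scoring entries of `top_entities` would be candidates for the graph shown to the user.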

