Abstract

Most Arabs can read text written in Modern Standard Arabic (MSA). To express themselves, however, many find it easier to switch to informal (colloquial) Arabic. The web lets anyone express themselves freely, and people increasingly do so in their native dialects on social media platforms such as blogs and forums. Search engines handle queries in MSA well, but perform worse when the query is written in colloquial Arabic. This paper addresses two issues. First, many younger Arabs find it hard to write in MSA, so many results are missed because their queries are poorly formed. Second, a query written in MSA will not retrieve documents written in colloquial Arabic. Thus, to make the web universally accessible to all Arabic users, we need an effective mechanism that translates queries back and forth between MSA and the variety of dialects spoken throughout the Arab countries. As a case study, we investigate one of the local dialects of Saudi Arabia, a leading country in social media usage, much of which is in colloquial language. We present a web information retrieval system for Arabic that addresses this concern. To test the proposed method, we compiled a corpus of over fourteen hundred documents and measured the performance of our system using 50 sample queries, achieving an average recall and precision of 93.4% and 83.6%, respectively.
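
The idea of translating a query in both directions between MSA and a dialect, then scoring retrieval with precision and recall, can be illustrated with a minimal sketch. It is not the system described in the paper, whose internals the abstract does not specify. The lexicon entries, the whitespace tokenizer, and the function names (expand_query, retrieve, precision_recall) are all assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical MSA -> colloquial lexicon. The entries are illustrative
# assumptions, not the lexicon used in the paper.
MSA_TO_COLLOQUIAL: Dict[str, List[str]] = {
    "ماذا": ["وش", "ايش"],   # "what"
    "أريد": ["أبغى"],        # "I want"
    "الآن": ["الحين"],       # "now"
}

# Reverse index so that a colloquial term can be mapped back to MSA.
COLLOQUIAL_TO_MSA: Dict[str, str] = {
    colloquial: msa
    for msa, variants in MSA_TO_COLLOQUIAL.items()
    for colloquial in variants
}


def expand_query(query: str) -> Set[str]:
    """Return the query terms plus their MSA and colloquial equivalents."""
    terms: Set[str] = set()
    for token in query.split():  # naive whitespace tokenization (assumption)
        terms.add(token)
        # MSA term: add its colloquial variants.
        terms.update(MSA_TO_COLLOQUIAL.get(token, []))
        # Colloquial term: add the MSA form and its sibling variants.
        if token in COLLOQUIAL_TO_MSA:
            msa = COLLOQUIAL_TO_MSA[token]
            terms.add(msa)
            terms.update(MSA_TO_COLLOQUIAL[msa])
    return terms


def retrieve(query: str, docs: Dict[str, str]) -> Set[str]:
    """Return ids of documents sharing at least one term with the expanded query."""
    terms = expand_query(query)
    return {doc_id for doc_id, text in docs.items() if terms & set(text.split())}


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Averaging the two values returned by precision_recall over a set of test queries would give figures comparable in kind to the averages reported above. A practical system would likely also need orthographic normalization and morphological analysis, which this sketch omits.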
