Abstract

Problem statement: Identifying collocations is an important part of natural language processing applications that require some degree of semantic interpretation, such as machine translation, information retrieval and text summarization. Because of the complexity of Arabic, collocations undergo morphological, graphical and syntactic variation, which makes them difficult to identify. Approach: We used a hybrid method, based on linguistic information and association measures, to extract collocations from an Arabic corpus. Results: The method extracted bi-gram collocation candidates from the corpus, and the association measures were evaluated with the n-best evaluation method. We report the precision of each association measure for each n-best list. Conclusion: The experimental results showed that the log-likelihood ratio was the best association measure, achieving the highest precision.
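The scoring and evaluation steps named in the abstract can be illustrated with a minimal sketch. It assumes the standard Dunning-style log-likelihood ratio over a 2x2 bigram contingency table and defines n-best precision as the share of the top n ranked candidates that appear in a manually validated set. The function names, the contingency-table layout and the `gold` set are illustrative assumptions, not the authors' implementation.

```python
import math


def log_likelihood_ratio(o11, o12, o21, o22):
    """Dunning-style G^2 score for a 2x2 bigram contingency table.

    o11: count of (w1, w2)         o12: count of (w1, not w2)
    o21: count of (not w1, w2)     o22: count of (not w1, not w2)
    This cell layout is an assumption made for illustration.
    """
    n = o11 + o12 + o21 + o22
    if n == 0:
        raise ValueError("contingency table is empty")
    rows = (o11 + o12, o21 + o22)
    cols = (o11 + o21, o12 + o22)
    observed = ((o11, o12), (o21, o22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def n_best_precision(ranked_candidates, gold, n):
    """Fraction of the top-n ranked candidates found in the gold set."""
    top = ranked_candidates[:n]
    if not top:
        return 0.0
    return sum(1 for cand in top if cand in gold) / len(top)


# Toy usage with placeholder candidates (not data from the paper).
tables = {("w1", "w2"): (30, 5, 5, 60), ("w3", "w4"): (10, 10, 10, 10)}
ranked = sorted(tables, key=lambda c: log_likelihood_ratio(*tables[c]), reverse=True)
precision_at_1 = n_best_precision(ranked, gold={("w1", "w2")}, n=1)
```

Ranking every candidate by each association measure and computing this precision for several values of n gives one precision value per measure per n-best list, which is the kind of comparison the abstract reports.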

Highlights

  • Collocation is a linguistic phenomenon found in all human languages. It plays an important part in many applications, such as machine translation, information retrieval, word sense disambiguation and lexicography

  • There is no freely available Arabic corpus to use for collocation extraction

  • The third association measure is Pointwise Mutual Information, which Zhang et al. (2009) used as an association measure to rank collocation candidates
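As a rough illustration of the Pointwise Mutual Information measure mentioned in the last highlight, the sketch below uses the textbook definition, PMI(x, y) = log2(P(x, y) / (P(x) P(y))), with probabilities estimated from raw counts. The function name and parameters are assumptions for illustration. The exact variant used in the paper or by Zhang et al. (2009) is not given in this excerpt.

```python
import math


def pmi(bigram_count, w1_count, w2_count, total):
    """Pointwise Mutual Information of a bigram, in bits.

    Probabilities are maximum-likelihood estimates over `total` tokens,
    so P(x, y) / (P(x) * P(y)) simplifies to the ratio computed below.
    All counts are assumed to be positive.
    """
    if min(bigram_count, w1_count, w2_count, total) <= 0:
        raise ValueError("counts must be positive")
    return math.log2(bigram_count * total / (w1_count * w2_count))
```

A score of 0 means the two words co-occur exactly as often as independence would predict. Positive scores indicate co-occurrence above chance, which is what makes the measure usable for ranking candidates.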


Summary

Introduction

Collocation is a linguistic phenomenon found in all human languages. Evert defined a collocation as “A word combination whose semantic and/or syntactic properties cannot be fully predicted from those of its components and which has to be listed in a lexicon” (Evert, 2004). Smadja (1993) considered collocations to be “recurrent combinations of words that co-occur more often than expected by chance and that correspond to arbitrary word usages”. There are two verbs, ‘commit’ and ‘perpetrate’, which can combine with this noun to indicate the action. The same holds in Arabic.
