Abstract

Contemporary developments in information and communication technology (ICT), coupled with relatively easy access to the Internet as a global repository of most of the information produced by humanity today, are creating an apparent information glut. Given the present volume, velocity and variety of available information, managing it manually has become a practical impossibility. This has resulted in a need to automate the retrieval of unstructured information stored in texts such as prose and poetry, as opposed to the structured information stored in databases and look-up tables. These unstructured information sources are written in natural language, so the language in which their information is stored must be taken into account. All human languages share vital commonalities, yet each also has equally important unique elements. Methods for the retrieval of unstructured information stored in many European and Asian languages have been developed and are in productive use. Even though Africa contributes about a third of the languages spoken in the world today, very few, if any, African languages have benefited in any significant way from the automatic retrieval of information from unstructured information sources. This study addresses the issue of information retrieval from Yorùbá prose. The popular “term frequency-inverse document frequency” model for the automatic extraction of index terms from unstructured text was applied to the book Aké: Ní Ìgbà Èwe, the Yorùbá (Niger-Congo) translation of Aké: The Years of Childhood, written by Wole Soyinka and translated by Akínwùmí Ìsòlá.
The liberal use of compounding in Yorùbá morphology was found to produce frequency distortions that would demand attention if the “term frequency-inverse document frequency” model is to be used for the automatic extraction of index terms from unstructured information sources written in Yorùbá.

Keywords

Information retrieval, Inverse document frequency, Term frequency, Yorùbá morphology, Compounding
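For readers unfamiliar with the “term frequency-inverse document frequency” model named in the abstract, the following is a minimal sketch of one common variant, not the study's actual implementation. The abstract does not state which weighting formula, tokenizer, or document segmentation was used, so everything here is an assumption: the function name `tf_idf`, whitespace tokenization, length-normalized term frequency, the unsmoothed `log(N / df)` inverse document frequency, and the placeholder documents. Unicode NFC normalization is applied first so that the same accented word, however its diacritics are encoded, is counted as one term.

```python
import math
import unicodedata
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: weight = (count / document length) * log(N / df),
    where N is the number of documents and df is the number of
    documents containing the term.
    """
    # Normalize to NFC so precomposed and combining diacritics match,
    # then lowercase and split on whitespace (a simplifying assumption).
    tokenized = [
        unicodedata.normalize("NFC", doc).lower().split()
        for doc in documents
    ]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    # Placeholder documents, purely illustrative (not taken from the book).
    toy_documents = ["a b a", "b c", "a c c d"]
    for index, doc_weights in enumerate(tf_idf(toy_documents)):
        ranked = sorted(doc_weights.items(), key=lambda kv: kv[1], reverse=True)
        print(index, ranked[:2])
```

Because this sketch treats every whitespace-delimited token as a separate term, a compound and its component words are counted independently of one another. That may be one way in which heavy compounding could skew the frequencies, although the abstract does not describe the mechanism.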
