Implemented Stemming Algorithms for Information Retrieval Applications

Wubetu Barud Demilie

doi:10.7176/ikm/10-3-01

Abstract

Text documents are increasingly distributed over the internet through e-mail and web pages. As internet use grows exponentially, the need for massive data storage keeps increasing. Many documents contain morphological variants of words, so stemming, a preprocessing technique, maps the different morphological variants of a word to a common base form called the stem. Information retrieval applications use stemming to improve retrieval performance, on the assumption that terms with the same stem usually have similar meanings. Stemming bulky documents requires more computation time and power in order to keep up with the need to search for a particular word in the data. This paper analyzes various stemming algorithms, together with the benefits and limitations of recent stemming approaches.

Keywords: Natural Language Processing Applications, Information Retrieval, Information Retrieval Applications (IRAs), Stemming Approaches

Publication date: April 30th, 2020

Highlights

In all information retrieval applications, the main goal is to improve recall and precision.
Stemming is a preprocessing step in text mining applications and a very common requirement of natural language processing functions.
The capacity of search databases has increased in the last few years, so natural language application algorithms need to be sped up to meet the challenge of real-time search.

Summary

Introduction

In all information retrieval applications, the main goal is to improve recall and precision. The capacity of search databases has increased in the last few years, so natural language application algorithms need to be sped up to meet the challenge of real-time search. The texts being searched typically contain many syntactic variants of the same word. For example, connected, connecting, connection, connectedly, connectedness, connectively, connectional, connective, connectable (adjective), and connector (noun) are all derived from the root word "connect" (Tesfaye 2010; Tesfaye n.d.).

Successor Variety Approach

According to Sousa and Castro (n.d.), successor variety is one of the stemming approaches used in natural language processing applications, especially in information retrieval systems. In this approach, the successor variety of a string is the number of different characters that follow the string in the words of a corpus (the body of text).

If a B-tree or hash table is used, lookup is fast, but such a table carries a storage overhead (Bellovin and Rescorla 2005).
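
The successor variety definition above can be illustrated with a short Python sketch. It is not taken from the paper: the function names, the toy corpus, the end-of-word marker `$`, and the "peak" cut-off rule (cut after the first prefix whose variety is higher than that of its neighbours) are all assumptions chosen for illustration. The prefix table is held in a Python dict, which is a hash table, so lookups are fast at the cost of storing every prefix of every corpus word.

```python
from collections import defaultdict


def build_successor_table(corpus_words):
    """Map every prefix seen in the corpus to the set of characters that follow it."""
    successors = defaultdict(set)
    for word in corpus_words:
        for i in range(1, len(word)):
            successors[word[:i]].add(word[i])
        # Assumption: a complete word is followed by an end-of-word marker "$".
        successors[word].add("$")
    return successors


def successor_varieties(word, successors):
    """Return the successor variety of each prefix of `word`, shortest prefix first."""
    return [len(successors.get(word[:i], ())) for i in range(1, len(word) + 1)]


def peak_stem(word, successors):
    """Cut after the first prefix whose variety exceeds both neighbours.

    Hypothetical cut-off rule for illustration; returns the whole word
    when no such peak exists.
    """
    sv = successor_varieties(word, successors)
    for i in range(1, len(sv) - 1):
        if sv[i] > sv[i - 1] and sv[i] > sv[i + 1]:
            return word[: i + 1]
    return word


# Toy corpus built from the "connect" variants mentioned in the text.
corpus = ["connect", "connected", "connecting", "connection", "connector", "connective"]
table = build_successor_table(corpus)

print(successor_varieties("connecting", table))  # [1, 1, 1, 1, 1, 1, 4, 3, 1, 1]
print(peak_stem("connecting", table))            # connect
```

In this toy corpus the prefix "connect" is followed by four different symbols (end of word, "e", "i", "o"), so its variety stands out from the prefixes around it and the word is cut there. A real corpus would be far larger, and other cut-off rules could be substituted for the peak rule.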

N-Gram Method

Conclusion

Journal: Information and Knowledge Management
Publication Date: Apr 1, 2020
License type: cc-by
