Abstract

The growth of the internet has produced a vast and rapidly expanding body of digital information, which makes retrieving relevant information for users a challenging task. The aim of this paper was to examine and evaluate the performance of an information retrieval system through eight experiments that test all the features that can be used in a vector space model. The experiments were compared to identify the best and the worst feature combinations. The combinations are: (tf.idf, stop words, stemming), (tf.idf, no stop words, stemming), (tf.idf, no stop words, no stemming), (tf.idf, stop words, no stemming), (tf, stop words, stemming), (tf, no stop words, stemming), (tf, no stop words, no stemming), and (tf, stop words, no stemming). Results showed that using stop words, stemming, and tf.idf together improves the performance of the system, whereas performance declined when tf was used without stop words and stemming. In addition, the results showed that stop words have a significant effect on the system, while stemming has no noticeable effect, particularly with tf.
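
The eight combinations form a 2 x 2 x 2 grid over term weighting, stop words, and stemming. The sketch below illustrates one way such a grid could be set up in a vector space model. It is not the paper's implementation: the stop list, the `naive_stem` suffix stripper, the `log(N/df)` form of idf, and cosine ranking are all assumptions chosen for illustration, and "stop words" is assumed here to mean filtering tokens against a stop list.

```python
import math
from collections import Counter
from itertools import product

# Toy stop list (an assumption; the abstract does not name the list used).
STOP_WORDS = {"the", "a", "of", "is", "in", "and", "to"}

def naive_stem(token):
    # Crude suffix stripping for illustration only; not a real stemmer.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text, use_stop_words, use_stemming):
    tokens = text.lower().split()
    if use_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    if use_stemming:
        tokens = [naive_stem(t) for t in tokens]
    return tokens

def weight(tokens, idf=None):
    # Raw term frequency, optionally scaled by inverse document frequency.
    tf = Counter(tokens)
    if idf is None:
        return dict(tf)
    return {term: count * idf.get(term, 0.0) for term, count in tf.items()}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank(query, docs, weighting, use_stop_words, use_stemming):
    # Returns document indices ordered from most to least similar to the query.
    doc_tokens = [preprocess(d, use_stop_words, use_stemming) for d in docs]
    idf = None
    if weighting == "tf.idf":
        n = len(docs)
        df = Counter(t for toks in doc_tokens for t in set(toks))
        idf = {t: math.log(n / df_t) for t, df_t in df.items()}
    doc_vecs = [weight(toks, idf) for toks in doc_tokens]
    q_vec = weight(preprocess(query, use_stop_words, use_stemming), idf)
    scores = [cosine(q_vec, d) for d in doc_vecs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

# The eight experimental configurations: weighting x stop words x stemming.
CONFIGS = list(product(["tf.idf", "tf"], [True, False], [True, False]))
```

Looping over `CONFIGS` and calling `rank(query, docs, *config)` for each entry would produce one ranking per experiment, which could then be scored against relevance judgments. On a toy collection, plain tf without stop-word filtering tends to favor documents dominated by very common words, which is in line with the effect the abstract reports for stop words.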
