Abstract

Information retrieval is the process of obtaining and presenting the most relevant information from a large collection of information resources according to the user's need. The tremendous growth of information resources on the Internet makes information retrieval a tedious and difficult task for users. Because of this information overload, better techniques are needed to retrieve the most relevant information from the web. This paper presents an information retrieval system based on the particle swarm optimization (PSO) algorithm. In the presented system, text is extracted from web documents by removing all HTML tags. Stop words and special characters are then removed from the extracted text so that only meaningful content remains. TF-IDF is used for feature selection. PSO is then used to identify and refine the feature set, and the selected features are stored in a database that serves the information retrieval process. On the query side, the input query is converted into several semantically similar query strings. These query strings are compared with the feature sets stored in the database using the cosine similarity function. The most similar text is returned as the output of the information retrieval system.
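The non-PSO stages described in the abstract (tag removal, stop-word and special-character removal, TF-IDF weighting, and cosine-similarity ranking) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the smoothed IDF formula, and the function names `extract_tokens`, `tfidf_vectors`, `cosine`, and `retrieve` are all assumptions introduced here. The PSO-based feature refinement and the expansion of the query into several semantic variants are omitted, because the abstract does not specify how they are carried out.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stop-word list (an assumption; the paper's list is not given).
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "for", "on", "by", "with"}


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def extract_tokens(html):
    """Strip HTML tags, special characters, and stop words; return lowercase tokens."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flushes any buffered text
    text = " ".join(parser.parts).lower()
    return [t for t in re.findall(r"[a-z0-9]+", text) if t not in STOP_WORDS]


def tfidf_vectors(token_lists):
    """Return one {term: weight} dict per document, plus the IDF table."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    # Smoothed IDF (an assumed variant; the abstract does not state the formula).
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({t: (c / total) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query, pages):
    """Return (score, page_index) pairs sorted from most to least similar."""
    vectors, idf = tfidf_vectors([extract_tokens(p) for p in pages])
    q_tokens = extract_tokens(query)
    q_tf = Counter(q_tokens)
    total = len(q_tokens) or 1
    # Query terms that never occur in the collection carry no weight.
    q_vec = {t: (c / total) * idf[t] for t, c in q_tf.items() if t in idf}
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))
```

Calling `retrieve("swarm optimization", pages)` on a list of raw HTML strings returns the page indices ordered from most to least similar to the query. In the system the abstract describes, the TF-IDF features would first be refined by PSO and stored in a database before this comparison takes place.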

Highlights

  • Information retrieval plays a vital role in web search engines, which use it to return the most relevant information for the user's input query.

  • Information retrieval systems are used in many application areas, such as digital libraries, information filtering, recommendation systems, media search, and image retrieval [3].

  • The proposed information retrieval system first accepts web pages as the input dataset for information processing and retrieval.

Summary

INTRODUCTION

The World Wide Web is a collection of many interlinked hypertext documents. It provides a huge amount of information, which is accessed over the Internet using the Hypertext Transfer Protocol. Information retrieval plays a vital role in web search engines, which use it to return the most relevant information for the user's input query. It is a mainstream technique and the foundation of web search engines. Whenever a user needs to access information, the user enters a formal statement into a search engine. This formal statement is known as the search engine's input query. The information resources are then shown to the user in order from most relevant to least relevant [2]. Examples of web search engines include Bing, Yahoo, Google, Excite, and AltaVista. Information retrieval systems are used in many application areas, such as digital libraries, information filtering, recommendation systems, media search, and image retrieval [3].
