Abstract

The proliferation of the Internet has made browsing and searching for information extremely cumbersome. While many search engines return reasonable information, they still fall short by overwhelming users with a multitude of often irrelevant results. This problem has several causes, most notably the user's inability to convey the context of their search. Consequently, search engines must assume a general context when looking for matching pages, forcing users to visit each page in the result list to determine whether it contains the desired result. We believe that the necessity of visiting each page could be removed if the concepts, i.e., the over-arching ideas of the underlying page, could be revealed to the end user. This would require mining the concepts from each referenced page. It is our contention that this could be done automatically, rather than relying on the current convention of requiring the searcher to extract these concepts manually by examining the result links. The ability to mine concepts would be useful not only for finding the appropriate result but also for identifying further relevant pages. We present the Automatic Concept Extraction (ACE) algorithm, which can aid users performing searches with search engines. We discuss ACE both theoretically and in the context of a graphical user interface and implementation, which we have constructed in Java to aid in qualitatively evaluating our algorithm. ACE is found to perform at least as well as, or better than, four other related algorithms that we survey from the literature.
