Abstract

Centralized search engines suffer from excessive server load and limited scalability when handling the massive volume of information on the Internet, and the results returned by general-purpose search engines are often not accurate enough. To address these problems, a Hadoop-based vertical search engine called HVSE was designed and developed. HVSE follows the basic principles of a traditional search engine. It improves on existing algorithms for topic-oriented web crawling, runs in a distributed cluster environment, and uses Lucene and other technologies together with the MapReduce programming model for data processing. Experiments show that HVSE is more efficient than a centralized search engine when processing massive data, and that its retrieval results are more precise than those of a general-purpose search engine.
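The abstract does not describe how HVSE structures its MapReduce jobs. Purely as an illustration of the programming model it names, the following minimal sketch builds an inverted index (term to document IDs) in plain Python, with the map, shuffle, and reduce phases simulated in a single process. The function names and sample documents are hypothetical and are not taken from HVSE; a real deployment would run these phases as Hadoop jobs and hand indexing to Lucene.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(doc_id: str, text: str) -> Iterator[Tuple[str, str]]:
    """Map: emit one (term, doc_id) pair per distinct term in a document."""
    for term in set(text.lower().split()):
        yield term, doc_id


def shuffle(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Shuffle: group all emitted doc IDs by term (done by the framework in Hadoop)."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for term, doc_id in pairs:
        grouped[term].append(doc_id)
    return grouped


def reduce_phase(term: str, doc_ids: List[str]) -> Tuple[str, List[str]]:
    """Reduce: produce the sorted, de-duplicated posting list for one term."""
    return term, sorted(set(doc_ids))


def build_inverted_index(docs: Dict[str, str]) -> Dict[str, List[str]]:
    """Run map, shuffle, and reduce in sequence over an in-memory corpus."""
    pairs = (
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    )
    grouped = shuffle(pairs)
    return dict(reduce_phase(term, ids) for term, ids in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample corpus, for illustration only.
    sample_docs = {
        "d1": "Hadoop search engine",
        "d2": "vertical search",
    }
    for term, postings in sorted(build_inverted_index(sample_docs).items()):
        print(term, postings)
```

Because each map call depends on a single document and each reduce call on a single term, both phases can be spread across cluster nodes, which is the property a distributed design of this kind relies on to reduce the load on any one server.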
