Abstract

The performance of indexing systems is very important for a search engine. Usually, indexing systems on large-scale clusters can provide high search efficiency, but it brings expensive hardwa re costs. The costs would be greatly reduced if a distributed indexing system runs on small-scale clusters connected by the Internet. Two current inverted file partitioning schemes: document partitioning and term partitioning, have their merits individually. A two-tier distributed full-text in dexing system is implemented, which uses document partitioning among the clusters and term partitioning inside each cluster. Our e xperiments show that the system performs well in search efficiency, resource consuming and load balance.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call