Abstract

In this paper we attempt to reduce the number of conditions checked by the plane sweep algorithm, both by storing the frequency of tandem-replicated words and by using non-overlapping iterative neighbor intervals. The essential idea of non-overlapping iterative neighbor search in a document is to focus the search not on the full space of solutions but on a smaller subspace of non-overlapping intervals defined by the solutions. The subspace is defined by the range near the specified minimum-frequency keyword. We repeatedly pick a range and flip the unsatisfied keywords, so that the relevant ranges are detected. The proposed method improves the plane sweep algorithm by efficiently calculating the minimal group of words and enumerating the intervals in a document that contain the minimum-frequency keyword. It decreases the number of comparisons and yields a more efficient search, especially on large volumes of data. Efficiency and reliability are also increased compared with previous versions of the approach.
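As an illustration of storing the frequency of tandem-replicated words, the sketch below collapses consecutive occurrences of the same keyword in a position-sorted occurrence list into a single run with a count. This is one possible reading of the idea, not the paper's implementation; the function name `collapse_tandem_runs` and the tuple layout are assumptions made for the example.

```python
from itertools import groupby


def collapse_tandem_runs(occurrences):
    """Collapse consecutive occurrences of the same keyword into one run.

    occurrences: list of (position, keyword) pairs sorted by position.
    Returns a list of (keyword, first_pos, last_pos, count) tuples
    (a hypothetical layout chosen for this sketch).
    """
    runs = []
    # groupby only groups *adjacent* equal keys, which is exactly a tandem run.
    for keyword, group in groupby(occurrences, key=lambda occ: occ[1]):
        positions = [pos for pos, _ in group]
        runs.append((keyword, positions[0], positions[-1], len(positions)))
    return runs


occurrences = [(1, "a"), (2, "a"), (3, "a"), (5, "b"),
               (8, "a"), (9, "b"), (10, "b")]
runs = collapse_tandem_runs(occurrences)
```

A sweep could then visit one entry per run rather than one per occurrence. Keeping both the first and the last position of each run preserves the two positions that can serve as interval endpoints.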

Highlights

  • The most exceptional search engine would not provide good quality results if the original keywords selected by the user were not suitable

  • The plane sweep algorithm assumes that keywords appearing near one another in a document are related

  • We search all subsets at query time; when a word overlaps itself repeatedly, we count the number of ordered pairs of symbols that are adjacent in the document, and we limit the search space by using the iterated partial search


Summary

INTRODUCTION

The most exceptional search engine would not provide good quality results if the original keywords selected by the user were not suitable. We rank the regions of a document that contain all specified keywords in order of their size; this is called proximity search [4,5,8]. We search all subsets at query time; when a word overlaps itself repeatedly, we count the number of ordered pairs of symbols that are adjacent in the document, and we limit the search space by using the iterated partial search. The search then takes less time, especially on large data sets. The running time of the proposed algorithm is on the order of (n − the frequency of tandem-replicated data) · log k, where n is the frequency of keyword occurrences in a document and k is the number of terms in the query.
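To make the proximity-search setting concrete, here is a minimal Python sketch of a baseline plane sweep that enumerates the minimal intervals containing all k keywords and ranks them by size. It is not the proposed algorithm; the function name `minimal_intervals` and the input layout (one sorted position list per keyword, with all positions distinct) are assumptions made for the example. For brevity the leftmost position is found by a linear scan over k entries; a heap would bring that step down to log k.

```python
import heapq


def minimal_intervals(position_lists):
    """Enumerate minimal intervals that contain every keyword at least once.

    position_lists: k sorted lists of distinct positions, one per keyword.
    Returns a list of (left, right) pairs in sweep order.
    """
    k = len(position_lists)
    if k == 0 or any(not plist for plist in position_lists):
        return []

    # Merge the k sorted lists into one stream of (position, keyword index).
    tagged = [[(pos, i) for pos in plist]
              for i, plist in enumerate(position_lists)]
    last_seen = {}      # keyword index -> most recent position
    prev_left = None    # left endpoint of the last interval emitted
    results = []

    for pos, kw in heapq.merge(*tagged):
        last_seen[kw] = pos
        if len(last_seen) < k:
            continue  # some keyword has not appeared yet
        left = min(last_seen.values())
        # The interval is minimal only when the left endpoint has moved.
        if left != prev_left:
            results.append((left, pos))
            prev_left = left
    return results


intervals = minimal_intervals([[1, 7, 12], [3, 8], [5, 20]])
ranked = sorted(intervals, key=lambda iv: iv[1] - iv[0])
```

Sorting the returned intervals by `right - left` gives the size-ordered ranking that proximity search asks for.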

RELATED WORKS
PROPOSED ALGORITHM
TEST RESULTS
CONCLUSION
