An Algorithm for Optimized Searching Using Non-Overlapping Iterative Neighbor Intervals

Elahe Moghimi Hanjani

doi:10.5121/ijcsea.2012.2505

Abstract

In this paper we attempt to reduce the number of conditions checked by the plane sweep algorithm, both by saving the frequency of tandem replicated words and by using non-overlapping iterative neighbor intervals. The essential idea of non-overlapping iterative neighbor search in a document is to focus the search not on the full space of solutions but on a smaller subspace given by the non-overlapping intervals that the solutions define. The subspace is defined by the range near the specified minimum keyword. We repeatedly pick up a range and flip the unsatisfied keywords, so that the relevant ranges are detected. The proposed method tries to improve the plane sweep algorithm by efficiently calculating the minimal group of words and enumerating the intervals in a document that contain the minimum-frequency keyword. It decreases the number of comparisons and yields the best state of the optimized search algorithm, especially for large volumes of data. Efficiency and reliability are also increased compared with the previous modes of the technical approach.

Highlights

The most exceptional search engine would not provide good quality results if the original keywords selected by the user were not suitable
The plane sweep algorithm assumes that keywords appearing near one another in a document are related
We search all subsets at query time; when a word repeatedly overlaps itself, we count the number of ordered pairs of symbols that are adjacent in the document and use iterated partial search to limit the search space

Summary

INTRODUCTION

The most exceptional search engine would not provide good quality results if the original keywords selected by the user were not suitable. We rank the regions of a document that contain all specified keywords in order of their sizes; this is called proximity search [4,5,8]. We search all subsets at query time; when a word repeatedly overlaps itself, we count the number of ordered pairs of symbols that are adjacent in the document and use iterated partial search to limit the search space. The search therefore takes less time, especially on large volumes of stored data. The proposed algorithm runs in O((n − r) log k) time, where n is the number of keyword occurrences in a document, r is the frequency of tandem replicated data, and k is the number of query terms in a query.
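To make the idea of proximity search concrete, the following is a minimal Python sketch of a plane-sweep style scan that enumerates the minimal intervals containing every query keyword. It is an illustration under stated assumptions, not the paper's implementation: the function name `minimal_intervals`, the input layout (a dictionary from keyword to sorted positions), and the rule for skipping tandem repeats are all hypothetical. The sketch treats a tandem repeat as an occurrence of the same keyword that immediately follows another occurrence of it in the merged list, which is only one possible reading of the text. For brevity it finds the leftmost keyword with a linear scan over the k latest positions rather than a heap, so it does not attain the logarithmic factor quoted above.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]


def minimal_intervals(positions: Dict[str, List[int]]) -> Tuple[List[Interval], int]:
    """Return (minimal intervals covering all keywords, number of skipped tandem repeats).

    `positions` maps each query keyword to its word offsets in the document.
    Offsets are assumed to be distinct across keywords.
    """
    # Merge every occurrence into one list ordered by position (the sweep order).
    merged = sorted((p, w) for w, plist in positions.items() for p in plist)
    k = len(positions)
    last: Dict[str, int] = {}          # latest position seen for each keyword
    intervals: List[Interval] = []
    skipped = 0
    prev_word: Optional[str] = None

    for p, w in merged:
        if w == prev_word and k > 1:
            # Tandem repeat: the previous occurrence of w is the most recent
            # position overall, so w cannot be the leftmost keyword and no new
            # minimal interval can end here. Record the position and move on.
            last[w] = p
            skipped += 1
            continue
        prev_word = w
        old = last.get(w)
        # A minimal interval ends at p only if w was unseen so far or its
        # previous occurrence was the left end of the current window.
        was_leftmost = old is None or old == min(last.values())
        last[w] = p
        if len(last) == k and was_leftmost:
            intervals.append((min(last.values()), p))
    return intervals, skipped


if __name__ == "__main__":
    # Hypothetical positions of three query keywords in one document.
    doc = {"search": [1, 2, 9], "engine": [4, 12], "query": [6, 7]}
    found, skipped = minimal_intervals(doc)
    # Rank the regions by size, smallest first.
    ranked = sorted(found, key=lambda iv: iv[1] - iv[0])
    print(ranked, skipped)  # [(2, 6), (4, 9), (7, 12)] 2
```

In this example two of the seven occurrences are tandem repeats, so only five occurrences go through the leftmost-keyword check. That is the kind of saving the n − (tandem repeats) term in the running time suggests, under the interpretation assumed here.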

RELATED WORKS

PROPOSED ALGORITHM

TEST RESULTS

CONCLUSION