Abstract
The efficient evaluation of top-k queries is crucial for many applications in which a huge quantity of data must be ranked and sorted to return the best answers to users in a reasonable time. Examples include e-commerce platforms (e.g., amazon.com), multimedia sharing platforms, and web databases. Most often, these applications need to retrieve data from autonomous data sources. These data sources are accessed through popular Web APIs, such as data web services, which provide a standard way to interact with data. In this context, answering a user's query often requires composing multiple data services. Most existing solutions for the evaluation of top-k queries assume that data services provide either both sorted and random access to data, or sorted access only. In practice, however, some services may provide only random access to data, which can degrade the performance of these solutions. In this paper, we propose an approach to optimize the evaluation of top-k queries over data services. We consider the worst-case scenario, in which services provide only random access to data. Our approach defines two strategies, the Pipeline Parallel Strategy and the Necessary Invocation Principle, to reduce the composition processing time and the number of unnecessary service invocations. Our experiments demonstrate the scalability and efficiency of our solution.
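To make the random-access-only setting concrete, the following is a minimal sketch of how unnecessary service invocations might be avoided. It is not the paper's algorithm: the abstract does not define the Necessary Invocation Principle, so the upper-bound pruning rule used here is an assumption, as are the function name `top_k_random_access`, the two toy services, the summed aggregate score, and the assumption that every service returns a score in [0, 1].

```python
import heapq

def top_k_random_access(candidates, services, k, max_score=1.0):
    """Illustrative top-k over random-access-only services (hypothetical).

    Each service is a callable: candidate -> score in [0, max_score].
    The aggregate score is assumed to be the sum of the service scores.
    """
    heap = []          # min-heap of (total, candidate): the current top-k
    invocations = 0    # number of service calls actually made
    for cand in candidates:
        total = 0.0
        pruned = False
        for i, service in enumerate(services):
            remaining = len(services) - i
            # Best total this candidate could still reach if every
            # remaining service returned the maximum score.
            upper_bound = total + remaining * max_score
            if len(heap) == k and upper_bound <= heap[0][0]:
                pruned = True  # cannot beat the current k-th best: stop calling
                break
            total += service(cand)
            invocations += 1
        if pruned:
            continue
        if len(heap) < k:
            heapq.heappush(heap, (total, cand))
        elif total > heap[0][0]:
            heapq.heapreplace(heap, (total, cand))
    ranked = sorted(heap, reverse=True)
    return [(cand, total) for total, cand in ranked], invocations

# Toy stand-ins for two data services (assumed data, not from the paper).
price_scores = {"a": 0.75, "b": 0.25, "c": 0.5, "d": 0.125}
rating_scores = {"a": 0.75, "b": 0.25, "c": 0.75, "d": 0.5}
services = [price_scores.get, rating_scores.get]

result, calls = top_k_random_access(["a", "b", "c", "d"], services, k=2)
print(result, calls)
```

In this toy run, candidate `d` is dropped after its first service call because even a perfect second score could not lift it into the top 2, so 7 calls are made instead of 8. The sketch is sequential; it does not attempt to illustrate the Pipeline Parallel Strategy.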