Abstract

We study the problem of sorting under incomplete information, when queries are used to resolve uncertainties. Each of n data items has an unknown value, which is known to lie in a given interval. We can pay a query cost to learn the actual value, and we may allow an error threshold in the sorting. The goal is to find a nearly-sorted permutation by performing a minimum-cost set of queries.

We show that an offline optimum query set can be found in polynomial time, and that both oblivious and adaptive problems have simple query-competitive algorithms. The query-competitiveness for the oblivious problem is n for uniform query costs, and unbounded for arbitrary costs; for the adaptive problem, the ratio is 2.

We then present a unified adaptive strategy for uniform query costs that yields the following improved results: (i) a 3/2-query-competitive randomized algorithm; (ii) a 5/3-query-competitive deterministic algorithm if the dependency graph has no 2-components after some preprocessing, which has query-competitive ratio 3/2 + O(1/k) if the components obtained have size at least k; and (iii) an exact algorithm if the intervals constitute a laminar family. The first two results have matching lower bounds, and we have a lower bound of 7/5 for large components.

We also give a randomized adaptive algorithm with query-competitive factor 1 + 4/(3√3) ≈ 1.7698 for arbitrary query costs, and we show that the 2-query-competitive deterministic adaptive algorithm can be generalized to queries returning intervals and to a more general graph problem (which is also a generalization of the vertex cover problem), by using the local ratio technique. Furthermore, we prove that the advice complexity of the adaptive problem is ⌊n/2⌋ if no error threshold is allowed, and ⌈n/3 · lg 3⌉ for the general case.

Finally, we present some graph-theoretical results regarding co-threshold tolerance graphs, and we discuss uncertainty variants of some classical interval problems.
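
To make the adaptive model concrete, below is a minimal Python sketch of a witness-pair strategy of the kind that gives a ratio-2 guarantee for uniform query costs: if two uncertainty intervals properly overlap, their relative order cannot be proved without querying at least one of them, so querying both members of any such pair costs at most twice the optimum. This is an illustrative sketch rather than the paper's algorithm; the name adaptive_sort_with_queries and the query callback are assumptions, and the error threshold and non-uniform costs are ignored.

def adaptive_sort_with_queries(intervals, query):
    """intervals: list of (low, high) pairs for the n items; query(i) returns item i's true value.
    Returns a permutation of 0..n-1 consistent with the true order of the values."""
    n = len(intervals)
    known = {}                       # index -> queried true value
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in range(i + 1, n):
                if i in known and j in known:
                    continue         # both values already revealed
                lo_i, hi_i = (known[i], known[i]) if i in known else intervals[i]
                lo_j, hi_j = (known[j], known[j]) if j in known else intervals[j]
                # witness pair: the two (open) intervals properly overlap, so the
                # relative order of i and j cannot be deduced without a query
                if lo_i < hi_j and lo_j < hi_i:
                    for k in (i, j):
                        if k not in known:
                            known[k] = query(k)   # pay one (uniform) query cost
                    changed = True
    # no overlaps remain: known values and lower endpoints determine the order
    key = lambda k: (known[k], 0) if k in known else (intervals[k][0], 1)
    return sorted(range(n), key=key)

# Example with hypothetical data: items 0 and 2 overlap, so both get queried.
values = [3.0, 1.0, 2.5]
intervals = [(2.0, 4.0), (0.0, 1.5), (2.2, 2.8)]
print(adaptive_sort_with_queries(intervals, lambda i: values[i]))   # [1, 2, 0]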

Highlights

  • Sorting is one of the most fundamental problems in computer science and an essential part of any system dealing with large amounts of data

  • We show that the advice complexity of the adaptive problem is ⌊n/2⌋ if no error threshold is allowed, and ⌈n/3 · lg 3⌉ for the general case

  • High-performance algorithms such as QuickSort [19] have been known for decades, but the demand for fast sorting of huge amounts of data is such that improvements in sorting algorithms are still an active area of research; see, e.g., [26]

Summary

Introduction

Sorting is one of the most fundamental problems in computer science and an essential part of any system dealing with large amounts of data. In the offline (verification) version of the problem, we are given the actual data values and want to identify a minimum-cost set of queries that would be sufficient to prove that the solution is correct. Solving this problem is useful, for example, for the experimental evaluation of online algorithms, since it yields the offline optimum solution of the uncertainty problem. The first work to investigate the minimum number of queries needed to solve a problem is by Kahan [20], who gave optimal oblivious strategies for finding the minimum, maximum and median of n values lying in uncertainty intervals. (The problem has a lower bound of 1.5 for randomized algorithms.) Subsequent work considered non-uniform query costs, proved that the results can be extended to finding a minimum-weight base of a matroid, and showed that an optimum query set and the actual value of the minimum spanning tree can be computed in polynomial time.
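
To make the offline problem concrete, here is a brute-force Python sketch; the paper computes the optimum in polynomial time, and this exhaustive version only illustrates the definition. A query set is feasible if, once the queried items are pinned to their true values, the remaining intervals are pairwise non-overlapping, so the sorted order is provably correct. The names offline_optimum_queries and feasible, and the non-overlap test, are assumptions of this sketch; the error threshold is ignored.

from itertools import combinations

def offline_optimum_queries(intervals, values, costs):
    """Return a cheapest set of items to query, together with its cost (exhaustive search)."""
    n = len(intervals)

    def feasible(queried):
        spans = [(values[i], values[i]) if i in queried else intervals[i]
                 for i in range(n)]
        for (lo1, hi1), (lo2, hi2) in combinations(spans, 2):
            if lo1 < hi2 and lo2 < hi1:      # proper overlap: order not provable
                return False
        return True

    best, best_cost = None, float("inf")
    for r in range(n + 1):                   # try subsets in increasing size
        for queried in combinations(range(n), r):
            q = set(queried)
            c = sum(costs[i] for i in q)
            if c < best_cost and feasible(q):
                best, best_cost = q, c
    return best, best_cost

# Example with hypothetical data: querying item 0 pins it to 3.5, which lies
# outside (1, 3), so one query already proves the order.
print(offline_optimum_queries([(0, 4), (1, 3)], [3.5, 2], [1, 1]))   # ({0}, 1)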

Sorting with Uncertainty
Warm-Up
Deterministic Adaptive Algorithms
Improved Adaptive Algorithms for Uniform Query Costs
Advice Complexity for Adaptive Algorithms
Future Work Directions