Abstract

We consider versions of the FIND algorithm where the pivot element used is the median of a subset chosen uniformly at random from the data. For the median selection we assume that subsamples of size asymptotic to $c \cdot n^\alpha$ are chosen, where $0 < \alpha \leq \frac{1}{2}$, $c > 0$ and $n$ is the size of the data set to be split. We consider the complexity of FIND as a process in the rank to be selected and measured by the number of key comparisons required. After normalization we show weak convergence of the complexity to a centered Gaussian process as $n \to \infty$, which depends on $\alpha$. The proof relies on a contraction argument for probability distributions on càdlàg functions. We also identify the covariance function of the Gaussian limit process and discuss path and tail properties.

Highlights

  • After normalization we show weak convergence of the complexity to a centered Gaussian process as n → ∞, which depends only on α

  • The FIND algorithm is a selection algorithm, called Quickselect, to find an element of given rank in a set S of data, where the data set S is a subset of finite cardinality |S| of some ordered set

  • By induction we find that $(Z_n)_{n \geq 0}$ is a sequence of centered Gaussian processes

Summary

Introduction

The FIND algorithm, also called Quickselect, is a selection algorithm that finds an element of given rank in a set S of data, where S is a subset of finite cardinality |S| of some ordered set. Martínez and Roura [35] give an average-case analysis in which they discuss optimal choices for the trade-off between better-balanced sublists and the additional cost of the median selection. Note that another way to adapt the FIND algorithm is not to choose the median of a subsample but to choose an element that may depend on the rank searched for, so that the sublist on which the algorithm is recursively called may be small. A similar version of the Quicksort algorithm chooses the pivot element in each step as a median of a random subsample of size $k = k(n) \sim c n^\alpha$, with $n$ the size of the list to be split. We conjecture that such a Quicksort algorithm admits a Gaussian limiting distribution for the normalized number of key comparisons. The second, Lemma 4.3, is needed in the study of the path variation of the limit process Z.
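The pivot rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `find_with_sampled_median` and the parameters `c`, `alpha`, and `rng` are hypothetical, keys are assumed to be distinct, the subsample size is taken as `round(c * n**alpha)` clamped to the range 1 to n, and only the n - 1 comparisons of each partitioning step are counted (the cost of finding the median inside the subsample is ignored here as a simplification).

```python
import random


def find_with_sampled_median(data, rank, c=1.0, alpha=0.5, rng=None):
    """Return (element of the given 1-based rank, partitioning comparisons).

    Illustrative sketch only: assumes distinct keys and counts n - 1 key
    comparisons per partitioning step; comparisons spent on selecting the
    median of the subsample are NOT counted (a simplifying assumption).
    """
    items = list(data)
    if not 1 <= rank <= len(items):
        raise ValueError("rank must satisfy 1 <= rank <= len(data)")
    rng = rng or random.Random()
    comparisons = 0
    while True:
        n = len(items)
        if n == 1:
            return items[0], comparisons
        # Subsample size k ~ c * n^alpha, clamped to 1..n (assumed rounding).
        k = max(1, min(n, round(c * n ** alpha)))
        sample = rng.sample(items, k)
        # Pivot is the (lower) median of the uniformly chosen subsample.
        pivot = sorted(sample)[(k - 1) // 2]
        smaller = [x for x in items if x < pivot]
        larger = [x for x in items if x > pivot]
        comparisons += n - 1  # pivot compared with every other key
        if rank <= len(smaller):
            items = smaller
        elif rank == len(smaller) + 1:
            return pivot, comparisons
        else:
            rank -= len(smaller) + 1
            items = larger
```

Calling the function for each rank from 1 to n on the same input gives one sample path of the comparison count as a function of the rank, which is the kind of process whose normalized limit the paper studies.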

Construction
Characterization of the limit process
Analysis of the Quickselect process
Preliminaries
Further properties of the limit process
The supremum of the limit process
Variation of paths
Binary topology and path continuity