Abstract

Let P be a point set in ℝ d , and let M be a function that maps any subset of P to a positive real. We examine the problem of computing the mean and variance of M when a subset in P is selected according to a random distribution. We consider two distributions; in the first distribution (the Bernoulli distribution), each point p in P is included in the random subset independently, with probability π( p ). In the second distribution (the fixed-size distribution), exactly s points are selected uniformly at random among all possible subsets of s points in P . We present efficient algorithms for computing the mean and variance of several geometric measures when point sets are selected under one of the described random distributions. We also implemented four of those algorithms: an algorithm that computes the mean 2D bounding box volume in the Bernoulli distribution, an algorithm for the mean 2D convex hull area in the fixed-size distribution, an algorithm that computes the exact mean and variance of the mean pairwise distance (MPD) for d -dimensional point sets in the fixed-size distribution, and an (1 − ε)-approximation algorithm for the same measure. We conducted experiments where we compared the performance of our implementations with a standard heuristic approach, and we show that our implementations are very efficient. We also compared the implementation of our exact MPD algorithm and the (1 − ε)-approximation algorithm; the approximation method performs faster on real-world datasets for point sets of up to 13 dimensions, and provides high-precision approximations.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call