Tuesday, April 19, 2011

POTD: Witnessed k-distance

Distance function to a compact set plays a central role in several areas of computational geometry. Methods that rely on it are robust to the perturbations of the data by the Hausdorff noise, but fail in the presence of outliers. The recently introduced distance to a measure offers a solution by extending the distance function framework to reasoning about the geometry of probability measures, while maintaining theoretical guarantees about the quality of the inferred information. A combinatorial explosion hinders working with distance to a measure as an ordinary (power) distance function. In this paper, we analyze an approximation scheme that keeps the representation linear in the size of the input, while maintaining the guarantees on the inference quality close to those for the exact (but costly) representation.

The idea of defining distances between (or to) shapes robustly by replacing the shape by a distribution is near to my heart (see this paper for more). The authors provide an algorithmic perspective on a formulation first proposed by Chazal, Cohen-Steiner and Mérigot. The idea is that instead of constructing a shape from a point cloud by growing balls of fixed radius and measuring the distance to this union of balls, one constructs a measure by growing balls out to a fixed measure, and then defining a distance. The distance measure has the nice property of being Lipschitz-related to the EMD, making it stable. In a manner reminiscient of the Aurenhammer et al work, they relate this distance to a power diagram construction, and then design efficient approximations for it (because the exact version involves terms exponential in the thresholding parameter).

It seems to me that they're going to a lot of work to recover from what I think is a tactical problem: fixing a sharp threshold for the ball measure. It might be interesting to explore alternate ways of defining the measure that are "smoother" - maybe using a kernel instead of a hard measure boundary. It might yield an alternate measure that serves the same purpose but is easier to compute.

No comments:

Post a Comment

Disqus for The Geomblog