Thursday, October 28, 2010

FOCS Day 2: (much delayed)

I had to return to Salt Lake the evening of day 2, so blogging got left by the wayside. Piotr is to blame for this post :).

As usual, by day 2 (day 3 for me because of the tutorials day), my desire to attend talks far outstripped my ability to attend them, and I especially wanted to hear more about the Kumar/Kannan clustering with spectral norm paper, and the stability for clustering result of Awasthi, Blum and Sheffet. If I ever have control of setting the program for a conference, I'm going to decree that all talks start no earlier than 11am (they can go late in the evening, for all I care).

The first talk that I attended was by Alex Andoni on the new polylog approximation for the edit distance that he has with Robert Krauthgamer and Krysztof Onak. Dick Lipton actually spends a good deal of time on the problem and background in this post, and also highlights their earlier paper that gave a sub-polynomial appproximation in subquadratic time.

This paper goes one better, producing an algorithm that runs in time $O(n^{1+\epsilon})$ while yielding an $\log^{1/\epsilon}$ approximation. A key idea in this result is a very neat trick: rather than treating the strings x,y symmetrically, the algorithm first compresses x down to a much smaller string (of length n^\epsilon), and then runs a crude exact algorithm on this compressed string and y.

This compression is randomized: the algorithm subsamples small pieces of x. However, a straightforward sampling doesn't preserve the distance, so the algorithm actually starts by designing a careful hierarchical partitioning of x which induces a modified distance function (that they relate to the edit distance). Then the sampling takes place in this tree. Obviously I'm skipping many of the details that I didn't get, but Alex gave a very nice talk that laid out the hig level picture (his talks are always a model of clarity and intuition).

Another talk  that I found intriguing was the work by Saks and Seshadri on computing the longest increasing subsequence of a string. While there's a straightforward dynamic program for this problem, they showed that the length of the sequence could be estimated in "sublinear" time (in the size of the program). In fact, their technique is quite general, and might lead to a general method for approximating the output of a dynamic program in time sublinear in the size of the table. A key structural property that they develop is that either the dynamic program can easily be broken up into two balanced subrectangles, or it can be subsampled O(log n) times, yielding smaller copies that can be combined to find a solution (it's possible I'm misrepresenting the last bit). Sesh gave an excellent talk, but his taste in talk jokes was so terrible it was actually funny :) (sorry, Sesh!)

I deeply regret having to leave before the evening plenary session, and especially regret missing Dan Spielman's talk, which by all accounts was magnificient. All the talks were videotaped, so they'll eventually be online and I can watch it then.

No comments:

Post a Comment

Disqus for The Geomblog