Tuesday, July 22, 2008

The concept of an approximation

We're always searching for new ways to communicate cool ideas in theoryCS to the outside world, and the list gets more and more esoteric each time one thinks about it. But in my interactions with people, people generally unfamiliar with algorithm, and theoretical computer science in general, I'm constantly amazed at how counter-intuitive the very notion of an approximation appears to be.

A typical conversation goes something like this:

Person: Oh this problem is NP-hard so we ran this heuristic
Me: Have you tried approximating it ?
P: What do you mean ? Our heuristic seems to be reasonable.
M: I mean an algorithm that guarantees that it'll get close to optimal.
P: But how can you know that ? It's hard to find the optimal solution, global minimum, yadda yadda yadda...
M: We can often prove an answer is close to OPT without knowing OPT itself.
P: [total silence]

This kind of thing happens often in research conversations, as well as in the classroom (where it's less surprising I guess). In fact, one person was so amazed at the idea that even proving an approximation could be NP-hard that we spent a good semester plowing through the PCP theorem (at least parts of it).

I'm not dissing heuristics: it's often the case that either the approximation guarantees one can show are too weak, or that the claimed bounds don't match the empirical performance. But I do think it's important to at least ask the question of whether provable approximations exist, much in the way people in CS are now reasonable comfortable checking (or asking the resident theoretician !) if a problem is NP-hard.

But you can't ask till you know. This is one of the main reasons I teach a smattering of approximations in my graduate algorithms class (and why I teach some randomization as well). The concept of an approximation (especially when you can't find the actual answer) is a surprisingly slippery conceptual leap, and requires mental agility that goes beyond the standard "algorithms toolkit" that focuses on issues of performance.

Disqus for The Geomblog