(Note to graduate students everywhere: when someone tells you that no question is a stupid question, don't believe it.)

But after attending Philippe Flajolet's talk today on "Analytic Combinatorics", and after hearing Luc Devroye's talk yesterday, I'm not so sure that my question was off the mark.

A bit of background. What we refer to as "the analysis of algorithms" is usually associated with Don Knuth and The Art of Computer Programming. It referred to the methods initially developed to analyze structures like hash tables, search trees, and the like. Most computer scientists have taken some kind of discrete math class and have seen the Graham-Knuth-Patashnik "Concrete Mathematics" textbook, and much of its content (basic combinatorics, recurrence relations, generating functions, etc.) was used in early algorithm analysis.

These methods were quite precise. It's not uncommon to look back at papers from the 70s and see detailed constants in front of running times for algorithms; Bob Sedgewick mentioned one such calculation in his introduction to Flajolet's talk. People didn't use O() notation like a sledgehammer, the way we do today.

Over time, we became more comfortable with O() notation; algorithms became more sophisticated, and it became harder to figure out actual constants. It didn't seem to matter as much. After all, when you were coming up with elaborately complicated algorithms that ran in exotic times like O(n^(11/7)), it hardly seemed to matter what the constant was. This was, and has continued to be, the Age of Design.

But all along, with people like Flajolet, Knuth, Sedgewick, and many, many others, the work of really analyzing algorithms continued on. ANALCO is an offshoot of this long effort: a way to attract people working on some of the genuinely hard problems in this area, while creating some cross-fertilization with the design crowd at SODA. Of course, by no means do I claim that algorithm designers don't analyze; of course we do. But it's fair to say that the sophisticated analysis methods and the sophisticated design methods do appear to have diverged.

Which brings us to the talk today. The rough theme of Flajolet's talk was an overview of how the combinatorial problem of counting structures (trees, steps in linear probing, etc.) can be transformed into a generating function, to which the methods of real and, more recently, complex analysis can then be applied. There's some pretty heavy mathematical machinery being thrown out here: we saw large deviation theory in yesterday's talk, for example, and there are things Flajolet talked about that I have only the barest inkling of.
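To give a flavor of that pipeline (count structures → generating function → coefficient asymptotics), here is a minimal sketch, not from the talk itself, using the textbook example of binary trees: their generating function satisfies B(x) = 1 + x·B(x)², its coefficients are the Catalan numbers, and singularity analysis (the dominant singularity of B sits at x = 1/4) gives the estimate C_n ~ 4^n / (√π · n^(3/2)).

```python
from math import comb, pi, sqrt

def catalan(n):
    """Exact n-th Catalan number: the coefficient [x^n] of B(x), where
    B(x) = 1 + x*B(x)**2 counts binary trees with n internal nodes."""
    return comb(2 * n, n) // (n + 1)

def catalan_estimate(n):
    """First-order estimate from singularity analysis: the dominant
    singularity of B(x) at x = 1/4 gives C_n ~ 4**n / (sqrt(pi) * n**1.5)."""
    return 4 ** n / (sqrt(pi) * n ** 1.5)

# The exact/estimate ratio approaches 1 as n grows (the error is O(1/n)),
# which is the kind of precise statement this machinery delivers.
for n in (10, 50, 100):
    print(n, catalan(n) / catalan_estimate(n))
```

(The point of the exercise: the analysis yields not just an order of growth but the leading constant, which is exactly the precision the design crowd usually waves away with O() notation. `math.comb` needs Python 3.8+.)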

Doing such analysis is hard; it's not as if we're suddenly going to abandon O() notation. But, as Piotr Indyk pointed out when we discussed this later, computers aren't getting any faster, and data is getting larger and larger, and it's more and more true that the actual constants in front of a running time matter, sometimes even more than the asymptotic bound. If more sophisticated analysis methods allow us to reveal algorithm running times more transparently, this also helps repair some of the "bad press" theoreticians can get with more applied folk.

So the analysis of algorithms takes on its original meaning again; there is a conference as well, now in its 13th year, and there's an upcoming book by Flajolet and Sedgewick that covers much of the mathematics Flajolet refers to in his talk. I looked at it briefly (it's 753 pages and counting!), and I hope that when it does come out, we learn more about how to use methods from analytic combinatorics to improve analysis techniques for even our run-of-the-mill algorithms.

Outtakes:

- I've quite enjoyed the talks I've attended thus far. I haven't written much about them, but that's mostly due to laziness on my part. I've been quite torn having to navigate the multiple parallel sessions; human cloning, where art thou?
- [From a random research paper session] It's funny to see a speaker struggling with their desire to display EVERY SINGLE SLIDE they made when faced with a remorseless clock ticking down to zero. Word of advice: no one really cares if you go through all your slides, or even flip through them frantically while muttering very fast. They do care if you go over time and drag things ON and ON and ON.
- Contrary to the general confusion being spread around, the wireless DOES work and it IS free.
- I don't like hotels with two towers; especially when I'm in one and the conference is in the other, and ESPECIALLY when the only connection between the two towers is at the lobby.

This doesn't make any sense. You could call me an applied theorist; I read theory papers and do some theory but also write a lot of code. I work with large datasets.

First, there is no evidence at all that computers have stopped getting faster. Even if individual components don't get faster, their cost drops, so you can compute massively in parallel, the way Google does.

Second, as data sizes get larger, the input size parameter N increases, and so the difference between, say, quadratic and log-linear running times becomes greater and greater, making the constants *less* important.

If theory gets bad press, it is for things like the obsession with "polynomial time = efficient". My pet peeve is that nontrivial models for the input distribution of the data are never considered; only the two extremes of worst-case and uniformly random are ever talked about. Yet real-life data is always somewhere in between, and you need to have a model for it. Theory has the tools to do this: for instance, "a data stream generated by a small-space source." Yet such modeling is rarely done.

The Google model of computation doesn't extend to everything, though: if I have a small system running experiments on some data sets, I can't use the Google model to deal with my data; I still have to work with my one or two machines and do what I can with them.

It is indeed true that as sizes get larger, asymptotic bounds become *more* relevant. But at the same time, it is also true that the difference between 2N and 150N becomes important, especially for streaming problems, where *anything* superlinear is hopeless.
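Both effects show up at once in a toy calculation (the constants here are purely illustrative, echoing the 2N-versus-150N comparison, not measurements of any real algorithm):

```python
def quadratic(n):
    """Step count for a hypothetical quadratic algorithm: n^2 steps."""
    return n * n

def slow_linear(n):
    """A hypothetical linear algorithm with a big constant: 150 n steps."""
    return 150 * n

def fast_linear(n):
    """A hypothetical linear algorithm with a small constant: 2 n steps."""
    return 2 * n

for n in (10**3, 10**6, 10**9):
    # As n grows, the quadratic algorithm falls ever further behind even
    # the slow linear one: the asymptotic gap eventually swamps any constant...
    print(n, quadratic(n) / slow_linear(n))
    # ...but between the two linear algorithms the gap is a fixed factor of 75
    # at every n, which is exactly what you feel in a streaming setting where
    # anything superlinear is off the table to begin with.
    print(n, slow_linear(n) / fast_linear(n))
```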

In my view, the whole 'polytime=efficient' is completely misunderstood, but that's a story for another day. Maybe we do need better models for real data, but where might they come from? A small-space source is not such a well-understood entity, is it? We don't even know if L is strictly contained in P!

"In my view, the whole 'polytime=efficient' is completely misunderstood, but that's a story for another day."

I'd love to hear about it. I've seen the standard arguments in its defense from theory people, and I'm not really convinced.

"Maybe we do need better models for real data, but where might they come from? A small-space source is not such a well-understood entity, is it?"

That's precisely my point: there needs to be more work done in understanding them. For instance, there was this recent paper, "Optimal error correction against computationally bounded noise" (http://theory.lcs.mit.edu/~cpeikert/pubs/mpsw.pdf), based on the idea that the universe is a computer, and that this includes noise. I really feel that looking at space-bounded noise is as important as computationally bounded noise, and I was looking into this with a couple of other people; I still have a two-page draft of ideas buried under a mountain of other stuff. But really, proper theory people should be looking at this.

"We don't even know if L is strictly contained in P!"

A complete red herring. We don't know if P is strictly contained in NP either, but that didn't stop anyone.

P.S. Blogger has the worst commenting system ever. In particular, I have no way to be notified of responses, I hate captchas, and the text box is too small.

BTW, I think the Micali et al. paper is an excellent one; I have nothing against it. I'm just pointing out that there are other things in the world besides polynomial time.
