Ruminations on computational geometry, algorithms, theoretical computer science and life
Friday, March 31, 2006
SWAT 06
Thursday, March 30, 2006
Holy Joe Namath !!
Those who know me know that I am a rabid NFL fan.
Put these two facts together, and stir with a dose of this:
NY Jets moving HQ to Florham Park
Franchise will move its headquarters and training facility to the former Exxon complex
BY ROB JENNINGS
DAILY RECORD
FLORHAM PARK -- The New York Jets will be building their new headquarters and training facility in Florham Park, Mayor Frank D. Tinari said today.
Under the agreement, the New Jersey Sports and Exposition Authority will lease to the Jets 20 acres at the old Exxon site that it will be acquiring for $20 million, Tinari said.
The plan still awaits council and planning board approval, but Tinari was optimistic that the franchise defined by "Broadway" Joe Namath, Wesley Walker and Chad Pennington would soon be making its home in the borough.
2006 Gödel Prize
Wednesday, March 29, 2006
Grand Challenges ?
I've been thinking about this topic mainly because I started asking myself, "What are the important questions in computational geometry?" Given all the TheoryMatters discussions of late, this question mutated into the (not-unrelated) question, "What are some of the grand challenges in computational geometry?" Richard Hamming's advice from "You and Your Research" comes to mind:

If you do not work on an important problem, it’s unlikely you’ll do important work. It’s perfectly obvious. Great scientists have thought through, in a careful way, a number of important problems in their field, and they keep an eye on wondering how to attack them. Let me warn you, ‘important problem’ must be phrased carefully. The three outstanding problems in physics, in a certain sense, were never worked on while I was at Bell Labs. By important I mean guaranteed a Nobel Prize and any sum of money you want to mention. We didn’t work on (1) time travel, (2) teleportation, and (3) antigravity. They are not important problems because we do not have an attack. It’s not the consequence that makes a problem important, it is that you have a reasonable attack. That is what makes a problem important. When I say that most scientists don’t work on important problems, I mean it in that sense. The average scientist, so far as I can make out, spends almost all his time working on problems which he believes will not be important and he also doesn’t believe that they will lead to important problems.
It's tricky to answer this question properly. It is quite easy to dash off a number of open problems in CG, but that is not really the point of the second question. It's more about where the field is headed, and where some of the most interesting new problem areas are opening up.
Five or so years ago, the answer to this question might have been 'High Dimensional Approximate Geometry'. The core-set revolution has created a new set of techniques for dealing with approximate high-dimensional geometry, and this has fed off developments in the theory of metric embeddings. I don't doubt that there is more work to be done here, but if we have to answer this question from the "funding perspective", i.e., in terms of where geometry will be used next, I keep coming back to the issue of shape.
Some of the most important questions in biology over the coming years will involve the structural analysis of proteins, and shape modelling is a key aspect of this. The work by Edelsbrunner and Mücke on alpha shapes connected the notion of shape to deeper ideas in combinatorial topology (the mathematical theory of shape, in one view), and the growing field of "computational topology" has been very active within the CG community of late.
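For the curious, here is a minimal sketch of an alpha shape computation in the plane. It uses a common simplification of the Edelsbrunner-Mücke construction (keep the Delaunay triangles whose circumradius is at most alpha; the boundary is the set of edges used by exactly one kept triangle); the function name and the use of scipy are my own illustration, not anything from the original work:

```python
import numpy as np
from scipy.spatial import Delaunay

def alpha_shape_edges(points, alpha):
    """Boundary edges of a 2D alpha shape, in simplified form: keep
    Delaunay triangles with circumradius <= alpha; the boundary is the
    set of edges that belong to exactly one kept triangle."""
    points = np.asarray(points, dtype=float)
    tri = Delaunay(points)
    edge_count = {}
    for s in tri.simplices:
        a, b, c = points[s]
        # triangle area via a determinant; circumradius R = abc / (4 * area)
        area = 0.5 * abs((b[0] - a[0]) * (c[1] - a[1])
                         - (b[1] - a[1]) * (c[0] - a[0]))
        if area < 1e-12:
            continue  # skip degenerate (collinear) triangles
        r = (np.linalg.norm(a - b) * np.linalg.norm(b - c)
             * np.linalg.norm(c - a)) / (4 * area)
        if r <= alpha:
            for e in [(s[0], s[1]), (s[1], s[2]), (s[2], s[0])]:
                key = tuple(sorted(e))
                edge_count[key] = edge_count.get(key, 0) + 1
    return [e for e, k in edge_count.items() if k == 1]
```

As alpha grows, the output sweeps from a scattering of tiny fragments up to the boundary of the convex hull, which is the sense in which alpha shapes capture "shape at a scale".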
I'm curious as to whether there are other ideas on what areas are likely to push CG forward over the next five to ten years.
Monday, March 27, 2006
Stanislaw Lem has passed...
Monday, March 20, 2006
Job posting.
AT&T Labs - Research has an opening in our Internet and Network Systems Research Center. One of our needs is in the general area of algorithms, data structures, optimization, and discrete mathematics, and we are looking for active PhD researchers in these areas who also want to have an impact on the real world. To see the researchers we currently have in these areas, go to
http://www.research.att.com/~dsj/algs/people.php
Interested potential candidates should upload their resume to our website
http://www.research.att.com/academic
and let David Johnson (dsj AT research DOT att DOT com) know that they have done so.
AT&T is an Equal Opportunity Employer.
Friday, March 03, 2006
It's...
Many methods have been developed in the past—see, for example, the comprehensive book by Devroye [1986]—and methods are usually given names to identify them. I [Marsaglia] called this the Monty Python method when it was developed, some 20 years ago, because the opening graphics on the British television show Monty Python’s Flying Circus resembled the essential element of the method. The zany Monty Python crew pictured a stylized head with a hinged top that folded open, with all kinds of silliness pouring out. The Monty Python Method has an analogous hinged top that is folded over to suggest, among other things, an interesting way to generate a random variable. You may judge for yourself its silliness.
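Out of curiosity, here is the folding idea in its simplest possible incarnation: a sketch of my own for the triangular density f(x) = 2(1 - x) on [0, 1], rather than Marsaglia's actual normal-variate version (which needs a vertical stretch of the cap and a separate procedure for the tail). The "hinged top" is literal: the empty corner of the unit square folds over exactly onto the part of the density that sticks out above it.

```python
import random

def monty_python_triangular():
    """Exact sampler for the density f(x) = 2(1 - x) on [0, 1] via the
    folding trick. Pick a uniform point in the unit square (which has
    the same area, 1, as the region under f). If the point lies under
    the density, keep its x-coordinate. Otherwise it lies in the 'hole'
    above the density, and a rigid 180-degree rotation about the point
    (1/2, 1) carries that hole exactly onto the cap of f poking out
    above the square -- so returning 1 - x is a legitimate sample from
    the cap. One uniform pair, no rejection, never a wasted variate."""
    x, y = random.random(), random.random()
    return x if y < 2 * (1 - x) else 1 - x

# sanity check: the sample mean should approach E[X] = 1/3
print(sum(monty_python_triangular() for _ in range(100_000)) / 100_000)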
Adversaries
The first core contribution of theoretical computer science is a robust formal definition of efficiency, or tractability. In fact, the idea of a "reasonable" computing model works very well in cryptography, where ever since the days of RSA, the game has been to prevent attacks by a computationally limited adversary, rather than an all-powerful one (which would be much harder).
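As a toy illustration of this (with hypothetical, hopelessly insecure textbook-sized parameters of my own choosing): in the sketch below, the honest parties only ever do cheap modular exponentiation, while the adversary's obvious attack, factoring n, takes time exponential in the bit length of n. That asymmetry is exactly what "security against a limited adversary" means.

```python
# Toy RSA with tiny, insecure parameters -- illustration only.
p, q = 61, 53                # secret primes (toy-sized)
n, phi = p * q, (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (Python 3.8+ modular inverse)

m = 42                       # a message, encoded as a number < n
c = pow(m, e, n)             # encrypt: cheap for the honest party
assert pow(c, d, n) == m     # decrypt: also cheap, given the secret d

def factor(n):
    """The bounded adversary's brute-force attack: trial division.
    Feasible at toy sizes, hopeless at 2048 bits."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    return None

print(factor(n))             # (53, 61): easy here, and only here
```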
Which brings up a second core contribution: the idea of worst-case performance, and an adversary that can do anything. One thing that I've often noticed when working with people in other areas is that they are surprised when I can prove that such-and-such technique is optimal, or approximately optimal, or efficient. The power of worst-case analysis is in the ability to make such sweeping guarantees, and it's so ingrained into our psyche that it can be surprising to have someone else note this explicitly.
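A standard example of such a sweeping guarantee (my choice of illustration): the greedy matching heuristic for vertex cover returns a cover at most twice the optimal size on every input, no matter what the adversary constructs.

```python
def vertex_cover_2approx(edges):
    """Greedy maximal matching gives a worst-case 2-approximation for
    vertex cover: the edges we pick are pairwise vertex-disjoint, so any
    cover (even the optimal one) needs at least one endpoint per picked
    edge. We take both endpoints, hence at most 2 * OPT vertices."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover

# works on any input the adversary cares to construct:
print(vertex_cover_2approx([(1, 2), (2, 3), (3, 4), (4, 1)]))
```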
But worst-case analysis has its limits, and David alludes to this with his question on why anyone would ever assume that an adversary was not worst-case:
This assumption makes fields that assume some kind of distribution on inputs or on errors seem extremely strange. Electrical engineering does this in the context of coding theory, and it's also common in statistical learning theory. These fields have enormous practical impact and relevance, which I acknowledge and appreciate. I just have a hard time with the shock of "well, why do we believe errors must be distributed like that?"

In fields like game theory and cryptography, your adversary is assumed malicious; this is often true, especially of code hackers, and is a good modelling assumption (think "selfish agents" and the like). Theorems proved here are "pure": you start off with a set of assumptions about your world, and the results have guarantees within this world.
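A quick simulation (my own, for illustration) shows why the distributional assumption matters in the coding theory setting: an 11-fold repetition code with majority decoding almost always survives random noise that flips each bit independently with probability 0.1, whereas an adversary with the same overall flip budget could concentrate its flips and defeat it.

```python
import random

def bsc(bits, p):
    """Binary symmetric channel: flip each bit independently with prob p."""
    return [b ^ (random.random() < p) for b in bits]

def send(bit, p, reps=11):
    """Repetition code with majority-vote decoding."""
    received = bsc([bit] * reps, p)
    return int(sum(received) > reps // 2)

# against stochastic noise at p = 0.1, decoding almost always succeeds:
trials = 100_000
failures = sum(send(1, 0.1) != 1 for _ in range(trials))
print(failures / trials)  # roughly 3e-4
# an adversary with the same overall budget (10% of all bits) could instead
# spend 6 flips per targeted codeword and corrupt ~18% of the stream.
```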
But when you start getting into areas like machine learning, data mining, and data modelling in general, your "adversary" is Nature, or some unknown indifferent process that generates data. Time and again we see what we would formally think of as heuristics work very well "in practice" and often the explanation is that the algorithm is able to exploit some property of the data (something we can then prove after the fact). Whether it's expectation-maximization, its cousin k-means, ICP, or even the simplex method, we see the same story over and over again.
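Since k-means comes up: a minimal numpy sketch of Lloyd's algorithm (my own illustration) shows just how simple the heuristic is. Its worst-case running time and solution quality are bad, yet on genuinely clustered data a few iterations typically suffice.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Lloyd's algorithm: a textbook heuristic whose worst-case behavior
    is poor, but which converges quickly on genuinely clustered data."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: each center moves to the mean of its points
        new_centers = np.array([X[labels == j].mean(axis=0)
                                if np.any(labels == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```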
Worst-case analysis, applied straight up, does not help us in these domains: it often provides bounds that are weak, and algorithms that are ineffective on many kinds of data sets. To use formal methods here, you need to dig deeper into the problem and try to understand the data sources themselves; often you then uncover hidden structure that can be exploited formally.
I can answer David's question by going all the way back to Gauss. According to Leonard Mlodinow, it was while overseeing a surveying project that Gauss discovered that the errors in measurement followed a bell curve (hence, Gaussian distributions). The central limit theorem tells us that sums of many independent, identically distributed errors end up looking Gaussian. Many natural processes also generate power-law or log-normal distributions. The point is that if you're dealing with nature, then more often than not you're dealing with data that is stochastic rather than adversarial, and modelling it as such gives you a better handle on algorithm design.
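A tiny simulation (my own illustration) makes the central limit theorem concrete: each individual "error" below is uniform, nothing like a bell curve, but their sums are very nearly Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)
# each "error" is uniform on [-1, 1]: flat, nothing like a bell curve
errors = rng.uniform(-1, 1, size=(100_000, 30))
sums = errors.sum(axis=1)  # sum of 30 i.i.d. errors per trial
# the sums are close to N(0, 30 * 1/3): mean ~0, std ~ sqrt(10) ~ 3.16
print(sums.mean(), sums.std())
```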