Ruminations on computational geometry, algorithms, theoretical computer science and life

## Friday, September 30, 2005

### Fall Workshop on Computational Geometry, 2005

For the first time, the workshop will be held at the University of Pennsylvania, in Philadelphia. Submission deadline for 2 page abstracts is Oct 19, with results announced a week later. As always, we encourage submissions for work in progress, being submitted, already published; the workshop has no formal proceedings, and the participatory atmosphere is one of the best things I have liked about workshops past.

Reactions: |

## Wednesday, September 28, 2005

### Somewhere, Edsger Dijkstra is smiling:

Old-school computer science called for methodical coding practices to ensure that the large computers used by banks, governments and scientists wouldn't break. But as personal computers took off in the 1980s, companies like Microsoft didn't have time for that. PC users wanted cool and useful features quickly. They tolerated -- or didn't notice -- the bugs riddling the software. Problems could always be patched over. With each patch and enhancement, it became harder to strap new features onto the software since new code could affect everything else in unpredictable ways.The UNIX philosophy makes sense:

A newcomer to the Windows group, Mr. Srivastava had his team draw up a map of how Windows' pieces fit together. It was 8 feet tall and 11 feet wide and looked like a haphazard train map with hundreds of tracks crisscrossing each other.

That was just the opposite of how Microsoft's new rivals worked. Google and others developed test versions of software and shipped them over the Internet. The best of the programs from rivals were like Lego blocks -- they had a single function and were designed to be connected onto a larger whole. Google and even Microsoft's own MSN online unit could quickly respond to changes in the way people used their PCs and the Web by adding incremental improvements.

Reactions: |

### Responses to journal referee reports:

And now:Dear Sir,

We (Mr. Rosen and I) had sent you our manuscript for publication and had not authorized you to show it to specialists before it is printed. I see no reason to address the—in any case erroneous—comments of your anonymous expert. On the basis of this incident I prefer to publish the paper elsewhere.

Respectfully,P.S. Mr. Rosen, who has left for the Soviet Union, has authorized me to represent him in this matter.

Dear Sir, Madame, or Other:Enclosed is our latest version of Ms # 85-02-22-RRRRR, that is, the re-re-re-revised revision of our paper. Choke on it. We have again rewritten the entire manuscript from start to finish. We even changed the goddamn running head! Hopefully we have suffered enough by now to satisfy even you and your bloodthirsty reviewers.

I shall skip the usual point-by-point description of every single change we made in response to the critiques. After all, it is fairly clear that your reviewers are less interested in details of scientific procedure than in working out their personality problems and sexual frustrations by seeking some kind of demented glee in the sadistic and arbitrary exercise of tyrannical power over helpless authors like ourselves who happen to fall into their clutches. We do understand that, in view of the misanthropic psychopaths you have on your editorial board, you need to keep sending them papers, for if they weren't reviewing manuscripts they'd probably be out mugging old ladies or clubbing baby seals to death. Still, from this batch of reviewers, C was clearly the most hostile, and we request that you not ask him or her to review this revision. Indeed, we have mailed letter bombs to four or five people we suspected of being reviewer C, so if you send the manuscript back to them the review process could be unduly delayed.

[.....]

Reactions: |

## Monday, September 26, 2005

### Dickson Prize for David Haussler

He was just awarded the Dickson Prize by CMU.

Reactions: |

## Tuesday, September 20, 2005

### Jon Kleinberg, MacArthur Fellow

Reactions: |

## Sunday, September 18, 2005

### Holes in Sets: The Conclusion

For a fixed k, does there exist an n = n(k) such that any point set in the plane of size at least n(k) contains an empty convex polygon (a "hole") of size k ?Horton showed that no such n exists for k = 7. Specifically, for any n, there exists a set of n points in the plane that had no empty convex heptagon. Conversely, Bárány and Füredi showed that for k = 5, for sufficiently large n, any point set of size n had an empty convex pentagon.

For k = 6, the matter was unresolved, until just recently. This week's combinatorics seminar at NYU is by Andres Varon, who announces that Tobias Gerken has just closed this gap, showing that such an n exists for k = 6. If and when I get more details, I'll post them here.

Update: David Eppstein and Libertarianoid mention this as well. Libertarianoid has further details:

The conjecture was recently resolved by Tobias Gerken in affirmative. As Matousek says:

"P. 34, the6-hole problem was resolvedin 2005 by Tobias Gerken, who showed by a clever case analysis that any set containing 9 points in convex positions has a 6-hole. The proof was simplified by Pavel Valtr to about 4 pages (but one needs more than 9 points in convex position to start with, and hence the resulting upper bound for n(6) is worse)."

The original proof is somewhat long (around 39 pages), but elementary and not too hard to read. Valtr gave a talk on his simplified proof a while ago.

Reactions: |

## Friday, September 16, 2005

### "Threshold"

The most fractious of all is Arthur Ramsey (Peter Dinklage), a dwarf with a penchant for strip clubs who is a brilliant mathematician and linguistics expert. "If our E.T. tries to phone home," Molly tells Cavennaugh, "he'll translate the call for us."Hmm... In any case, other disciplines have been equally abused:

- Dr. Nigel Fenway (Brent Spiner), a former 1960's radical and embittered NASA microbiologist.
- Lucas Pegg (Rob Benedict), a meek, somewhat nerdy physicist who reads First Corinthians on his BlackBerry.

Reactions: |

## Wednesday, September 14, 2005

### Herman Goldstine Fellowship

The Mathematical Sciences Department of the IBM Thomas J. Watson Research Center is pleased to announce its 2006-2007 Herman Goldstine Postdoctoral Fellowship. The fellowship provides an excellent opportunity for research in Mathematics and Computer Science, with exposure to technical problems arising in industry. Details, including application procedure, may be found at

http://domino.research.ibm.com/comm/research.nsf/pages/d.math.goldstine.html

This is a great fellowship for graduating Ph.Ds in theoryCS. I wouldn't be surprised if most of the readers of this blog already knew about it, but just in case..

Deadline is Dec 31, 2005, and applications will be received from Oct 1.

And who is Herman Goldstine ?

Goldstine's career has been long and distinguished. Although his early research was in the area of calculus of variations, during WWII he joined John von Neumann in the groundbreaking ENIAC (Electronic Numerical Integrator And Computer) project. ENIAC was one of the early innovations that paved the way for the modern computing industry. Goldstine joined IBM in 1958 where he established the Mathematical Sciences Department and served as its first director. As recognition for his contributions to IBM and to science he was appointed an IBM Fellow in 1967, and served in this position until he retired in 1973. Most recently he served as the executive director of the American Philosophical Society.

Goldstine has received numerous awards for his impact on science and technology. The National Medal of Science, the Harry Goode Award, the IEEE Pioneer Award, and memberships in the National Academy of Science, the American Academy of Arts and Science, and the American Philosophical Society are among these awards.

Reactions: |

## Friday, September 09, 2005

### SODA 2006

I hope to track down and review some of the papers once they start appearing online. For now, I will entertain you with one of my own (assuming this doesn't break some academic blogosphere code of etiquette ;)).

The title (or at least the first part) is a tip of the hat towards an earlier paper by Friedman and Fisher titled 'Bump Hunting in High Dimensional Data'. The basic setup is this: you're given points labelled red or blue, and a class of shapes (usually hypercubes, but you could have spheres, pyramids, etc). Given a shape, you count the number of red and blue points in it, and compute a discrepancy function that could be as simple as |#red - #blue|. The goal is to find the shape that maximizes this function (i.e, finds the biggest bump).

Where do the red and blue come from ? One of the more popular applications of this problem is in biosurveillance, where the red points represent incidence of disease, and the blue represents a baseline population (the points may be tagged with weights). A bump is an unusually high or low disease rate with respect to the population.

What discrepancy function do we use ? The "simple" combinatorial discrepancy function described above has actually been looked at before, in the context of learning. Here, the red points are "good" elements of a classifier and the blue points are "bad", and the shape is a hypothesis that has classifies the most points correctly while misclassifying the fewest. Dobkin, Gunopulos and Maass presented exact and approximate algorithms for this problem in arbitrary dimensions.

But in the biosurveillance domain, statistical measures are more commonly used. The idea is that we want the discrepancy function to model a likelihood ratio with respect to some underlying population distribution: how likely was it that I saw as many red and blue points that I did ? The resulting function often resembles a Kullback-Leibler distance, or a chi-squared distance, and one of the most popular measures is the Kulldorff scan statistic, developed by Martin Kulldorff in 1997.

Unfortunately, such measures are painful to work with from an algorithmic point of view. They are not as "nice" as the (linear) combinatorial discrepancy, and thus the best algorithms have been either heuristics or (theoretically) brute force methods that have good empirical behaviour.

What we show is that as long as the discrepancy function is convex (which likelihood-ratio based distances tend to be), you can approximately rewrite it as the envelope of a set of linear functions, which you then whack with the DGM algorithm. The question of course is, how many functions do you need ? This is the tricky part, and involves an asymptotic bound on the eigenvalues of the Hessian of the discrepancy function.

This is a fairly general crank that allows us to solve efficiently (and approximately) a number of (till now) different statistical discrepancy problems (including maximizing with the Kulldorff scan statistic) with the same basic method. There are other results in the paper, which I hope to have online soon.

Reactions: |

## Wednesday, September 07, 2005

### Michael finds the elephant

I think STOC/FOCS has become progressively less algorithms-oriented over the years. My impression is that STOC/FOCS emphasizes "mathematical depth", "cleverness", and "theory purity" over, say, "usability" and "insight into a practical problem". I know there are a large number of people who think this, and I would therefore encourage people who don't see this or think it is true to simply consider the issue again. I do not wish to make a judgement call as to whether this is good or bad -- this is just what these conferences, I believe, have turned into.I'd like to point out that there is nothing inherently wrong with FOCS and STOC being exactly the way Michael describes them, as long as these biases are made clear. A conference is defined by its community, and by the papers it accepts. If I know that only "deep/clever/pure" papers should be sent to STOC/FOCS, it makes my life much easier, as long we are willing to relax the idea that only STOC/FOCS must represent the best that theory has to offer.

SODA, not coincidentally, has become a refuge for more algorithmically oriented papers. (As have more specialized conferences.) In some ways, this is good -- these papers have a home! In some ways, it is bad, because I think it has exacerbated the division, which in part explains the extreme opinions the post has raised. (I too have had reviews essentially saying, "This is algorithmic, belongs in SODA." Perhaps the reviewer was just being kind and didn't want to negatively comment on the quality of my work. But my impression is that for many FOCS/STOC reviewers "algorithmic" has itself become a quality comment, in the sense that "algorithmic" papers that do not pass the "deep/clever/pure" test are pretty much dismissed out of hand.)

I have been on committees for FOCS/STOC conferences, and I feel that I have seen these biases in action. Strangely, I did not come away with any ideas on how to change things; once a culture develops, it seems very hard to change.

Reactions: |

## Tuesday, September 06, 2005

### On SODA

Is it truly a good thing to move from a STOC/FOCS/specialized conferences system towards a STOC/FOCS/SODA/other specialized conferences system?

JeffE:

Yes.

I'm surprised there is even a debate over this. Specializing conferences creates more interest, and you know what to expect. If we think of a conference as a group of people with shared interests, what's not to like.

The problem is the traditional role STOC/FOCS have played in "defining theory". I have actually heard arguments that since CG papers are not sent to STOC/FOCS, CG is not theory. It's about time (and in fact this has happened for many years now) that people realize that STOC/FOCS are slowly becoming specialized conferences. Bill Gasarch asks what their specialization is: a lot of it depends on the current hot topics of the day. It seems to me that as an area of theory gets hot, it starts getting more papers into STOC/FOCS, and as the hubbub dies down, the topics move on. SODA on the other hand, has a fairly stable set of topics that are covered, with variations from year to year in terms of which areas are represented more.

Update: Our strait-laced theory community is letting it all hang out. Join in the fun :)

Reactions: |

### New ! FOCS ! Almost as good as SODA !

Reactions: |

## Monday, September 05, 2005

### From the frivolous jokes department

Reactions: |

## Saturday, September 03, 2005

### Tulane students.

Tags: katrina, students

Reactions: |

## Friday, September 02, 2005

### Kepler's, or the end of an era

For me, Kepler's Bookstore defined the quintessential Bay Area experience, before Silicon Valley, before choked highways, and before you could shake every tree in Menlo Park and have a VC fall out. On a bright and cool (and weren't they all) Saturday morning, you'd get on your bike and pedal over (exclusively on bike lanes) to Menlo Park to browse through what was the real "clean, well lighted place" for books. A happy hour, and a much lighter wallet later, you'd take your stash and walk out the door, and sink into a chair outside, holding your double espresso from Cafe Borrone right next door, and let the hours go by in a pleasant haze.

I am an online maven, and an amazon.com addict. I scorn old technology and the old ways, and am constantly restless for change. But I will mourn the loss of Kepler's, because the Bay Area will never be the same for me again.

Reactions: |

## Thursday, September 01, 2005

### No Free Lunch Theorems

I dislike them greatly though (even though in my misguided youth I used to play with GAs) because they are extremely general hammers that typically don't have provably performance or running time guarantees, and don't reveal any interesting insights into the structure of a problem.

Such search strategies are also victims of their own generality, and the dagger that pierces them is called a "No Free Lunch" theorem. No Free Lunch theorems are generally of the form

...all algorithms that search for an extremum of a cost function perform exactly the same, when averaged over all possible cost functions. In particular, if algorithm A outperforms algorithm B on some cost functions, then loosely speaking there must exist exactly as many other functions where B outperforms A.The crucial point here is the "averaged over all cost functions". The NFL theorems are not saying (and this was something I used to be confused about) that search procedures are doomed; rather, they say that blind search procedures (such as those performed by genetic algorithms) are doomed. The original NFL theorem is attributed to Wolpert and Macready; many variants have been proved as well.

Joseph Culberson has a nice perspective on such theorems from an algorithmic point of view, and attempts to frame them in the context of complexity theory.

Tags: genetic, algorithms

Reactions: |