Monday, December 31, 2007

Comte de Buffon, and a happy new year !

My personal resolution for the new year is that I will actually try to write down a larger fraction of the blog posts that are floating around in my head, especially the many that were born while I was sitting in my algorithms seminar.

The NYT has a travel feature on visiting the birthplace of Comte de Buffon, a contemporary and competitor to Carolus Linnaeus, who by this account played Hooke to Linnaeus' Newton. Of course, I (and probably most of my readers) know Buffon not for his work in biology and the natural world, but for the famous problem that bears his name and by some accounts gave rise to the area of geometric probability.

The Buffon needle problem is this: Suppose we drop a needle of length l onto a grid with lines spaced d units apart. What is the probability that the needle touches a grid line ? (let's assume for now that l <> d, the expression is more involved).

The Buffon's needle problem is a textbook example of the area of geometric probability: doing probability theory on entities defined geometrically. This is a harder thing than one might imagine: as Bertrand demonstrated much later on, the very notion of a uniform random sample can be called into question. He showed this by posing a simple question:

Draw an equilateral triangle and circumscribe a circle around it. Pick a chord at random. What is the probability that the chord length is longer than the side of the triangle ?

It turned out that the "Pick a chord at random" step leads to all kinds of problems. Bertrand presented three different equally reasonable ways of picking a chord, and showed that each of the three ways led to a different probability. At the time, this was viewed as a paradox: it seemed strange that what appeared to be a well defined question would lead to three different answers. Of course, the modern view is that there is no paradox, merely a different choice of measure space to work with in each case.

But if that's the case, is there a "natural" choice of measure ? One elegant way of thinking about this was to think of this as a physical process that must obey certain transformation invariants. This leads inexorably to notions like the Haar measure: can we assign a measure to subsets of a group that is invariant over left (or right) group actions ?

The Haar measure has practical utility for anyone working in geometry. Consider the problem of choosing a random rotation (this comes up in when doing Johnson-Lindenstrauss embeddings). In 2 dimensions, this is easy, since the space of rotations is the unit circle. However, in three dimensions or higher, it gets a bit trickier to define the correct measure to sample with respect to. Since rotations form a group, we really need a measure that is invariant under the group action, and a (unique upto scaling) Haar measure is guaranteed to exist in this case, yielding the desired measure to sample from (it's actually fairly easy to generate a random rotation algorithmically: the easiest is to populate a matrix with independent normally distributed random variables, and then orthogonalize it).

But it all goes back to a needle, a grid, and a frustrated naturalist. Happy new year !

Wednesday, December 26, 2007

DARPA Mathematical challenges

Via Ars Mathematica and Not Even Wrong, a DARPA-issued list of challenges in mathematics for the 21st century. Some that puzzled me:
• Computational Duality: Duality in mathematics has been a profound tool for theoretical understanding. Can it be extended to develop principled computational techniques where duality and geometry are the basis for novel algorithms?

I'm not sure what this is trying to say, or whether I'm reading it wrong, because the story of linear programming, primal dual schemes, and the Lagrangian is the story of using "duality and geometry as the basis for novel algorithms"
• What are the Physical Consequences of Perelman’s Proof of Thurston’s Geometrization Theorem?
Can profound theoretical advances in understanding three-dimensions be applied to construct and manipulate structures across scales to fabricate novel materials?

Thurston's geometrization conjecture talks about the structure of 3-manifolds: i.e 3 dimensional surfaces living in higher dimensional spaces ? What kinds of materials could be fabricated using this ?
• Computation at Scale: How can we develop asymptotics for a world with massively many degrees of freedom?

This sounds like there's a nice computational question lurking somewhere, but I'm not quite sure where.
Nice to see algorithmic origami and self-assembly also mentioned. I am particularly intrigued by the reference to the geometry of genome spaces.

Tuesday, December 11, 2007

On the dominance of the Ivies..

Via Cosmic Variance:

Exhibit A:

“One thing we all must worry about — I certainly do — is the federal support for scientific research. And are we all going to be chasing increasingly scarce dollars?” says Drew Gilpin Faust, Harvard’s new president.

Not that Faust seems worried about Harvard or other top-tier research schools. “They’re going to be—we hope, we trust, we assume—the survivors in this race,” she says. As for the many lesser universities likely to lose market share, she adds, they would be wise “to really emphasize social science or humanities and have science endeavors that are not as ambitious” as those of Harvard and its peers.

Exhibit B:
Mario Capecchi: 2007 Nobel Laureate, Ph.D Harvard (1967). Associate Prof. Biochemistry (Harvard) (1969-1971)
Moved to U. Utah (1973).

Through a series of bold experiments begun in the 1980s, Capecchi demonstrated that he could alter any gene in a mouse cell by replacing it with a modified version. At the time, scientists were skeptical that such altered DNA could be targeted to a particular gene. But Capecchi was not to be deterred. Indeed, his studies demonstrated that it is possible to replace an intact, functional gene with a modified version that can zero in on the corresponding DNA in the chromosome.
And finally:

Some 50 universities are located in the Boston area. Rather than collaboration, Capecchi felt that the thousands of researchers were working in isolation on projects that promised "immediate gratification." As he explained, "Everyone is so aware of what everyone else is doing. 'What's new?' was asked every day. That limits you to short-term returns, posing questions that you know can be answered in six months."

In contrast, the University of Utah in Salt Lake City offered "a relaxed atmosphere, where you could work on projects whose outcome may take 10 years. The relative isolation tends to make you more focused on the biological question you're working on.

"It was a good choice," said Capecchi of his decision to relocate to the U of U in 1973.

Friday, December 07, 2007

VLDB dogfight, and peer review

I guess this had to happen sooner or later, given how easy it is to publish a critique. I'm only surprised that more people aren't doing it.

Via Daniel Lemire comes a link to a page by John Wu where he rakes a VLDB 2006 paper over the coals and doing so, mounts a serious criticism of the review process that led to the paper being accepted. I don't have a dog in this fight, and know next to nothing about the area, but the criticisms are prima facie plausible, even if mitigated by the fact that he's the "injured party" in this argument. One idea that he proposes, and that Daniel endorses, is this:
Our first recommendation is for the program committee to assign one member to be responsible for a paper. To encourage committee members to take this responsibility seriously, the accepted paper should be printed with the name of the responsible person. Many journal articles are printed with a footnote to the effect of “Accepted on the recommendation of So-and-So.” Similar footnote could be printed on the final copy of the conference proceedings.
It's an interesting idea. I've heard this being proposed before, and apart from the fact that papers can often slip in without strong champions, I don't see too much that's wrong with it.

Neighborly polytopes and compressed sensing

I ran a seminar on approximate high dimensional geometry this semester, and there are many posts queued up based on interesting observations from the papers that were presented. This post is about polytopes, compressed sensing, and lower bounds for convex hulls.

Compressed sensing is a fascinating topic that (depending on how you look at it) is influenced by approximation theory, sketching techniques, compression, and geometry. Posts by Terry Tao and Igor Carron are good places to learn more about this, and the paper repository at Rice is another great resource. What I wanted to talk about in this post however, is an interesting result by Donoho and Tanner that connects back to structures that geometers are intimately familiar with, as well as helping me understand the nature of convex hull lower bounds.

Let's briefly sketch the motivating problem. Suppose you're given a signal s that you want to express in terms of linear combinations of basis functions drawn from a dictionary. A key idea in compressed sensing is that the dictionary itself is redundant, in that there are multiple ways of representing a signal via linear combinations of basis functions. A useful analogy is that of barycentric coordinates. We can represent a point in the plane uniquely via a linear combination of three non-degenerate points (i.e a simplex), but this stops being true if we have more than three points that form the "basis".

Given that there are multiple representations, which is the best one ? The next important idea is that of sparsity: namely, that the number of coefficients needed is very small compared to the size of the dictionary. Again, returning to our analogy, it would be as if the dictionary had 100 points in it, and all we needed to determine was a representation using a linear combination of three points.

So one rendition of the compressed sensing problem is: Suppose there's a signal that you interact with via a 'measurement matrix' (which means you supply a vector of coefficients, and get the value of the inner product of the signal coefficients with the measurement coefficients; each row of the matrix is one such measurement). You retain the output values y; formally, you can write this as y = Ax, where x is the unknown signal vector. Can you find an x with the fewest number of nonzero coefficients (i.e can you minimize the l0 norm) that satisfies this equation ?

This problem is NP-hard; intuitively, you can encode SUBSET SUM in this problem, by setting up an appropriate y vector. Suppose however that we relax the optimization, and rather than asking to minimize the l0 norm, we ask to minimize the l_1 norm of the target vector x ? It turns out that this problem can be solved via linear programming.

Under what conditions then can the l_1 solution yield the l_0 solution ? Suppose that we could reconstruct the vector x uniquely from y. In that case, it doesn't really matter what cost function we use: if there's only one solution, then we'll find it either way.

This is where neighborly polytopes enter the picture. A polytope in R^d is said to be m-neighborly if any m vertices lie on a face of the polytope. Note that the simplex is always d-neighborly: essentially this definition tries to capture the idea of a "simplicial core" of a polytope: crudely put, an m-neighborly polytope has an m-simplex encoded within it somewhere.

If we think of the measurement y as being a point reconstructed from m coefficients, then if y lies on the m-simplex, there are unique coefficients we can assign to the vertices of the simplex to reconstruct it. Donoho and Tanner showed that this is essentially sufficient: namely that the polytope AT^d (where T^d is the d-simplex) is m-neighborly if and only if there's a unique reconstruction of y (implying that the l_1 optimization yields the desired sparse vector).

This is quite an elegant result in its own right: further, they generalize this to a "robust" variant, where the polytope might be approximately m-neighborly, in which case they show that with high probability a unique solution can be reconstructed (this latter result resembles the Johnson-Lindenstrauss theorem). What's interesting about these results is that they connect a very old concept in geometry (neighborly polytopes) to a reconstruction algorithm in compressed sensing.

A side effect of understanding neighborly polytopes from this perspective is that I finally saw why cyclic polytopes "work" as a lower bound for estimating the size of a convex hull. We can ask what the largest value of m can be for which some d-dimensional polytope is m-neighborly. It turns out that the answer is md/2, and the example is the cyclic polytope.

Why is this interesting ? If we go back to the intuition of m-neighborly polytopes containing an m-dimensional simplicial core, it becomes clear. All vertices of a simplex lie on the convex hull. If a polytope is m-neighborly, it should come as no surprise that its convex hull has size n^m. We thus recover the lower bound for a convex hull size in d dimensions from this fact.

Thursday, December 06, 2007

Leibniz Prize

Via an anonymous commenter comes the news that Susanne Albers is one of the 2008 Leibniz Prize winners. This is the most prestigious science award in Germany (and that's all of science, not just computer science). Going back over the list of past winners, it seems (and everything's in German, so I could be wrong here) that the last theory prize winner was Gunter Ziegler, back in 2001. Susanne Albers, for those who don't know, works in online algorithms, approximations, and algorithmic game theory.

From the Leibniz Prize page:
The Leibniz Programme, established in 1985, aims to improve the working conditions of outstanding scientists and academics, expand their research opportunities, relieve them of administrative tasks, and help them employ particularly qualified young researchers. Up to ten prizes are awarded annually with a maximum of €2.5 million per award. The prize is awarded from a slate of nominations put forward by third parties. The prizewinner is selected by the Joint Committee on the basis of a recommendation from the Nominations Committee for the Leibniz Programme.
The Leibniz Prize is administered by the DFG, which according to Wikipedia translates to the German Research Foundation, which would make it the equivalent of the NSF, and the Leibniz Prize roughly comparable to the NSF's Waterman Prize.

Update: here's the announcement (in German)

Universities, and the people within it, often like to project the image of an egalitarian society. "This is your university", "A university is made up of professors", "WE run the place", and so on. Nowadays, in the era of 'university-as-business-enterprise', these buzzphrases are often supplemented by, "we're all partners in this enterprise", and other business metaphors.

There's one obvious way in which this is fiction: the tenured/non-tenured divide is vast, and dominates the thinking of the untenured in many ways. Continuing the business metaphors, the tenured/untenured divide is like the partner-associate divide at law firms or investment banks.

But factoring that out, there are more subtle hierarchies: the administrative pathway (chair to dean to provost) is one obvious one, and there are class hierarchies based on funding output, fame, and other parameters like that.

In itself, this is not particularly noteworthy: where humans exist, hierarchies are sure to follow. What I find interesting to observe is how the lack of an institutionalized hierarchical structure (that we see in a company) removes some of the obvious external cues that help us slot into our various positions in the hierarchies. In other words, you have to be reasonably socially aware to pick up on the signals that indicate hierarchical patterns in academia, and once you do, you have to constantly remind yourself that they exist, otherwise you run the risk of silently crossing lines that you shouldn't be crossing.

The key issue here is that the natural tendency to form hierarchies is in this case going counter to the professed semi-fiction of equality, and it's the conflict between the two views that makes for an interesting social dynamic.

In all the ink that gets spilled about academia vs industry, this is one of the more subtle issues that crops up. I doubt it makes a difference to anyone's decision on which way to go, but it's curious nonetheless.

Wednesday, December 05, 2007

Tenure-track position in geometric computation at McGill

Contact McGill Geometry for more details. Here's the summary:
The School of Computer Science at McGill University invites applications for one tenure-track position at the assistant professor level, to begin August 1, 2008, in the general area of geometric computation.

The successful candidate must have a strong theoretical foundation and should be active in research on geometric problems and their applications.

Complete pdf format applications, including a curriculum vitae, a list of publications with copies of one or two sample reprints, a research statement as well as a teaching statement, and the names and e-mail addresses of three references should be sent to geometry@cs.mcgill.ca.

Applications will be reviewed as soon as they are received. No applications will be accepted after January 15, 2008.

The School of Computer Science offers a collegial environment with opportunities for interaction with world class researchers in areas including (but not limited to): computational geometry, discrete mathematics, mobile robotics, computer vision , computer graphics, bioinformatics, cryptography and quantum information, reasoning and learning, and scientific computing.

For further information on the School, see: http://www.cs.mcgill.ca.

2007 ACM Fellows

A huge crop of algorithms/complexity folks have been named ACM Fellows in this year's batch of awards: among them
• Rajeev Alur
• Avrim Blum
• Andrei Broder
• Danny Dolev
• Rodney Downey
• Tao Jiang (thanks, Anon)
• Lance Fortnow
• Rajeev Motwani
Congratulations to all !

(12/6): A special congratulations also to Rajeev Motwani (my research "father") and Lance (my blog "father") :)

Monday, December 03, 2007

A question on quantum debugging

As my algorithms class winds down, I've been doing some "topics" lectures, and today I attempted to not make a fool of myself giving students a very high level view of quantum computing, with much help from Dave Bacon and Umesh Vazirani's lecture notes, as well as the Nielsen/Chuang book.

One question came up that I thought was interesting, and I'm hoping that Pontiffs in Shtetls can help answer it :). One of the "features" of programming a randomized algorithm with pseudorandom bits generated by a seed is that you can reproduce a program run by using the same trace. This is of course handy when debugging.

Is there any way of doing this with a quantum algorithm ? I.e is there any way to run a quantum algorithm twice and get the same answer ? maybe using entangled bits in some strange way ?