The Geomblog: 2010

Tuesday, December 07, 2010

ACM Fellows

The ACM just announced the 2010 Fellows, and there are many theoreticians, and many people that I know - congratulations to all !!

Theoreticians on the list:

Also congratulations to Kathleen Fisher, Fernando Pereira, and Lydia Kavraki, ex-colleagues and collaborators.

Tuesday, November 23, 2010

Guest Post: Distinct distances, and the use of continuous mathematics in discrete geometry.

Nabil Mustafa specializes in problems of discrete geometry, and has spent time on problems related to the distinct distance problem. He offers a broader perspective on the new distinct distances result and related breakthroughts in discrete geometry. For a detailed breakdown of the Guth/Katz paper, see Terry Tao's latest opus.As usual, I take responsibility for any mistakes introduced in the editing process.

I think the days when the "Feynman method" was all that was needed to make progress on basic problems in Discrete Geometry are over. Recently there have been a slew of results which make progress on long-standing open problems in Discrete and Computational Geometry that use techniques from a variety of areas in mathematics: algebraic geometry, algebraic topology, complex analysis and so on.

This is wonderful news for the field. Though, two things to consider:

1. So far, apart from the kind of work going in "Computational Topology", this has mostly been a one-way street. There have been fewer cases of discrete and computational geometers going into topology, algebraic geometry etc. and making a similar impact there. Similarly, there are very few collaborations between mathematicians in other areas, and discrete geometers (ed: Mulmuley also argues that the GCT program will only come to fruition when this reverse direction starts happening)

2. From my experience, current graduate students, by-and-large, still seem to be stuck with the general outlook that "If I'm clever enough, and think hard enough, I can solve any problem directly". Such optimism alwayscheers me up. I used to think that too, but the more I learn about other areas, and as the recent work reveals power of techniques, it has become clear to me that it is misguided to think that. I would really advise students to take courses in algebraic topology, differential geometry and so on.

Below I list some such recent breakthroughs.

First-selection Lemma.

Given n points in $R^d$, can one find a point in "many" simplices spanned by these points ?

This has been studied for more than 30 years, with several partial results. The current best result was published this year by M. Gromov, which in fact proves a stronger topological theorem, with better bounds than for restricted earlier cases. Matousek and Wagner have improved this bound slightly for 3D by improving a combinatorial part of Gromov's argument. Gromov's argument is a complicated topological argument that I did not have the background to follow. J. Matousek gave a nice talk at the Bernoulli conference in September with the title "Also Sprach Gromov"!

2. Colored Tverberg Theorem.

Let $C_1,\cdots,C_{d+1}$ be disjoint subsets of $R^d$ , called colors, each of cardinality at least $t$ . A $(d+1)$ -subset $S$ of $\bigcup^{d+1}_{i=1}C_i$ is said to be multicolored if $S\cap C_i\not=\emptyset$ for $i=1,\cdots,d+1$ . Let $r$ be an integer, and let $T(r,d)$ denote the smallest value $t$ such that for every collection of colors $C_1,\cdots,C_{d+1}$ of size at least $t$ there exist $r$ disjoint multicolored sets $S_1,\cdots,S_r$ such that $\bigcap^r_{i=1}{\rm conv}\,(S_i)\not=\emptyset$

The conjecture is that $T(r,d) = r$, and this was proved recently via topological arguments (for all $r$ such that $r+1$ is prime) by Blagojevic, Matschke, and Ziegler (see Gil Kalai's blog for a detailed post on this). Matousek, Tancer and Wagner have translated this argument to a geometric proof. As they state in the abstract, "The purpose of this de-topologization is to make the proof more concrete and intuitive, and accessible to a wider audience."

3. Distinct Distances problem.

The Guth-Katz dramatically improves the best known bound via techniques from algebraic geometry. Terry Tao has more details on this.

4. Joints problem.

A joint is a point formed by the intersection of three non-coplanar lines in 3D. What is the maximum number of joints achievable by a collection of $n$ lines in 3D ?

This was again solved by Guth and Katz via techniques from algebraic geometry. It was subsequently simplified wonderfully by Elekes, Kaplan and Sharir, and Kaplan, Sharir and Shustin.

5. Lower-bounds for eps-nets.

It was very widely believed that for "natural" geometric objects, eps-nets of linear-size should exist. Shockingly, Alon showed that an easy application of Hales-Jewett density version immediately gives a non-linear lower-bound for a simple geometric-system in the plane. While DHJ is a combinatorial result, it was first "proved by Furstenberg and Katznelson in 1991 by means of a significant extension of the ergodic techniques that had been pioneered by Furstenberg in his proof of Szemeredi's theorem". Like earlier proofs, it is possible to "de-ergodicize" it (the polymath project).

6. Regression-depth partitioning conjecture.

(ed: see here for a description of regression depth - it's similar to halfspace depth)

Partial results were shown by Amenta-Bern-Eppstein-Teng in 2000. Recently almost proven by Karasev using topological techniques that I do not understand.

This is just a partial list, there are probably several others that I have missed.

Monday, November 22, 2010

Workshop on Parallelism, and a "breakthrough" in combinatorial geometry

In the blog world, you're either fast, or dead. I waited a few days to post a note about the upcoming DIMACS workshop on parallelism, and was beaten to the punch by Muthu and Dick Lipton.

At this point, you probably know the major points. Phil Gibbons, Howard Karloff and Sergei Vassilvitskii are organizing a DIMACS workshop on a long-range vision for parallelism, with an emphasis on interaction between practitioners and theoreticians in the area. The workshop looks to be quite interesting, and similar in spirit to the one organized by Uzi Vishkin at STOC 2009 on multicore algorithms.

Utah has a major presence in the world of high performance parallel computing, and we're hiring in this area this year ! From my vantage point, I get a fly-on-the-wall view of developments in the area, and it makes me wonder:

Is the field is now moving too fast for models to even make sense ?

A case in point: Google's MapReduce infrastructure has taken the world by storm, and you can even run MapReduce "in the cloud" on Amazon's servers. Howard, Sergei and Sid Suri had a very interesting paper on MapReduce at SODA last year, and there's a ton of academic work on algorithm design (in theory and practice) in this model. But if new reports are to be taken seriously, Google itself is moving on from MapReduce to other distributed data management systems.

The most recent "alternative model of computing" that we've encountered is the stream model, and while it had strong links to practice, it's survived this long because of the incredibly interesting theoretical challenges it poses, as well as the rich arsenal of theory concepts that got deployed in the process. I'm curious to see if something similar can happen with these new models of parallelism and multicore computing (over and above PRAM and NC of course)

In other news: readers of Gil Kalai's blog will have heard of the exciting new breakthrough in the realm of combinatorial geometry on the tantalizingly simple problem first posed by Erdos in 1946:

What is the minimum number of distinct distances achievable by a set of n points in the plane ?

I hope to have a longer guest post up soon on this, so I won't say much right now. What's neat is that this result is the implementation of a larger program/line of attack laid out by Gyorgy Elekes, who sadly died in 2008 before seeing this come to fruition. Micha Sharir has a beautiful writeup with the history of this approach (it predates the new results though). Bill Gasarch has a nice page collecting together the major developments in this area, including links to the Kakeya problem.

p.s Note to other CG bloggers: please do not talk about this problem otherwise Mihai will scold us all ;)

Friday, November 12, 2010

What can the ACM do for you ?

I kid, of course. While the ACM is a popular punching bag in hallway conversations, they do help the CS community in many ways, not the least of which is the digital library, the conference management services, the lobbying efforts in Congress (where I think though that the CRA has them beat), and so on.

But there's this constant running joke that the ACM is always trying to drag us kicking and screaming into the 1970s :). More importantly, while the DL has been spiffed up with comment feeds, all kinds of social sharing and what not (but no RSS for comments - COME ON !!), I think that the ACM could use its considerable weight to really help the research community. In both of the ideas I'm listing, the ACM has particular advantages as an organization with brand recognition, and an imprimatur of authority. They also have resources far beyond what other groups might be able to do.

A Pubmed for CS
More than six years ago, I was complaining about the lack of a Pubmed-style single location to dump all CS papers (titles, abstracts and keywords, with links back to original source). While the arxiv is becoming a better and better place for people to post preprints, what we really need is a single point to view and browse published papers, across journals and conferences.

The DL already maintains paper listings across a variety of publications, and many CS conferences are run by the ACM, so it shouldn't be that hard to do. The DL itself isn't quite there yet, mainly because of the lousy search features that it has, and because it's not quite complete.

I don't know if Dan Wallach's proposal for a central hub for papers is going anywhere, but that's another model that the ACM could help make a reality.

A Mathematical Reviews clone
This one is possibly more localized to theoryCS, and requires more resources. But it's getting ever more important. It's really hard to keep track of the flood of papers showing up in conferences, journals and on the arxiv, and a service that generated short reviews of papers would be great. Plus, as theoryCS gets more fractured and more diverse, this kind of thing becomes ever more important.

It seems like the ACM is getting the 'social bug' if the DL redesign is any indication. I'd argue that these two items are probably the best kind of 'social web' that the ACM can contribute to.

Friday, November 05, 2010

Odds and Ends

Glencora Borradaile makes her recommendations for FOCS videos to watch (can we have all our conferences do videos, please ? My 40 minutes on the bus will take on a whole new dimension). I like the idea of starring speakers who do a good job: might increase the competitive pressure. Come to think of it, here's another perk of having videos for conferences: sheer embarrassment will make sure that speakers improve their presentations.

The cstheory Q&A site rolls on. Those of you at FOCS might have noticed the little sheets in your registration packets, but I don't know if it actually encouraged new people to visit. If you're one of them, do drop a note in the comments. Three points of interest:

We have a collaboratively edited article that we just submitted to SIGACT News (you can read the draft here). The article highlights some of the recent fascinating questions, answers and discussions on the site - do check it out
This is purely anecdotal, but I've been hearing both at FOCS and on the site that people are having to avoid visiting the site because it's so addictive ! Don't worry - it isn't that bad - as a matter of fact I only spent 6 hours last night ! down from.. (never mind)..
Another sign of catastrophic success: our first incident of undergrads in a theory class systematically trying to get their homework questions answered by posting on the site. Many alert users were able to close down the questions, but occasionally answers slip by. If you're an undergraduate and read this blog (HA ! blogs are for old fogies !), be warned...

Finally, because I'd be remiss not to do so, some neat questions (and answers) to chew on:

What's the real complexity of computing the GCD of two numbers ?
How hard is it really to compute the eigendecomposition of a matrix ?

Thursday, November 04, 2010

All FOCS talks are online

All FOCS 2010 talks are now online, at ieee-focs.org and http://techtalks.tv/focs/2010. The player works fine on firefox (the picture is embedded in one corner of the slides) and I find the talks particularly pleasant to listen to if you have a nice backing track :)

Thursday, October 28, 2010

FOCS Day 2: (much delayed)

I had to return to Salt Lake the evening of day 2, so blogging got left by the wayside. Piotr is to blame for this post :).

As usual, by day 2 (day 3 for me because of the tutorials day), my desire to attend talks far outstripped my ability to attend them, and I especially wanted to hear more about the Kumar/Kannan clustering with spectral norm paper, and the stability for clustering result of Awasthi, Blum and Sheffet. If I ever have control of setting the program for a conference, I'm going to decree that all talks start no earlier than 11am (they can go late in the evening, for all I care).

The first talk that I attended was by Alex Andoni on the new polylog approximation for the edit distance that he has with Robert Krauthgamer and Krysztof Onak. Dick Lipton actually spends a good deal of time on the problem and background in this post, and also highlights their earlier paper that gave a sub-polynomial appproximation in subquadratic time.

This paper goes one better, producing an algorithm that runs in time $O(n^{1+\epsilon})$ while yielding an $\log^{1/\epsilon}$ approximation. A key idea in this result is a very neat trick: rather than treating the strings x,y symmetrically, the algorithm first compresses x down to a much smaller string (of length n^\epsilon), and then runs a crude exact algorithm on this compressed string and y.

This compression is randomized: the algorithm subsamples small pieces of x. However, a straightforward sampling doesn't preserve the distance, so the algorithm actually starts by designing a careful hierarchical partitioning of x which induces a modified distance function (that they relate to the edit distance). Then the sampling takes place in this tree. Obviously I'm skipping many of the details that I didn't get, but Alex gave a very nice talk that laid out the hig level picture (his talks are always a model of clarity and intuition).

Another talk that I found intriguing was the work by Saks and Seshadri on computing the longest increasing subsequence of a string. While there's a straightforward dynamic program for this problem, they showed that the length of the sequence could be estimated in "sublinear" time (in the size of the program). In fact, their technique is quite general, and might lead to a general method for approximating the output of a dynamic program in time sublinear in the size of the table. A key structural property that they develop is that either the dynamic program can easily be broken up into two balanced subrectangles, or it can be subsampled O(log n) times, yielding smaller copies that can be combined to find a solution (it's possible I'm misrepresenting the last bit). Sesh gave an excellent talk, but his taste in talk jokes was so terrible it was actually funny :) (sorry, Sesh!)

I deeply regret having to leave before the evening plenary session, and especially regret missing Dan Spielman's talk, which by all accounts was magnificient. All the talks were videotaped, so they'll eventually be online and I can watch it then.

Monday, October 25, 2010

FOCS Day 1: some notes

Three newsworthy items from the FOCS business meeting: (for more details follow the focs2010 tag on twitter)

Petros Drineas is now an AF program director at the NSF. Congrats, Petros !
STOC 2011 might experiment with a poster session with a different call for submissions. That's a great idea - I hope they decide to go ahead with it
Not one person felt that FOCS should continue to have printed proceedings. As recently as a few years ago, at least a third of the audience at theory conferences would have asked for printed proceedings. Times they are a'changin' !!!

Journals, Conferences, and non-galactic algorithms

Does our lack of attention to journal publications influence the "impracticality" of our algorithms ?

I've been thinking about algorithms, applications and the dreaded "practicality" issue. And the following thought occurred to me.

We have a well-known aversion to journal papers in our community. I won't rehash the arguments for and against. But it is definitely true that in a conference paper we elide many details of the algorithm construction and proof. It's not uncommon for a 10-page conference paper to blow up to a nearly 100 page journal paper (cough cough MST cough cough).

So suppose you have this snazzy new algorithm for a problem. You're happy with your handwavy conference version, and you don't bother to write the journal version. When you do, you realize to your horror that your snazzy 1-page procedure has become this 12 page lumbering beast of an algorithm that's so insanely complicated that you'd be ashamed to even call it 'galactic'. Now this might motivate you to come up with a simpler algorithm, or it might motivate someone else !

But if you don't write the journal version with full details, how is anyone to know ?

While this is in part about practicality (since practical almost always equals simple), it's more than that. Clarity in algorithm design is useful even if there's no practical value, because it aids in understanding. So this is in my mind a completely different reason for why we should write more journal papers. It might make our algorithms simpler !!

p.s Yes, I am aware that very often the hard part of a paper is PROVING that an algorithm is correct, and not the algorithm itself. Obviously this would not apply to those cases. But even there, having to contemplate the monstrosity of the proof you end up writing would motivate you to simplify it. After all, proofs are about understanding, not fireworks and fountains at the Bellagio.

FOCS Day 1: Clustering

A long time ago I realized that if I took blogging duties too seriously, I'd be exhausted trying to report on talks that I attended. So I'm not going to even try. I will talk about a pair of talks on clustering though.

A core question of clustering is this: learn a mixture of $k$ $d$-dimensional Gaussians with different means and covariance matrices from samples drawn from the mixture.A long line of research, starting with a paper by Sanjoy Dasgupta back in 1999, produced variations on the following result:

If the Gaussians are "well separated" and have "nice" shapes, then a "small number" of samples suffices to determine the parameters of the system.

Here, "parameters" would be means, covariance matrix elements, and relative weights. As time went on, the separations got smaller, the niceness reduced, and the "small number" got better.

These two papers take a slightly more general tack, in different ways. They both eliminate the conditional nature of prior bounds ("if the Gaussians are well separated"), instead replacing this by an unconditional bound ("poly in a parameter capturing the separation").

The first, by Moitra and Valiant (both Ph.D students!) measures the separation between the distributions via their statistical distance, and gives a bound for learning in terms of the reciprocal of this distance. Their main insight is to project the distributions onto a few directions (i.e 1D), learn what they can, and then recurse on the data that doesn't appear to be well-separated enough to learn properly. This takes a lot of technical skill to do correctly.

The second paper, by Belkin and Sinha, takes a different approach. They pay more attention to the parameters of the distribution. This can be a problem in general because closeness in distribution does not imply closeness in parameter space. However, they show that the set of all distributions with fixed moments forms a nice algebraic variety, and that the above problem can be quantified by determining how closely this variety folds in on itself (intuitively, if the variety intersects itself, then two far away parameter sets yield the same distribution). Using some standard (but not trivial) results in algebraic geometry, they can bound the complexity of the resulting structures (so that they can estimate the number of samples needed to get to a distiinct region of the space). There's more detail involved in the estimation of parameters and moments and so on.

While both papers are impressive achievements that essentially resolve the theoretical aspects of learning mixtures of distributions, I came away not entirely sure how to compare them. On the one hand, the Moitra/Valiant paper is more explicit and is likely to be less "galactic" in nature. But it appears to be limited to Gaussians. The Sinha/Belkin paper is more general, but uses more abstract machinery that is unlikely to be useful in practice.

So what remains is stilll an effective way of learning Gaussians or other families of distributions, and also a better understanding of how these bounds relate to each other.

Sunday, October 24, 2010

FOCS Day 0

I'm in Las Vegas for FOCS 2010. It's been a while since I've attended FOCS, and the short distance from Salt Lake was definitely a factor.

Kudos to the organizing committee for the tutorial sessions today. I've long whined about the lack of non-paper-presenting events at theory conferences, and the topics for today were great ones. Unfortunately, a spectacular debacle on the part of the hotel restaurant meant that I (and 11 other poor souls) missed most of Mihai's tutorial on data structures.

But the first tutorial by Ketan Mulmuley was fantastic. Let's be clear. It's impossible to convey any depth of material on GCT in 1.5 hours, but he hit (to me) just the right level of high level exposition and low-level technical conjectures. I also learnt something brand new from the presentation, which I'll try to summarize.

In Godel's incompleteness theorem, we all know that the key step is phrasing a statement of the form "I am not true". The self-referentiality of this statement is what drives the paradox within the logical system that allows you to frame it.

The part that's important to remember is that the logical system has to be powerful enough in the first place, so that it's possible to even do self-referencing. A weak logical system can't do that, which is why Godel's theorem only kicks in once the system is strong enough (a very Greek tragedy concept this, that it is your own power that sows the seeds to destroy you).

It turns out that a similar mechanism comes into play with P vs NP. Mulmuley starts with a "trivial proof of hardness" for a function f. Write down all circuits of small complexity, and for each circuit write down some input for which C(x) != f(x). This table is of course of exponential size, but never mind that.

He now introduces the flip operation. This is the idea that instead of trying to manipulate this large table, we actually store a smaller table of certificates, with the property that from one such certificate, you can reconstruct efficiently a set of circuit and witness pairs from the table, and the set of such certificates cover the entire table. You also need to able to decode these certificates efficiently and quickly verify that an input produced in this manner actually does demonstrate a counter example.

There's a problem with the flip operation though. We're using this flip structure to show that P != NP (nonuniformly for now). On the other hand, the success of the flip operation requires us to construct a mapping from the certificates to the table that can also be verified in polynomial time. But since f is an NP-complete problem, what we are essentially saying is that we can efficiently construct short witnesses of membership or non-membership, which means that we've collapsed P and NP !

You could of course retort by saying, "well I don't want to use the flip operation then !". It turns out that you can't ! He states a result saying that ANY proof separating P and NP will essentially look like a flip-based proof. So no matter what you do, you have to deal with the self-contradicting structure of the flip. The geometric construction (GCT) is his program for constructing the flip so that this cycle of referencing can be broken.

But that's not where the analogy to Godel's theorem comes from. When we start digging into separating the permanent from the determinant, we can translate the problem into an embedding problem over algebraic varieties. This is because both the permanent and determinant have unique characterizations as the only polynomials satisfying certain properties over matrices.

Now the separation problem can be framed as the problem of finding an efficient obstruction that demonstrates the separation between these two varieties. What's even more bizarre is that ANY proof that separates the permanent and the determinant will end up producing such an algebraic-geometric obstruction, even if the proof itself doesn't use algebraic geometry.

So let's review. Any proof of P vs NP will induce a "flip" construction that generates small certificates that explicitly show hardness of a problem. The permanent is one such hard problem, and lends itself to an invariant representation that yields a key obstacle that we need to construct in polynomial time in order to make the flip work.

In other words, we need to be able to construct an object in polynomial time (that to date we don't know how to do in less than triply exponential time) in order to show that these very same poly-time computations CANNOT solve the permanent.

And this is the core paradox at the heart of the program. Our inability to prove that P != NP is because we can't show that P is strong enough to express within itself the certificate that it is NOT strong enough to capture NP.

Mulmuley makes another very interesting point at the end of the tutorial. Many people have wondered why the GCT framework can't be used to prove "easier" bounds for other separations. While he mentions some examples of problems (matrix multiplication) where people are attempting to use the GCT framework to show poly lower bounds, he also points out that it is only because of the beautiful invariant representations of the permanent and determinant that we can bring the machinery of algebraic geometry to bear on the problem. As he says, we got lucky.

Wednesday, October 20, 2010

On outreach to applied communities

There's an argument playing out on cstheory (yes, we need a better name) on whether we should encourage or discourage questions like "how can I model X phenomenon theoretically", or "Are applied questions off-topic"

While Scott Aaronson posted an eloquent defense of why we should always entertain such questions, the general tone of the responses has been negative. The main fear has been a 'slippery slope' - that if we allow one modelling question, then the site would be overrun by modelling questions.

I think this sentiment is particularly interesting in the context of David Johnson's post requesting examples where approximation algorithms were deployed usefully in practice (and that he couldn't come up with many good examples)

The truth is that outreach to more applied communities is not about taking your algorithms hammer and beating their heuristic over the head with it. As I've said before, you have to have a DRAMATIC improvement in performance before people will change their standard practices.

It's firstly about communicating a style of asking and answering questions, even if the actual answers are "trivial". A collaboration where you persuade someone to use max flows might not be theoretically very splashy, but it has a huge impact in how people think about algorithms, theory and problem solving.

Doing this first prepares the ground for future collaborations. If you want someone to use your new and snazzy approximation for vertex cover, you have to first convince them that vertex cover makes sense, and before that you have to convince them that even modelling the problem as a graph makes sense. Believe me when I say that the very notion of defining a problem formally before solving it is not how work gets done in many applied areas.

Michael Mitzenmacher makes a similar point in a comment to the above post, where he says:

I do think, however, that sometimes the focus of theorists -- to get the best constant-approximation or to shave log or log log factors off -- obscures rather than highlights a good part of what I see as the real "practical value" of this line of research. Focusing on what algorithmic ideas underlie the approximation, and more implementation and experimentation, would (I think) give more real-world power to this theoretical work.

Tuesday, October 19, 2010

Practical Applications of Approximation Algorithms

(Guest Post by David Johnson)
In my Knuth Prize Lecture on approximation algorithms at the last STOC conference, I celebrated the many great theoretical advances we have seen in the area of approximation algorithms over the last four decades, but lamented the apparent paucity of practical applications of this work. We motivate our work in this area as a way of coping with NP-hardness, but how many people who actually have to do such coping in practice are
using our ideas?

In my talk I could only come up with a few examples, these from our work at AT&T Labs - Research, where we have adapted the Goeman-Williamson primal-dual approximation algorithm for the Prize Collecting Steiner Tree problem for use in tools that help design access networks, and where exponential potential function methods for approximating multicommodity flows are being exploited in a proposed content-distribution scheme. I also noted that of the 14 Kanellakis "Theory and Practice Awards" given out so far, of which some 6 or 7 have been for algorithms, only one, that to Freund and Schapire for their statistical "boosting" technique, could be viewed as related to approximation algorithms, and the boosting research is typically classified instead as Learning Theory.

So, I think many viewed my talk as a rather negative exercise. Certainly, practical impact is only one dimension along which to evaluate a field, and, if we relied only on practical motivations, we would not have developed the deep and informative theory we now have. However, practicality IS an important dimension, and I hope that my inability to identify many examples of practical impact was more a fault of my own ignorance, than one of the field. Giving a talk about this to a primarily theoretical audience did not develop many new leads, so I'm hoping that a blog posting will be more successful.

In particular, I would like to collect

(A) Stories about algorithms or algorithmic approaches developed by approximation algorithm theorists that were implemented and applied, either directly or in modified form, by people to solve real-world problems.

(B) Stories about the analysis of algorithms developed by others (for example, the greedy algorithm for SET COVER) that influenced the use of these algorithms in practice, or at least helped explain their performance and hence was cited in "Applications" papers.

The algorithms need not be restricted to NP-hard problems -- I am also interested in approximation algorithms for problems with polynomial-time optimization algorithms that can be too slow for some practical applications.

For both types of stories, I'd like as much detail as possible - who did the work, where, etc., and citations if known. You can post your nominations/comments on this blog, or send them directly to me (dsj@research.att.com). If you post as a comment, please do not do so anonymously, as I may want to contact you for more information.

Assuming I get a good response, I plan to prepare a survey paper (or at least another blog posting) summarizing what I learn.

Thanks! David Johnson (dsj@research.att.com)

SoCG deadline, and a new NSF rule

SoCG CFP is out:

Titles/abstracts: Nov 22
Papers: Dec 1
Notification: Feb 14
June 13-15, conference.

As Sariel noted, the CFP makes an explicit request for more diverse submissions and plans to accept more papers:

Anticipating the usual high overall quality of submissions, the program
committee intends to accept more papers than was typical in previous years. Also, we intend to interpret the scope of the conference broadly, and will accept all papers of high quality that are of significant interest to our research community. The conference venue in Paris is attractive and allows a bigger conference than previous years.

My optimistic side is excited by this, and I really hope we can see more papers on diverse topics. My pessimistic side is grumping "yes, that means any papers that aren't mine". sigh...

In unrelated news, the NSF appears to have a new policy for 'data management' and will require a new supplementary document to this effect. Note that since it's a mandatory document, it will affect theory proposals as well:

Plans for data management and sharing of the products of research. Proposals must include a supplementary document of no more than two pages labeled “Data Management Plan.” This supplement should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results (see AAG Chapter VI.D.4), and may include:

* The types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project

* The standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies)

* Policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements

* Policies and provisions for re-use, re-distribution, and the production of derivatives

* Plans for archiving data, samples, and other research products, and for preservation of access to them

Tuesday, October 05, 2010

Absolutely terrible news: Partha Niyogi passes away

(Via Lance) Partha Niyogi just passed away.

I first met Partha at SoCG 2007 in Gyeongju, where he gave an invited presentation on geometric perspectives in machine learning. We also spent a very pleasant day playing hookey from the conference, wandering around the nearby tourist spots, and talking about geometry and machine learning. He was a brilliant researcher with an impatience and drive that led him to do some truly excellent work.

Anyone who's ever dabbled in machine learning knows about the strong connections between ML and geometry, dating back to the very notion of VC dimension. But while the connections between the areas are manifest and plentiful, there haven't been many people who could successfully navigate both realms. Partha was one of those unusual people, who coupled deep mathematical insight with a strong vision for problems in ML. Most recently, he organized a wonderful summer school in machine learning and geometry, whose syllabus is worth perusing to get a sense of the depth of this interaction.

In many circles, he's most known for his work in manifold learning, where in collaboration with Misha Belkin, Steve Smale and others, he laid out a series of rigorous results on the problem of how to "learn" a low dimensional manifold given access to samples of points from it. The work on Laplacian Eigenmaps is a key marker in this area, and has had tremendous influence in the realm of "non-Euclidean" data analysis. Among his other recent contributions in theory-land are a neat result from FOCS 2006 with Belkin and Narayanan on sampling the surface of a convex polytope using heat flow and the Laplace operator.

It's a great loss.

Monday, October 04, 2010

We're hiring !

The School of Computing is looking for three ! count 'em ! THREE ! tenure-track faculty this year, in a broad search with interests in:

algorithms, concurrent software, data bases, data mining, formal verification, large data analysis, machine learning, performance verification, parallel systems, robotics, and theory.

We're looking for interdisciplinary folks with interests in possily more than one of the above, and there are nominally three "areas of interest" that we are looking in: parallel computing, robotics and big data.

For theory folks, probably the most relevant of the three areas of interest is the big data search, especially if you're interested in data mining/machine learning/statistics and other forms of data analysis. But if you have strengths in any of the other two areas, then make sure to highlight that.

Email me if you have any questions about this. I'm very excited about growing our large-data program by getting strong algorithms folks. rec

Monday, September 27, 2010

FOCS travel awards and registration deadline: Apply SOON !

Paul Beame reminds me that the FOCS early registration deadline and hotle registration deadline are coming up soon (this Thursday). Don't forget that FOCS has three tutorials on Saturday before the conference starts, by Ketan Mulmuley, Mihai Patrascu, and Tim Roughgarden (which will be on geometric complexity theory, algorithmic game theory, and how to annoy everyone with your blog - sorry Mihai, I'm joking :))

Travel awards are also still available for the conference: Quoting Paul (emphasis mine):

There is still a bunch of NSF travel support money available for students/postdocs because we can only fund one person per US institution. All the award decisions have been made at institutions where someone applied by Sept 23rd but we are still open to new applications for travel support (not including registration) from other institutions. There is no hard deadline but people should please apply ASAP (ideally by the end of this week) because we will make final decisions on awards soon. Follow the Travel Support link at: http://www.egr.unlv.edu/~larmore/FOCS/focs2010/

p.s cstheory Q&A has been sucking up all the time that I should have been spending blogging ... ahem.. doing research. We're at 1500+ users and counting, so come on in and join Lance's students and all the others :).

Thursday, September 02, 2010

Geometry @ Barriers

Bill Gasarch's summary of the Barriers II workshop omitted discussion of the geometry session, and Piotr Indyk kindly agreed to write a brief summary for your entertainment. Here it is, lightly edited and with links added. If there are talks I'm missing links for, do let me know and I'll add them in.

The "(Geo)metric Algorithms" workshop took place on Monday, August 30, at Princeton. It was the last day of the Barriers II Workshop, dedicated to geometric and metric algorithms.

Sariel Har-Peled was the opening act. He spoke about various structures in geometry that approximate properties of large point-sets, such as epsilon-nets, epsilon-approximations and core-sets. He started by stating that finding a needle in a haystack is a problem with many solutions, e.g., one can burn it down and sift through the ashes. He did not implement any of the solutions though (and the Princeton Fire Department was very grateful for that). He finished by asking for which problems we can (efficiently) find short certificates of near-optimality.

Alex Andoni spoke next, about high-dimensional nearest neighbors. He gave an overview of the state of the art, for Euclidean and other spaces. He asked whether we can construct a partitioning of R^t into "round" cells that supports point location in poly(t) time. Such structure could make the recent theoretically-optimal Locality-Sensitive Hashing scheme practical. He was also wondering whether we can find better embeddings of various metrics (edit distance, Earth-Mover Distance) into various simpler spaces (L1 or product spaces).

Jeff Erickson was the last one before the lunch. He spoke about computational topology, focusing mostly on algorithms for bounded genus graphs. He covered shortest paths, maxflow and mincut, charming the audience with exquisitely drawn multi-dimensional homology diagrams, and a mesh dragon (ed: what ? no bunnies?). He presented an algorithm for max flow with running time of the form (quote) "n UglyPoly(g,log n, log C)" where g is the genus, C is the capacity bound and n is the number of points. He was wondering whether there was any way to make the bound less ugly, perhaps by making the algorithm combinatorial.

After lunch, Anupam Gupta spoke about doubling metrics and their algorithmic applications. Surprisingly many geometric algorithms can be generalized to work for arbitrary such metrics. In his talk, Anupam covered labeling schemes and optimization problems. He finished by asking whether there is a PTAS for TSP in doubling metrics, generalizing the result by Arora-Mitchell for the Euclidean spaces.

Finally, James Lee talked about the sparsest cut problem and its generalizations. Progress on this problem has often led to many fascinating developments in mathematics and TCS. The problem is closely related to embeddings of certain classes of metric spaces into L1. The main question now is to determine whether the best known approximation/distortion bounds for these problems ($\sqrt{\log n}$) are optimal.

All in all it was a fun event (although I helped organizing it, so what do I know). The videos of the talks will be available shortly.

No haystacks were harmed while filming this workshop.

Wednesday, August 25, 2010

Latest news from the unnamed-but-working-on-it theoryCS Q&A site

4 days in (public) beta, and the theoryCS Q&A site (or as someone described it, the 21st century Usenet) is chugging along. There are nearly 600 users now, with nearly 300 "active" users - people who've posted, asked a question, commented, etc. Since there isn't a way to broadcast to the user community directly, I'm taking the liberty of doing it here.

If you're enjoying using the site, please visit the meta site (click the small 'meta' on the top banner). If you don't like how the site is working, please visit the meta site. In both cases, you'll find important discussions about how the site is working and should work, and your input will shape how it evolves. For example,

What's the appropriate level for questions on the site ?
How should questions be tagged (for easy searching and indexing)
Who should be moderating ?
Which questions should be closed (deactivated) if any ?

and almost as important:

What should the site name be ?

Another reminder: please use your votes liberally. In fact, if you think a question is worthy of being read, it probably deserves an upvote (and the same for an answer). Frequent voting makes the whole system run smoothly.

And now, back to your regularly scheduled theoryCS programming: some interesting recent questions for your entertainment (please comment at the question links, rather than here):

Are there good constructions of space-efficient expanders ?
Can you design a heap with a stable ExtractMin operation (without using time-tags)?
When is polynomial testing equivalent to polynomial evaluation ?

Monday, August 23, 2010

cstheory Q&A site now open to the public !

The theoryCS Q&A site, in private beta the last seven days, is now open to the public. Please visit, ask questions, answer them, and hopefully we can make this a valuable resource for the community. There are currently over 100 questions, 300 answer, and 300+ users ! Some of the questions that I thought were quite interesting:

Analogs of compressed sensing
The importance of Integrality Gap
Surprising Results in Complexity
How hard is unshuffling a string (now with a bounty!)

If you have questions about the site and the protocols for questions, please visit the meta site. There, a bunch of users are trying to flesh out what will be the policies, as well as elect temporary moderators, and so on.

Monday, August 16, 2010

cstheory q&a site now open !

After a long wait, the theoryCS q&a site is now open. It's in private beta for the next 7 days, so if you didn't commit already, you'll have to wait till next Monday.

Please visit the site, ask your questions, and answer other questions. Some points to note from experience with Mathoverflow:

If you're so inclined, please participate in the meta site as well. That's the place for discussions about the site itself, and is a good place to get involved and help shape the community.
Please vote liberally and often. Especially in the beginning, voting questions and answers up when appropriate helps encourage others to stick around and participate more. If a question seems interesting enough to read, it deserves an upvote !

Saturday, August 14, 2010

Social theoryCS...now 98% better...

Whatever you may think of the P vs NP hullabaloo over the past week, it has to be said that as a social experiment in peer review, this was an exhilarating and magnificent process to watch. There's enough chatter in the media about how this might point the way forward for peer review, and I'm skeptical of drawing too many conclusions from this one experiment with arguably the most important problem in our field.

But I'm encouraged by both the level of participation, and the enthusiasm it has generated about the field. I can't troll through the over 800 comments to find them, but I read numerous comments about how watching this process unfold made the commenter ever more excited about getting into theoryCS. Bob Sloan made a brief cameo to point out the value of this discussion as an outreach tool.

This is a long way of coming around to the news that the theoryCS Q&A site is now at 98% commitment, which means that some time in the next few days it'll go into beta. The beta stage starts with a 'private' seven day period, in which only those who committed (and therefore have promised to answer/ask a total of 10 questions over the entire beta duration of 90 days) get to use the site. After that seven day period, the doors will open for any interested participants.

The initial seven day period will have strong administrative elements (seeding the site with questions, discussing meta issues about the scope of questions etc), and while no major issues will be resolved, you can have some influence about how the forum gets shaped if you get in early. So if you haven't committed yet, do so, otherwise you'll have to wait the 7 days to get in.

Wednesday, August 11, 2010

P vs NP: What I've learnt so far...

(ed: note. it is somewhat embarrassing to be categorized as someone "trying to unravel what is up with the claimed proof". All I've really helped with is getting the wiki set up, and I'm very impressed with the amount of work that everyone is putting into keeping it current)

Whether or not Deolalikar's proof turns out to work, I have to say that in a span of three days, I've learnt a heck of a lot about the individual components in his paper.

Like everyone else, I learnt my descriptive complexity back in the day, and marvelled at the fact that P, NP and PSPACE could be characterized syntactically. The discussions about the subtleties of the order relation in LFP have been fascinating though, and while the issue remains unresolved, there's a wealth of new(er) material that I've now been exposed to.
Similarly, while I was familiar with the idea that hardness could be captured as a phase transition in random k-SAT, the bizarreness of the landscape (clustering! freezing! condensation!) was quite new to me, and it's been illuminating to learn from the experts.
The relationship between the solution space for k-SAT and error-correcting codes is also something that I was not aware of. And it might be the source of yet another issue with the paper.
A point made by Terry Tao is even more intriguing:

If nothing else, this whole experience has highlighted a “philosophical” barrier to P != NP which is distinct from the three “hard” barriers of relativisation, natural proofs, and algebraisation, namely the difficulty in using average-case (or “global”) behaviour to separate worst-case complexity, due to the existence of basic problems (e.g. k-SAT and k-XORSAT) which are expected to have similar average case behaviour in many ways, but completely different worst case behaviour. (I guess this difficulty was well known to the experts, but it is probably good to make it more explicit.)

There's been a bit of back-and-forth on whether "too much time" is being spent on perusing this paper, and of course people are entitled to their own opinions on this matter.

My take is slightly different. While none of the "revelations" listed above are new to people in the relevant areas, they're certainly new to me, and given how unlikely it is that I'd be able to pigeon-hole the relevant experts one-on-one to explain these ideas to me, the discussion is serving an important purpose. In effect, it has created a virtual workshop with even more discussion than a real one !

We've spent a lot of time over the past few years bemoaning the fragmentation of theoretical computer science, and have spent many blog-hours arguing over how to make STOC a destination conference for theoreticians again. But what the discussion over this paper has done is bring together experts in very different areas to talk with each other and exchange ideas, results and the crucial "mental frameworks" that are usually impossible to share except in oral discussions.

And that, in my view, is a net plus regardless of what happens with the proof.

Tuesday, August 10, 2010

A 'polymath' home for analysis of the Deolalikar proof

In collaboration with the Polymath project (and many thanks to Terry Tao for making this happen, and putting in lots of editing work in short order), there's now a new wiki to collect various threads of the discussion on the Deolalikar P != NP paper (which has been updated). With a lot of help from Terry, I've seeded the wiki with the initial discussion points that have come up. Please continue to edit this page as new ideas come up, or there's other discussion of the paper.

This wiki replaces the Google doc that I had originally set up: it's a lot easier to edit, and Google was struggling under the weight of all the people trying to view the file.

The Lipton/Regan posts will continue to host the main research threads. If you are so inclined, you can use this post for "meta"-discussions as is customary in polymath land. Content will be transferred over to the wiki periodically, and more discussion threads can be forked if things start getting unwieldy (the first Lipton post is pushing 117 comments already).

Monday, August 09, 2010

On the Deolalikar proof: Crowdsourcing the discussion ?

Update: the google doc has been replaced by a wiki. Please read this post.

By now everyone knows about the new proposed proof that P $\ne$ NP by Vinay Deolalikar. Better minds than mine are on the case right now, and little snippets of questions are coming out in comments (especially on Dick Lipton's original post).

It's a little hard to keep track of all the discussions, so I thought I'd try a crowd-sourcing experiment. At this link is a google doc that contains some of the initial discussion topics being raised around the proof. It's crudely organized, with a section for each major set of issues, and paragraphs for each person's comment, with sourcing via URL.

It's editable, so here's the idea. As you see issues get raised and resolved, enter that into the doc. The following rules apply:

No discussions: this should be comment/resolution only !
No personal, or nontechnical commentary. This is purely for discussion of the mathematics involved, and nothing else
Add references (urls, or whatever) whenever possible, and quote all context so as not to cause miscommunication.

The goal is to collect the discussions in one place. Ideally, this would be some kind of polymath setup, and maybe it can be converted over to that (if someone does set it up that way, I'll add a link here).

Let's see if this works !

Thursday, July 08, 2010

TheoryOverflow: An Update

Now that I'm done with SODA (any word on number of actual submissions?), I thought I'd do a brief update on the proposed theoryCS Q&A site (currently laboring under the inelegant name of TheoryOverflow). I've had various questions from people, and have wondered about others. So rather than force you all to trawl meta.stackoverflow.com for answers, I thought I'd dish them out here.

I want my TO site ! When is it coming ?

The site is going through what is called a 'commit' stage. The StackExchange overlords (the people providing the service) have in their infinite wisdom decided that the success of such sites is predicated on people committing (with their names and email) to participating, and so we're slowly building up enough commit mojo to get into a trial stage
But we have more commits than the average attendance at STOC !
Ok. firstly, that's rude ! Secondly, there's a complicated formula by which the system determines whether we have enough commits to move into a beta stage. it's also being throttled right now at 90% of its true value because the software for making the sites is still being tested on the few sites that are already in beta.
At this rate (248 commits, 27% progress), we'd need 1000 commits to get to beta. That's larger than the attendance at STOC, FOCS and SODA put together !

Say after me three times: theory conferences are all fine, and we don't have to do anything to change anything about them. Feeling better now ? So the process by which commit progress is calculated is very complicated, involving your reputation on related sites, the number of "badges" you've earned, how friendly you are with Lady Gaga, and other such mysterious parameters. So while there's some correlation with the number of commits, high-reputation users on related sites can bump up the progress meter, under the rationale that people used to Q&A sites are more likely to help this site get off the ground.
But my mathoverflow rep doesn't appear to count ?
No it doesn't. there appear to be technical complications involved in this, because MO used an older version of the software. I personally question this approach to the commit phase, but it's not my software.
This is silly: why can't we all just go over to mathoverflow ?

This is a tricky point. The MO folks have been more than welcoming to theoryCS, and they've explicitly said that they'd be happy to have us over at their site. While I personally would not mind at all if we just started using MO more regularly, I feel that we might get drowned out in the much larger universe of MO, and if we can get our own site, it might be easier to navigate and find interesting questions/answers. But it's always an option.
We're computer scientists, aren't we ? why can't we roll our own ?

Um, well, ... we're THEORETICAL computer scientists, so theoretically we could. Oh wait, you actually want us to do it now ? how annoyingly practical !

More seriously, there is actually an open source system called OSQA that's powering the new machine learning Q&A site called metaoptimize. I didn't hear about this till a few days ago, but it's definitely something that we can consider if the theoryoverflow site appears to stall.
So what now ?

For now, we (that means YOU!) continue to drum up support and commits for theoryoverflow. In terms of average number of commits/day, we're still doing quite well, and the StackExchange folks have indicated that they're likely to change the formula for commit progress based on how the sites already in beta are doing. If a few more weeks go by and things look stalled, we can revisit the matter.

Saturday, July 03, 2010

Tools every modern conference needs

(while I procrastinate on my SODA edits)

Crowdvine: an all-encompassing social network solution for conferences that includes schedule planning, networking, activity monitors (who's coming to my talk) etc
A paper discussion site (could be within crowdvine, or even something like Mark Reid's ICML discussion site)
A good paper submission/review manager like HotCRP (not CMT !)
Videolectures.net
Facebook/twitter/blog for official announcements.
At the very least, a website with all the papers for download. If necessary,zip or torrent links for downloading proceedings (especially if videos are involved).

And no, we did none of these for SoCG 2010. But a guy can dream, no ? While I'd be shocked to see crowdvine at any theory conference (price, culture, you name it), I think videolectures.net would be a valuable resource, and many of the rest require time but are otherwise free.

Monday, June 28, 2010

TheoryCS discussion site: A call for help.

Many of you are familiar with Mathoverflow.net (MO), the discussion site where (mostly) professional mathematicians come to discuss research problems, get questions answered by experts in the area, and have very enlightening discussions on a host of topics in mathematics.

The conversation is at a very high level. The moderators (and the community at large) do an excellent job of pruning out the posters fishing for homework answers, the vague questions with answers that any google search could return, and off topic questions that are better served by a more amateur-centric forum. What remains are questions that are at least at the advanced graduate student level, and that frequently spur open-ended discussion among the advanced researchers in the area.

It's a dream site: it has much less clutter than sci.math (or comp.theory for that matter), and because of modern tagging and filtering technology, is a lot easier to navigate than the old Usenet.

While theoryCS questions are welcome at MO, they sometimes get neglected for a lack of critical mass in the audience. Many in the theoryCS community participate in MO, but it would really be ideal to have our own discussion site that's structured along the same lines with the same intent:

to be a site for professional researchers to come together and ask/answer questions about research.

If you're a student, this could be an invaluable resource to learn about different aspects of TCS, and get to know researchers in the area. If you're an active researcher, this is a great place to post queries about topics that you might not be familiar with but need in your own work.

We're in the process of creating such a site, courtesy of Anand Kulkarni (aka polybot). Stack Exchange is the software system and the organization that maintains discussion sites like this, and has built a system to create, discuss and vet sites to ensure that they have enough critical mass to sustain themselves.

The theoretical computer science site is here. The way it works is as follows:

We start with a 'define' phase with prototypical questions that are right and not-right for the site. This has been completed, with a sufficient number of questions that people have voted for.
We move on to the 'commit' phase, where we are now. Here, we need commitments from people (with their actual names attached) that they will participate in the site once it goes into beta testing. We have a number of commitments so far, but we need much more in order to move to phase 3, which is
Beta testing, where the site goes active and we see if we can sustain it with questions and discussions for a while. If so,
The site gets created.

This is a great opportunity to create a site that can truly be our own, and that would be a go-to location for all aspects of theoretical computer science, whether it be algorithms, complexity theory, quantum computing, game theory, programming language theory, logic, or whatever. Tagging and filtering means you don't have to get swamped in a sea of posts on topics you don't care about, and if you've used MO, you know how easy it is to have the site only display topics that are relevant to you.

What should you do ? Go to this site, and commit to participating in the beta. Committing costs nothing - it's literally one click after you authenticate. Then all you have to do is spread the word so we get enough people to move to the beta phase.

If you're skeptical about this (or maybe have wasted too much time on comp.theory in years gone by knocking down P vs NP cranks), do go to MO and see what a theoryCS site could become.

And if you're a theoryCS blogger with your own following, please spread the word ! you all know who you are :)

p.s there's some confusion about exactly how many people are needed to get to the next phase. The answer is that it's not based on numbers, but based on the reputation of the people committing (as measured by their activity on related sites - but not MO sadly). Most folks in our community are unlikely to have large reputation in the related (mostly programming) websites, so we'll need a good number of people (probably in the low 100s) to get there. Hence this appeal (note that if you commit and then use the 'share' link to get someone else to sign on, your "reputation" increases, and that improves the overall progress of the site in a cascading effect)

Metrics via pullbacks

One of the things I end up having to do a lot of is ponder distance metrics. Not the nicely behaved norm-induced ones, but more bizarre entities. metrics over shapes, metrics on lattices, metrics on distributions, and the like.

There are many constructions for building metric structures on spaces for which it's not clear how to do so. One of the neatest methods is via pullbacks, exploiting the algebraic and continuous duals for vector spaces.

The basic idea is as follows: You want to build a metric on some (usually ill-formed) vector space V. Fortunately for you, the space V* of linear functionals over V is better behaved. Even better, you can define a norm on V*. This allows you to do a little magic.

Define a function || || on V as ||x|| = sup f(v), over all f in V*, where ||f|| <= 1. This is of course the "dual" norm. It can be shown that it indeed satisfies the properties of a norm. Once you have a norm, then d(x,y) = ||x -y ||. Voila !

These types of constructions are particularly useful when dealing with distributions (the Schwarz kind) and their geometric generalizations, the currents (which are a measure-theoretic way of defining surfaces). Distributions can be nasty - you can only interact with them via their linear functionals (the space of smooth functions with compact support). But this construction allows you to put nice metric structures on them.

Some examples of metrics arising in this manner:

The l_1 distance between probability measures (or the total variation distance)
The earthmover distance between probability measures (this is quite nifty)
The current distance (between measures, or between currents).

Sunday, June 27, 2010

And for some Sunday entertainment

(dare I say XKCD-style) Flowcharts for the life of the tenured and untenured professor. A collaborative School of Computing effort between my colleagues John Regehr, Matthew Might and myself (who says professors can't collaborate inside a department!).

Incidentally, we also make up the vast majority of our department's blogging presence.

Friday, June 25, 2010

Bad Research As Spam

Jon Katz and Matt Welsh have both written recently about the problems of dealing with crap papers, mainly the pain of having to review them. In an unrelated event, I got into a discussion with some colleagues about the problems of "crap research', and ended up formulating a theory: viz,

bad research is like spam email

in the sense that

There seems to be no way to stop it, and many economic incentives to continue it
You can't stop it by diktat
The only way to deal with it appears to be effective filtering. Like spam, bad research has less of an effect when it gains no attention.

There are other similarities:

we can block spam by filtering certain domains. We also tend to ignore certain kinds of conferences
we can block spam by blocking certain email addresses. We also might ignore certain researchers, or at least downweight their work after a series of bad experiences.
More explicit spam blocking policies create a false-negative problem. False-negatives are also a big problem in research.

But this analogy also suggests that we shouldn't be designing strategies to eliminate bad research. We should be designing better methods to focus attention on good research, via better filtering and highlighting mechanisms (social, authoritative or otherwise).

Personally, I think bad research is less of a problem than lots of cargo-cult research, where it looks a lot like good research is being done, but when you look closely, you realize that nothing of value has been added to the universe. Sadly, a lot of funded research is also like this.

PS: False negatives are a highly underrated problem in research. I think we vastly overestimate our ability to judge what kinds of research will have lasting value and what potential impact a paper can have. So it's important to err on the side of being more generous to papers, rather than less.

Wednesday, June 23, 2010

The Future of STOC

Lance has now created a blog, 'The future of STOC', where he's been posting responses from people who were asked to comment on the original proposal for modifying STOC. The responses are almost unanimously negative and make for interesting reading.

My response is linked there as a PDF: I thought I'd place the main text here, just to solicit some comments. After writing this, I've done some digging for data (which will hopefully appear in a post shortly) that brings up an interesting angle to the 'accept more papers' discussion.

My response: This is a terrible way of solving the problem, because...

There are better solutions!

It is puzzling to me that we're jumping straight to an alien (to us) conference model, when there are proven hybrid conference models that exist within the larger (conference-driven) computer science community. ICALP (theory), SIGMOD, VLDB and ICDE (the top three database conferences), ICML and NIPS (the top machine learning conferences), KDD and SDM (the main data mining conferences), SOSP (the main OS conference), and (to a degree) SIGCOMM and INFOCOMM (networking) all have the following model:

A main research ``track'', with peer reviewed papers. Optionally, an ``industrial'' track for more applied work.
A set of workshops, for which topics are solicited via a call for proposals.
A set of tutorials, for which again topics are solicited via 5-page abstracts and reviewed. Panel discussions and demos (optionally)

The conference itself is a 5-day event, with research and industrial tracks, panels, and tutorials, all blended together (workshops either bracket the conference, or are interleaved). I've been to many of the conferences listed above, and I can assert that I've met many people not originally of the community that attend because of these satellite events.

Other variants include limiting the number of actual talks, while still including all accepted papers in the proceedings (the remainder of the papers are presented as posters). This opens up more time during the day for satellite events as well.

Note: I've begun noticing more tutorial events at STOC (STOC 2010 has a full day of these). This definitely is progress, although I believe that workshops draw more attendance. I also think it's important to solicit these from the community, rather than having the PC decide topics. Doing so both increases participation and increases the sense that the community is part of the whole conference.

Math meetings are fractured

I've never attended an AMS meeting myself, but from all accounts (and I have heard MANY), they are highly fractured. The meetings draw attendees from the spectrum of mathematical areas, but they all essentially group together in tiny miniconferences - I've heard numerous stories of multiple rooms in which the only people listening to a talk are the five other speakers and a few graduate students. This is not the way I want to see our flagship theory conference evolve.

People won't come

For better or for worse, we're used to the 'publish-attend-present' model of CS conferences, rather than the 'meet-greet-discuss' model of math/science conferences. I suspect that many people will not come to a STOC in which there is no proceedings and no real publication attached to the presentation (at least none that can go on a CV). There's nothing wrong with the model per se, but it's not something our community is comfortable with, and since other theory conferences won't go this route, I don't see how we'll ever get comfortable with it.

Bottom Line

I applaud the idea of shaking things up in regard to the format for STOC. I just feel very strongly that we should learn from other models that exist within the computer science community, rather than making a radical shift towards a
model that's part of a very different framework for how the publishing/dissemination process works. I am unconvinced that the proposed model would solve the problem of attendance, and it has a good chance of making STOC entirely irrelevant.

Social theory...

as in, doing theory socially. Three points of interest:

Luca Trevisan is soliciting ideas for building a schedule/recommendation engine for FOCS using collaborative filtering. He's on a short time frame, so you need to know what you're doing, but I dare say there's an excellent ALENEX submission waiting for anyone who has the time to work on this.
Anand Kulkarni is proposing a theory overflow site, much like Math Overflow (which many of you inhabit already). I've been relatively happy with MO, and they're quite friendly towards algorithms folks (although sometimes a little confused about the difference between theoryCS and programming). But I do often tire of wading through pages and pages of unrelated questions to get to interesting ones.

I don't know if there's enough global support for theory overflow, but I do know that MO has been a fantastic resource for research-level mathematics, and with enough participation, theory overflow could get there too. If you don't know what I'm talking about, go to mathoverflow.net. If you think that it's a waste of time, I'll mention that among the ACTIVE participants there are Terry Tao, Timothy Gowers, Richard Stanley and Bill Johnson (as in Johnson and Lindenstrauss)
Mark Reid has built a discussion site for ICML 2010 (ICML has been doing this for a few years now). Each paper at the conference gets a page, and anyone can post comments on the page. Authors can opt to get email whenever someone posts a comment, and can in this way interact with discussants. I wonder if something like this might soon become a de facto part of all conferences.

Tuesday, June 22, 2010

Case Study in Large Theory Conferences: ICALP

Luca Aceto had posted a comment on my last post about STOC, describing his experiences running ICALP as a large theory conference. I thought it was fascinating, and requested that he repost a longer version on his blog. That post is here, and I strongly encourage anyone who's been involved in this discussion to go and read it... now....

I wanted to highlight a few snippets from it that I feel reinforce a number of points that I've been arguing.

ICALP is a three-track conference, and has been so since 2005, though typically only two tracks are running in parallel at each point in time. At ICALP 2008, in addition we had 12 satellite events, including the DYNAMO training school for doctoral students

Note that ICALP had an attendance range of about 500 - where we'd like STOC to be. It fits in with the pattern I was describing: more satellite events appears to correlate with larger attendance.

As an aside, ICALP had Peter Winkler do a masterclass on math puzzles. Frankly, if we could just hire Peter Winkler and Persi Diaconis to do lectures at every STOC, our numbers would go into the stratosphere ;)

The workshops were held the day before ICALP or during the week-end following it. They were selected by the ICALP organizers amongst a fairly large number of proposals that we received in response to a call for workshops, based on their perceived scientific quality and on their potential interest to the ICALP community.

I've said this before, but I do think that if we go the route of having workshops/tutorials, the best way to do it is how other conferences do it: have a workshops chair solicit proposals, and decide from amongst them. The workshop organizers then take care of the work of getting speakers. It will ease the burden a lot.

I firmly believe that the task of organizing a conference like ICALP should be shared amongst several people. This certainly worked for us and helped us work more cheerfully, and overcome personal problems, mishaps and periods of crisis and panic that arose during the year before the conference took place

Very true. Again, most conferences that have lots of activities have a large organizing group, with proceedings chairs, arrangements chairs, workshop chairs, tutorial chairs, and so on. Apart from the fact that people's CVs get nicely bloated with service activities, more participation at the top level can actually help with overall attendance, as well as alleviating many of Michael's concerns (although technically he was more concerned with colocation, which is an idea I like, but does take more logistical coordination).

Monday, June 21, 2010

On acceptance rates and flagship conferences

There's been a lot of back and forth on ways of increasing attendance at STOC, and in our wonderful theory way, all of this has happened in a universe unencumbered by the presentation of actual data.

I thought I'd dig up statistics on what exactly goes on at a number of major conferences in different areas in computer science. My idea was to take some of the major areas in the field, identify their flagship conference (or one of two as the case may be), and compile statistics on acceptance rates, attendance, and general conference activities.

The areas I considered (with main conference in parentheses) were

databases (SIGMOD)
machine learning (ICML)
operating systems (SOSP)
networking (SIGCOMM)
architecture (ISCA)
graphics (SIGGRAPH)

and the results I got were interesting (all data that I compiled can be found in this spreadsheet: feel free to add other areas, or update numbers, as you see fit). Where I could, I tried to get a sense of attendance/acceptance rates from either asking people or looking at published numbers for recent years: the ACM DL has acceptance rates for many of the above. Information on conference activities were taken from the most recent year I could get data for (usually 2010 or 2009). The main points:

All of the listed conferences had attendance in the 500-600 range (except ISCA with average attendance of 400, and SIGGRAPH with 2000+ in the research side). So they are definitely conferences with attendance that STOC would like to mimic.
Acceptance rates varied, but most were below 20% (ICML being the exception at 25%). STOC is at 28% or so
Number of papers accepted varied widely (23 for SOSP, 150 for ICML). I find this particularly interesting: it would seem that attendance correlates more with the perception of being 'flagship' than the actual number of papers accepted.
Most conferences had long lists of colocated workshops. The smallest number was SIGCOMM last year with 5, and others had many more. STOC had none.
Tutorials were variable: some places had many (SIGGRAPH had 27) and some had none. STOC had 3.
With the exception of ISCA last year, all the conferences had significant poster sessions, either consisting of all papers accepted, or as a separate track with many posters. STOC had none.
The conferences all had other activities: demos, industrial tracks, works in progress or other such things (ISCA being the exception). STOC had none.
Durations varied between 4 and 6 days (including the initial day). Most had 5. STOC is 4.

To me, there are two things that stand out from this.

The number of papers accepted does not appear to make a difference to the attendance. SOSP happens once every two years, and accepts 23-25 papers, and gets 500 attendees !! ICML gets a similar number of attendees with 150 papers accepted each year.
There are a TON of activities at these conferences. Indeed, I think ICALP and ESA match them in terms of level of activity, but certainly not STOC. I've been a proponent of satellite events around a conference to increase attendance, and the STOC/EC/CCC colocation does seem to have helped. I'm also intrigued by the idea of colocating SoCG with STOC.

You may draw your own conclusions...

p.s for the legions of readers who will start protesting that these communities are much larger than the theory community, I will merely point out that almost no one in this discussion thinks that the theory community is 300 strong: the question is more about getting the rather large theory community to show up in force for STOC.

Update: Michael Mitzenmacher has a post up listing specific logistical issues that come up with expanding the set of activities at a conference. He points out that if we decide to go to multiple satellite events (whether as separate events or whatever), we'll have to adjust to a much greater degree of organizational commitment up front, as well no small amount of 'attitude adjustment'. For anyone who's ever attended a business meeting, this is a scary thought :)

Saturday, June 19, 2010

The Shape of Shape Analysis Research, Part III

(a brief series on shape analysis: for earlier episodes, click here)

Shape matching research in computational geometry is fundamentally distance-based. In other words, we start with a distance function, and then design algorithms to compute it, or minimize it under transformations, or approximate it, and so on.

There's an important problem with this point of view. While computing the distance between two shapes is an important tool in shape analysis, it's not the only problem. Other equally important problems include:

Finding a shape similar to a query shape
Matching pieces of shapes together
Organizing shapes into groups (i.e clustering)

And so the problem with the distance-based viewpoint is that all you get at the end is an abstract metric space. You can compute d(x,y) in an appropriate amount of time (maybe), but you lack all the additional structure needed to solve these other problems efficiently. With our modern knowledge of metric embeddings, it's always possible to ask if these distances can be embedded in a more tractable space, but it turns out for measures of interest (Hausdorff, Frechet, earthmover), this cannot be done without incurring huge errors.

The idea of shape spaces turns this process around. Rather than starting with the distance, and trying to find a space to embed it in, shape-space based methods start with a mapping that takes a shape to a single point in a (usually curved) space, and use an induced metric (usually some kind of geodesic) as the distance.

By at least one unsourced account, this view of shape dates back to Riemann, but the modern formulation of this approach started with David Kendall, in the 70s. His idea was extremely elegant.

Consider a collection of closed simply connected regions of the plane (the shapes), each shape described by k points on its boundary. Each of these points can be described by the two coordinates (x,y), which we will write as the complex number x+iy. By a shifting transformation, we can ensure that the centroid of each shape lies at the origin. This loses one (complex) degree of freedom, yielding a k-1 dimensional complex vector.

Next, consider what it means to rotate the shape around the origin. In the complex plane, this corresponds to multiplying by the complex number z = exp(i theta). Doing the appropriate projective transformation, this means that we can identify a shape with a single point in k-2 dimensional complex projective space.
The distance between two shapes is now defined as the geodesic distance between two points in this space.

There are a few important points to note here:

Each shape of k points is mapped to a single point in a k-2 dimensional space.
All shapes are assumed to have the same number of points, which correspond across shapes.
The space is constructed by quotienting the original representation (the k-dimensional complex vector) by the special orthogonal group.

This last point is particularly crucial: the invariance under transformations is folded directly into the representation, rather than being something to "solve" via minimization.

The general program outlined by Kendall (map shapes to points on a manifold quotiented by a suitable set of transformations) has led to many other constructions, among the more notable being Bookstein's shape space and the Michor-Mumford representation for planar closed curves invariant under diffeomorphisms (which bears a strong resemblance to a summed variant of the Frechet distance). These methods have (for reasons unknown to me) taken up residence primarily in the computer vision community.

A Critique.

There is much to like about the shape space approach to shape analysis. Fundamentally, by embedding shapes in a space with structure, it gives us both a distance measure and a geometry to play with, and this is invaluable. However, there are serious limitations to the ideas developed thus far.

Computation: It's all very well to come up with a mathematically elegant formulation of a distance as a geodesic, but it's a lot harder to actually compute these distances. In practice, researchers often resort to heuristics with no guarantees beyond local convergence. To me, this is like building a beautiful mansion in a pit of mud: it's hard to get in and out with a lot of dirt and pain.
Scalability: the mathematical complexity also makes it harder to do scalable computations on shapes.
Global vs local features: I'll have more to say about this later, but these approaches (generally speaking) construct a global signature for a shape, which limits one's ability to do partial matching.
Correspondences: The Kendall method at least requires explicit correspondences between points in each shape. Finding correspondences is one of the most annoying parts of shape analysis (and affect most methods for comparing shapes).

Next: We examine the problem of hearing shape, or how the Laplacian starts to figure in.

Pages