The Geomblog: 02/01/2009

Thursday, February 26, 2009

Fighting over shiny toys

A recent article in the Guardian extols the wonder of The Algorithm:

And since computers are increasingly dominant in our lives, algorithms are increasingly important - and nowhere is this more apparent than on the internet. In the online world, mathematical analysis isn't just important: the algorithm is king. Everywhere you turn online, companies are using algorithms in their quest for success. From Google's search results and Apple's music recommendations to Amazon telling you that "customers who bought this item also bought ... " algorithms are at work.

The article itself is pretty tame, a kind of knurd version of Bernard's article. What's funny is that the last line in the article, 'Mathematicians rule', got some people into a tizzy.

Two letters appeared in the Guardian following this article:

Bobbie Johnson (Go figure ..., 23 February) highlights the ever-increasing role the mathematics of algorithms plays in our daily lives, including Google's page ranking. "Mathematicians rule!" concludes Johnson. So a reader inspired by your article may seek to contact an expert in algorithms in the mathematics department of their local university. In this, the article will have misled them, as expertise in this area is to be found predominantly in departments of computer science and informatics.

and

Bobbie Johnson describes algorithms as "jealously guarded mathematical recipes that increasingly dictate how we lead our lives". What he's actually describing is operational research - the discipline of applying appropriate, often advanced, analytical methods to help make better decisions. Executives in every kind of organisation - from two-person start-ups to FTSE 100 leaders - are using OR to structure their problems, unlock the value of their data, model complex problems and make better decisions with less risk and better outcomes.

Of course, the first letter comes from the faculty of the CS department at Edinburgh, and the second from a member of the Operations Research Society. I wait for all the data mining and machine learning enthusiasts to start complaining next.

Monday, February 23, 2009

Wordles for STOC/FOCS/SODA/SoCG

A wordle is a visualization of the words in a text, organized to give more frequent words higher priority. It's a cute way of illustrating the repeated concepts in a text.
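To make the idea concrete, here is a minimal sketch of the counting step behind such a picture. It is not how the wordle tool itself works internally: the stop-word list, the function names, and the linear font-size scaling are all assumptions made for illustration, and the actual layout of words on the page is left out.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only; a real tool would use a longer one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def word_frequencies(titles):
    """Count how often each non-stop word appears across a list of titles."""
    counts = Counter()
    for title in titles:
        # Lowercase the title and keep runs of letters as words.
        for word in re.findall(r"[a-z]+", title.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts


def font_sizes(counts, min_size=10, max_size=72):
    """Map each word to a font size that grows linearly with its count (an assumed scaling)."""
    if not counts:
        return {}
    top = max(counts.values())
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


if __name__ == "__main__":
    # Made-up titles, not taken from any of the conferences above.
    titles = [
        "Approximation algorithms for clustering",
        "Fast algorithms for graph matching",
        "Lower bounds for approximation",
    ]
    sizes = font_sizes(word_frequencies(titles))
    # Print the words from largest to smallest.
    for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
        print(f"{word}: {size:.0f}pt")
```

Feeding in a conference's paper titles would give the relative weights; placing the words without overlap is the harder part and is not attempted here.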

My student Parasaran Raman made wordles for the paper titles for FOCS 2008 and STOC/SODA/SoCG 2009. You can click on each image to get a larger view. Draw your own conclusions :)

FOCS 2008:

STOC 2009:

SODA 2009:

SoCG 2009:

On the best use of stimulus research dollars

John Langford has a post up advocating that research dollars ought to go to industrial research labs. My first temptation was to write a counter-post advocating that research dollars ought to go to university researchers in less-than-well-known mountainous locations, especially pre-tenure refugees from industrial research labs hungry for the money and willing to do lots of work.

But more seriously, I think there's plenty wrong with John's argument. He argues that the Bell Labs model of the 1900s and earlier has the main components of a successful research agenda: access to cutting edge problems, free time for researchers, and concentration, and infers from this that the best use of government funding is to fund basic research at companies that have such labs (MS, ATT, Lucent, IBM, Yahoo, Google are cited).

Along the way, he throws out various statements that I have problems with:

Some research universities manage to achieve at least access and concentration to some extent, but hidden difficulties exist. For example, professors often don't work with other professors, because they are both too busy with students and they must make a case for tenure based on work which is unambiguously their own.

This is partly true: there's less collaboration in universities than in labs, but much of the collaboration in research labs is also by necessity: to get some things done takes a number of people, and you don't really have access to a ready supply of ~~slaves~~ students. Collaborative research is not by definition better.

at least research at national labs have had relatively little impact on newer fields such as computer science.

National labs play an important role in lots of large-scale visualization work (I know this because I'm at one of the best viz places in the country, and they have extensive collaborations with national labs). National labs compete like universities for research money, and often have the inside track on funding from places like the DoE. Their sweet spot is the kind of large-scale infrastructure work that's hard to do at universities or at industrial labs.

Some people might think that basic research done at a university is inherently more desirable than the same in industry. I don't see any reason for this. For example, it seems that patentable research is about as likely to be patented at a university as elsewhere, and hence equally restricted for public use over the duration of a patent. Other people might think that basic research only really happens at universities or national labs, but that simply doesn't agree with history.

This is a strawman: I don't know who 'some people' is, but I think any reasonable position would argue that the kind of research is different: someone once told me that the ideal industrial project involves 3-5 people: any larger or smaller is best done at a university. Whether the research is basic or not depends on the work: I don't know of many industrial labs that support research in complexity theory (except MSR) which is arguably basic research, but there's very fundamental research done at many labs in areas like auctions, ad modelling, large-scale computing, and so on.

But quibbling over statements aside, I think that the best use for stimulus money is neither universities (though I'm very grateful for the $3B) nor industrial labs. I think the best use is in places where it's already going: the so-called "shovel-ready" projects in green tech/renewable energy. The kind of funding that leads to direct economic impact is not going to come out of either universities (which naturally take a longer time line) or industrial research labs (which have lots of sloth and bureaucracy). It's going to come from VC-type funding for energetic startups that actually make things happen. Yes, there'll be research, but presumably the projects being funded will be well beyond the research stage and ready to make things happen now, or in the next few years. That is after all the point of the stimulus.

The budget will be coming out soon, and I hope that attention is paid to a longer-term reversal of the depredations in science funding in the US. But the stimulus package is about the present and the short term.

Other notes:
* In the comments, Hal argues that education is an important mandate of the NSF, and that's why resources get channelled to universities. I'd also add that I don't see why taxpayers should fund a corporation's bottom line: money for Yahoo helps Yahoo, whereas money to fund students helps generate more expertise.

John says: "In economic terms, these companies have for reasons of their own decided to provide a public good. As long as we are interested, as a nation, or as a civilization, in subsidizing this public good, it is desirable to do this as efficiently as possible." Permit me to snigger. Bell Labs had a monopoly on the telephone network for eons, and having a research arm was good PR for them. Once the monopoly collapsed, so did the dedication to provide a public good. Having worked at AT&T these many years, I am deeply grateful for the opportunities I had there, but there was a clear focus on research that helps the company bottom line. Even the much vaunted Google Research makes no secret of its focus on company-specific projects (the 20% rule implies an 80%!) (disclaimer: I occasionally consult for Google).

Post your comments here, or over at the original thread.

Wednesday, February 18, 2009

Theory "vs" Practice

Today's Wild Side science column from the NYT (guest blogged by Stephen Quake) is an interesting Rorschach test for which side you fall on in the 'theory-practice' divide. The article in fact argues a very valid point: that great research is often done by moving smoothly between theoretical study and practical applications, rather than privileging one over the other. Along the way, he cites Gauss, Kelvin, Archimedes and others as examples of people doing solid theoretical work inspired by, and inspiring, more practical considerations.

I mention the Rorschach test because (as with political commentary) one often tends to read bias or skew into neutral statements. For example, practitioners will find much to be happy about in this opening:

The snobbish idea that pure science is in some way superior to applied science dates to antiquity

and theoreticians will be consoled by:

The stereotyped view is that the applied scientists control the lion’s share of funding, while the basic scientists control the most prestigious journals and prizes.

but if you can get beyond your reflexes, it's a fair article about the need to think broadly about the "impact" of your work both theoretically and practically, and how this can lead to solid research on both counts.

Postdoc opportunity

Kirk Pruhs writes in with another postdoc position. There's no immediate deadline for applications, but the subject of the postdoc relates to the previously mentioned NSF workshop on power management, now (re)rescheduled for Apr 9-10:

I want to investigate algorithmic issues for optimization problems related to power management. [..]

But I am looking to broaden the range of power management problems that I work on. If you are at all interested, I encourage you to attend the NSF Workshop on the Science of Power Management that I am organizing in DC on April 9-10. The workshop participants will consist of leaders in the practice and science of power management, and the purpose of the workshop is to provoke discussion among experts, identify key research directions, and report key findings to NSF. I have funds to support travel to the workshop. [..]

The research will involve searching for algorithmically interesting problems in this area, and solving these problems. It is certainly not necessary for you [to] have any research experience related to power management. What is necessary is that you [are] the type of person that likes to expand their interests into new, exciting, areas of research.

Tuesday, February 17, 2009

Ketan Mulmuley at the Center for Intractability

The Center for Intractability recently hosted Ketan Mulmuley for a 3-part talk series on his attack on P vs NP via geometric complexity theory. The videos are now online here.

And let me just add here that I think it's fantastic that the center posts video for all the talks. It takes some work to get videos produced for web delivery, and it's so much nicer than reading a paper (or 10).

Maybe I need to reconsider this open access biz

I was lukewarm to Joachim's proposal for an open access journal, but I'm changing my mind after seeing the latest shenanigans being perpetrated by the scientific publishers in collusion with Congress. There's a new bill making its way through the House that would overturn the NIH open access policy (all papers should be placed on a public site within 12 months of publication), as well as prohibit the government from obtaining a license to post such works on the internet.

Time to call your congressman, or use the (in)famous Obama outreach program.

Sunday, February 15, 2009

A new open access CG journal

The last few years have seen many attempts by researchers to break free from the shackles of (commercial) journal publishers. There's the wholesale exodus that produced the ACM Transactions on Algorithms, as well as other journals, and the technology-aided creation of free open-access journals like the Theory of Computing. There's also a movement to create an open access computational linguistics journal, spearheaded by my colleague Hal Daume here at the U. of Utah.

Joachim Gudmundsson and Pat Morin have been investigating the feasibility of making such a journal for Comp. Geom, motivated by costs, and copyright issues with current journals. Here are posts one, two and three on the topic.

They've worked out most of the logistical issues involved in creating such a journal, and are now trying to reach out to the community to see what kind of interest there is. After all, the main currency of a journal is its reputation, and that comes from community participation (and then perception). So if you have any opinion on the matter, hop over to Dense Outliers, take the poll and post a comment (don't post comments here).

My personal view: I think open access is a great idea in principle, but I'm not seeing a pressing need within CG itself for such a journal at this point in time. (Disclaimer: I'm involved with the International Journal of Comp. Geom and Applications).

Sunday, February 08, 2009

Making sausage

No, not that kind.

By now, many of you have probably heard of the massively collaborative math experiment being conducted by Timothy Gowers on his blog (with offshoots on Terry Tao's blog). The idea is to mount a serious attack on a conjecture in combinatorics called the density Hales-Jewett conjecture (go to Gowers' site for more details).

Michael Nielsen points out something that I had been thinking about while perusing the initial thread: this is an EXCELLENT way to show students how research gets done. I remember a while back that Sean Carroll from Cosmic Variance had done a post explaining how one of his papers got written, but the post-facto description lacked the immediacy and the messiness of a usual research process, and probably (if only through faulty memory) even missed out on some of the paths not taken in the course of the research.

Seeing research done "live" as it were, by professional high-caliber mathematicians, is as exciting as watching live professional sports, and is even better in the sense that you see the false starts, the high-level strategizing and plans of attack, the multitude of possible ideas that get formed, and even the growth of more stable, promising lines of attack on related problems.

One of the things I'm pondering right now is the best way to show students how research is done, and this is a great example to illustrate the messy, convoluted, and yet highly sophisticated ways in which experts ply their trade.
