Thursday, February 26, 2009

Fighting over shiny toys

A recent article in the Guardian extols the wonder of The Algorithm:
And since computers are increasingly dominant in our lives, algorithms are increasingly important - and nowhere is this more apparent than on the internet. In the online world, mathematical analysis isn't just important: the algorithm is king. Everywhere you turn online, companies are using algorithms in their quest for success. From Google's search results and Apple's music recommendations to Amazon telling you that "customers who bought this item also bought ... " algorithms are at work.

The article itself is pretty tame, a kind of knurd version of Bernard's article. What's funny is that the last line of the article, 'Mathematicians rule', got some people into a tizzy.

Two letters appeared in the Guardian following this article:
Bobbie Johnson (Go figure ..., 23 February) highlights the ever-increasing role the mathematics of algorithms plays in our daily lives, including Google's page ranking. "Mathematicians rule!" concludes Johnson. So a reader inspired by your article may seek to contact an expert in algorithms in the mathematics department of their local university. In this, the article will have misled them, as expertise in this area is to be found predominantly in departments of computer science and informatics.

and
Bobbie Johnson describes algorithms as "jealously guarded mathematical recipes that increasingly dictate how we lead our lives". What he's actually describing is operational research - the discipline of applying appropriate, often advanced, analytical methods to help make better decisions. Executives in every kind of organisation - from two-person start-ups to FTSE 100 leaders - are using OR to structure their problems, unlock the value of their data, model complex problems and make better decisions with less risk and better outcomes.

Of course, the first letter comes from the faculty of the CS department at Edinburgh, and the second from a member of the Operations Research Society. I'm now waiting for all the data mining and machine learning enthusiasts to start complaining next.

Monday, February 23, 2009

Wordles for STOC/FOCS/SODA/SoCG

A wordle is a visualization of the words in a text, with more frequent words drawn larger and more prominently. It's a cute way of illustrating the repeated concepts in a text.
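The mechanics underneath a wordle are simple: count word frequencies (minus stopwords) and map counts to sizes before handing things off to a layout engine. Here's a minimal Python sketch of that counting step; the titles and the stopword list below are made up purely for illustration.

```python
from collections import Counter
import re

# Hypothetical input: a handful of paper titles, made up for illustration.
titles = [
    "Approximation algorithms for clustering",
    "Clustering under approximation stability",
    "Sublinear algorithms for graph problems",
]

STOPWORDS = {"for", "under", "the", "of", "and", "a", "an", "on", "in", "via"}

def word_weights(texts):
    """Count word frequencies (ignoring case and stopwords) and scale
    the counts to font sizes between 10 and 60 points."""
    words = (w for t in texts for w in re.findall(r"[a-z]+", t.lower()))
    counts = Counter(w for w in words if w not in STOPWORDS)
    most_common = counts.most_common()
    max_count = most_common[0][1]
    return [(w, 10 + 50 * c / max_count) for w, c in most_common]

for word, size in word_weights(titles):
    print(f"{word:15s} {size:5.1f}")
```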

My student Parasaran Raman made wordles for the paper titles for FOCS 2008 and STOC/SODA/SoCG 2009. You can click on each image to get a larger view. Draw your own conclusions :)

FOCS 2008:


STOC 2009:


SODA 2009:


SoCG 2009:

On the best use of stimulus research dollars

John Langford has a post up advocating that research dollars ought to go to industrial research labs. My first temptation was to write a counter-post advocating that research dollars ought to go to university researchers in less-than-well-known mountainous locations, especially pre-tenure refugees from industrial research labs hungry for the money and willing to do lots of work.

But more seriously, I think there's plenty wrong with John's argument. He argues that the Bell Labs model of the 1900s and earlier had the main components of a successful research agenda (access to cutting-edge problems, free time for researchers, and concentration), and infers from this that the best use of government funding is to fund basic research at companies that have such labs (MS, AT&T, Lucent, IBM, Yahoo and Google are cited).

Along the way, he throws out various statements that I have problems with:

Some research universities manage to achieve at least access and concentration to some extent, but hidden difficulties exist. For example, professors often don't work with other professors, because they are both too busy with students and they must make a case for tenure based on work which is unambiguously their own.

This is partly true: there's less collaboration in universities than in labs. But much of the collaboration in research labs is also born of necessity: getting some things done takes a number of people, and you don't really have access to a ready supply of slaves, er, students. Collaborative research is not by definition better.

at least research at national labs have had relatively little impact on newer fields such as computer science.


National labs play an important role in lots of large-scale visualization work (I know this because I'm at one of the best viz places in the country, which has extensive collaborations with national labs). National labs compete like universities for research money, and often have the inside track on funding from places like the DoE. Their sweet spot is the kind of large-scale infrastructure work that's hard to do at universities or at industrial labs.

Some people might think that basic research done at a university is inherently more desirable than the same in industry. I don't see any reason for this. For example, it seems that patentable research is about as likely to be patented at a university as elsewhere, and hence equally restricted for public use over the duration of a patent. Other people might think that basic research only really happens at universities or national labs, but that simply doesn't agree with history.


This is a strawman: I don't know who 'some people' are, but I think any reasonable position would hold that the kind of research is different. Someone once told me that the ideal industrial project involves 3-5 people; anything larger or smaller is best done at a university. Whether the research is basic or not depends on the work: I don't know of many industrial labs that support research in complexity theory (MSR excepted), which is arguably basic research, but there's very fundamental research being done at many labs in areas like auctions, ad modelling, large-scale computing, and so on.

But quibbling over statements aside, I think that the best use for stimulus money is neither universities (though I'm very grateful for the $3B) nor industrial labs. I think the best use is in places where it's already going: the so-called "shovel-ready" projects in green tech/renewable energy. The kind of funding that leads to direct economic impact is not going to come out of either universities (which naturally take a longer time line) or industrial research labs (which have plenty of sloth and bureaucracy). It's going to come from VC-type funding for energetic startups that actually make things happen. Yes, there'll be research, but presumably the projects being funded will be well beyond the research stage and ready to make things happen now, or in the next few years. That is, after all, the point of the stimulus.

The budget will be coming out soon, and I hope that attention is paid to a longer-term reversal of the depredations in science funding in the US. But the stimulus package is about the present and the short term.

Other notes:
* In the comments, Hal argues that education is an important mandate of the NSF, and that's why resources get channelled to universities. I'd also add that I don't see why taxpayers should fund a corporation's bottom line: money for Yahoo helps Yahoo, whereas money to fund students helps generate more expertise.

John says: "In economic terms, these companies have for reasons of their own decided to provide a public good. As long as we are interested, as a nation, or as a civilization, in subsidizing this public good, it is desirable to do this as efficiently as possible." Permit me to snigger. Bell Labs' parent, AT&T, had a monopoly on the telephone network for eons, and having a research arm was good PR. Once the monopoly collapsed, so did the dedication to providing a public good. Having worked at AT&T for many years, I am deeply grateful for the opportunities I had there, but there was a clear focus on research that helps the company's bottom line. Even the much-vaunted Google Research makes no secret of its focus on company-specific projects (the 20% rule implies an 80%!) (disclaimer: I occasionally consult for Google).

Post your comments here, or over at the original thread.

Wednesday, February 18, 2009

Theory "vs" Practice

Today's Wild Side science column from the NYT (guest-blogged by Stephen Quake) is an interesting Rorschach test for which side of the 'theory-practice' divide you fall on. The article in fact argues a very valid point: that great research is often done by moving smoothly between theoretical study and practical applications, rather than privileging one over the other. Along the way, he cites Gauss, Kelvin, Archimedes and others as examples of people doing solid theoretical work inspired by, and inspiring, more practical considerations.

I mention the Rorschach test because (as with political commentary) one often tends to read bias or skew into neutral statements. For example, practitioners will find much to be happy about in this opening:
The snobbish idea that pure science is in some way superior to applied science dates to antiquity

and theoreticians will be consoled by:
The stereotyped view is that the applied scientists control the lion’s share of funding, while the basic scientists control the most prestigious journals and prizes.

but if you can get beyond your reflexes, it's a fair article about the need to think broadly about the "impact" of your work both theoretically and practically, and how this can lead to solid research on both counts.

Postdoc opportunity

Kirk Pruhs writes in with another postdoc position. There's no immediate deadline for applications, but the subject of the postdoc relates to the previously mentioned NSF workshop on power management, now (re)rescheduled for Apr 9-10:

I want to investigate algorithmic issues for optimization problems related to power management. [..]

But I am looking to broaden the range of power management problems that I work on. If you are at all interested, I encourage you to attend the NSF Workshop on the Science of Power Management that I am organizing in DC on April 9-10. The workshop participants will consist of leaders in the practice and science of power management, and the purpose of the workshop is to provoke discussion among experts, identify key research directions, and report key findings to NSF. I have funds to support travel to the workshop. [..]

The research will involve searching for algorithmically interesting problems in this area, and solving these problems. It is certainly not necessary for you [to] have any research experience related to power management. What is necessary is that you [are] the type of person that likes to expand their interests into new, exciting, areas of research.

Tuesday, February 17, 2009

Ketan Mulmuley at the Center for Intractability

The Center for Intractability recently hosted Ketan Mulmuley for a 3-part talk series on his attack on P vs NP via geometric complexity theory. The videos are now online here.

And let me just add here that I think it's fantastic that the center posts video for all the talks. It takes some work to get videos produced for web delivery, and it's so much nicer than reading a paper (or 10).

Maybe I need to reconsider this open access biz

I was lukewarm to Joachim's proposal for an open access journal, but I'm changing my mind after seeing the latest shenanigans being perpetrated by the scientific publishers in collusion with Congress. There's a new bill making its way through the House that would overturn the NIH open access policy (all papers must be placed on a public site within 12 months of publication), as well as prohibit the government from obtaining a license to post such works on the internet.

Time to call your congressman, or use the (in)famous Obama outreach program.

Sunday, February 15, 2009

A new open access CG journal

The last few years have seen many attempts by researchers to break free from the shackles of (commercial) journal publishers. There's the wholesale exodus that produced the ACM Transactions on Algorithms (as well as other journals), and the technology-aided creation of free open-access journals like the Theory of Computing. There's also a movement to create an open access computational linguistics journal, spearheaded by my colleague Hal Daume here at the U. of Utah.

Joachim Gudmundsson and Pat Morin have been investigating the feasibility of creating such a journal for Comp. Geom, motivated by costs and copyright issues with current journals. Here are posts one, two and three on the topic.

They've worked out most of the logistical issues involved in creating such a journal, and are now trying to reach out to the community to see what kind of interest there is. After all, the main currency of a journal is its reputation, and that comes from community participation (and then perception). So if you have any opinion on the matter, hop over to Dense Outliers, take the poll and post a comment (don't post comments here).

My personal view: I think open access is a great idea in principle, but I'm not seeing a pressing need within CG itself for such a journal at this point in time. (Disclaimer: I'm involved with the International Journal of Computational Geometry and Applications).

Sunday, February 08, 2009

Making sausage

No, not that kind.

By now, many of you have probably heard of the massively collaborative math experiment being conducted by Timothy Gowers on his blog (with offshoots on Terry Tao's blog). The idea is to mount a serious attack on a conjecture in combinatorics called the density Hales-Jewett conjecture (go to Gowers' site for more details).

Michael Nielsen points out something that I had been thinking about while perusing the initial thread: this is an EXCELLENT way to show students how research gets done. I remember a while back that Sean Carroll of Cosmic Variance did a post explaining how one of his papers got written, but the after-the-fact description lacked the immediacy and messiness of the usual research process, and probably (if only through faulty memory) missed out on some of the paths not taken in the course of the research.

Seeing research done "live", as it were, by high-caliber professional mathematicians is as exciting as watching live professional sports, and is even better in the sense that you see the false starts, the high-level strategizing and plans of attack, the multitude of possible ideas that get formed, and even the growth of more stable, promising lines of attack on related problems.

One of the things I'm pondering right now is the best way to show students how research is done, and this is a great example to illustrate the messy, convoluted, and yet highly sophisticated ways in which experts ply their trade.

Saturday, January 31, 2009

O'Rourke's Art Gallery book

Joe O'Rourke mentions that his classic on art gallery problems is now available for free online. Download it now ! And while you're at it, buy his book on folding (with Erik Demaine). I just bought the folding book (and have already assigned one class project from it).

Wednesday, January 28, 2009

Levels of hell (heaven?) when writing practically motivated theoretical papers

I was going to post this as a comment on Michael's post, but it started getting longer and longer.

The main question being discussed there is: how do you balance the theoretical and practical sides of your work effectively, from the point of view of getting and keeping faculty jobs ? It's good to remember that not every problem is amenable to a joyous merger of theory and practice. There are levels of hell (heaven?) involved here, which go something like this:

* Prove a fundamental new result, and this leads to a breakthrough implementation for a problem people couldn't solve (this usually happens in an area relatively untouched by theory thus far, and can really make you famous; I'd imagine RSA/Diffie-Hellman fall in this category)
* Brand new result: leads to improvements in efficiency AND accuracy of known methods by orders of magnitude
* Brand new result: improves on efficiency OR accuracy of known methods, by orders of magnitude.

Below this line, you're unlikely to get a theory publication out of the contribution:

* Observation that known approaches (or derivations thereof) lead to improvements in efficiency AND/OR accuracy by orders of magnitude
* New theory result, some improvements in efficiency AND accuracy

And here's where it gets positively hellish:
* mildly new theory result, reasonable improvement in efficiency and accuracy, but not orders of magnitude, and you go up against an entrenched, highly optimized heuristic (k-means, anyone ?)

At this point you really have to choose which you care about, the problem or the theory, and then branch out accordingly. Papers in this last realm are really difficult to publish anywhere, even when they nontrivially improve the state of the art.

Thursday, January 15, 2009

ACM Fellows, 2008 edition

ACM has announced its Fellows for 2008. Familiar names on the list include:

In the related area of game theory, Xiaotie Deng and Tuomas Sandholm were honored as well. Congratulations to all the winners !

(HT: Michael Trick)

Wednesday, January 14, 2009

A "Green" conference...

(Update 2/17/09: Workshop dates changed to Apr 9-10)

Kirk Pruhs asks me to mention what I'll call the first "green" workshop of 2009:

Workshop on the Science of Power Management (Mar 26-27)


The rapidly increasing power consumption associated with information technology (IT) results in a myriad of adverse impacts including high utility costs, unsustainable thermal densities, poor space usage, and a substantial carbon footprint. In view of this, reducing IT power consumption has become a pressing priority at all levels of computer technology from semiconductor materials and processes all the way up to the design of entire data centers.

Although there is already significant research and development activity around power management in academia and industry, power remains a very difficult subject to deal with in a scientific and formal way. A grand challenge for the scientific community is to understand the trade-offs between power consumption and computability at a much more fundamental level. Ideally one would like to determine the limits of computability under power/thermal constraints, design systems that approach these limits and quantify corresponding performance tradeoffs.


This is a "NSF CFP generating workshop": the idea is for the workshop to generate material for a funding call. For more info, do contact Kirk, or Sandy Irani.

Wednesday, January 07, 2009

SODA news: American Association of Chiropractors in a deep funk...

As promised, there were no hard-copy proceedings at the conference this year, causing spinal columns everywhere to heave a huge sigh of relief. What's even more amazing is that ALL the papers from the conference are online: I downloaded them all last night ! Bill Gasarch has this, and more on why he enjoyed SODA, which apparently disturbed Lance so much he had to snark in the post immediately after :)

p.s. the comment linked above points out that Las Vegas was the "first choice" for SODA in 2010, even though Austin was the "realistic second" that ultimately won.

Monday, January 05, 2009

SODA location news

No Salt Lake City, alas: however,
SODA 2010 in Austin, Texas, and SODA 2011 in Paris, France (of course unless SIAM and SODA steering committee will change the outcome of the majority vote, what is most likely to happen, in which case it should be San Francisco)
Somehow, I can't see David Johnson, er, the SODA steering committee agreeing on Paris.

Update: More on the vote: it was Paris (60), the Virgin Islands (!) (50), San Francisco (46) and SLC (44). Clearly, I should have attended SODA and brought a student with me :)

On a related note: Virgin Islands ? What, have we given up on Puerto Rico as the token 'ain't gonna happen in a million years' location ?

Update: see the comment by Luc Devroye for some pushback to my claim that VI is an unreasonable location.

Sunday, January 04, 2009

SODA Days 0/1...

That's it for now: come on, CS bloggers !

Saturday, January 03, 2009

no SODA blogging :(

I'm not at SODA/ALENEX this year, so no blogging :(. If anyone is blogging/tweeting/facebooking from the conference, let me know and I'll post a link here. Michael Lugo from God Plays Dice will be at ANALCO, and will hopefully have more on the 'Impatiemment Attendu' :)

Friday, January 02, 2009

Flying While Brown, in 2009...

No beards, no scarfs, and DEFINITELY no discussion of safe places to sit:

Mr. Irfan turned to his wife, Sobia Ijaz, as they boarded AirTran Flight 175 at Reagan National Airport near Washington Thursday afternoon, and wondered aloud where the safest place to sit on the airplane would be — the front? The rear? Over the wing? 

But passengers sitting behind them evidently overheard the remark, saw Mr. Irfan’s beard and his wife’s head scarf, and grew concerned. Mr. Irfan and his wife, along with six members of their extended family, are Muslims, and were on their way to a religious conference in Orlando when they boarded the flight.

The worried passengers contacted flight attendants, who contacted Transportation Security Administration officials, and soon, Mr. Irfan and his wife were off the plane and being questioned in the jetway. The six remaining family members in the traveling party were taken off the plane as well, along with a family friend who happened to be on the same flight and who happens to be a lawyer for the Library of Congress. 

Next, the nine Muslim passengers — all but one are United States-born American citizens — were taken to a quarantine area in the passenger lounge where they were questioned by F.B.I. agents. Mr. Irfan’s three small nephews were denied access to food in the family’s carry-on luggage. 

Before long, Mr. Irfan told The Lede in an interview Friday morning, the F.B.I. concluded that the incident was obviously just a misunderstanding, and told AirTran officials that the family was cleared to travel. But he said AirTran still refused to rebook them, offering only to refund their tickets. The F.B.I. agents helped the family get on a later USAirways flight to Orlando, but those seats cost them twice as much.


Happy new year, same as the old year.

Sunday, December 21, 2008

More experiments in algo-teaching

(ed. note: think of this as a 'theory'-chaser to take the bad taste of cricket-blogging out of your mouth)

Another experiment that I continued this time was a set of lectures on "fun topics". The idea is to convey some ideas about a fun topic in theoryCS without too much jargon, but with enough meat to indicate that there's something deeper going on beneath the fun stuff.

I run these lectures at the end of the semester (when everyone's worn out already :)), and throw them open to the general public: both times I did this, I got substantial attendance from people not in my class, and at least in one case, someone who attended the lecture last year actually decided to take my class this year.

Not all topics are amenable to such a treatment, but my hope is to build up a suite of such lectures: the nice thing is that they can then be reused for other venues: for example, I've given variants of these talks to freshmen and high school students, and will do an undergraduate colloquium in the math department next semester.

For all of these lectures, I've pilfered shamelessly from the original works, as well as great websites developed by the authors. This post can be viewed as a shout-out and a thank you.

1. Pancake flipping:
Brian Hayes wrote a great article on pancake flipping for American Scientist. In it, he connects pancake flipping to the problem of genomic rearrangement, and in doing so ends up with a beautiful tale of the interaction between theoretical problems and practical constraints, all told in the context of a topic that everyone can relate to.

I made two-color pancakes in different sizes, and distributed them to the students at the start of class: I briefly described the problem, and let them go at it. It was quite entertaining, and put the more formal discussion later on in context.
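For anyone who wants to see the algorithmic core the students were wrestling with, here's a minimal Python sketch (not the demo I used in class) of the classic greedy prefix-reversal strategy: repeatedly bring the largest out-of-place pancake to the top with one flip, then flip it down into its final position, which uses roughly 2n flips in the worst case.

```python
def flip(stack, k):
    """Reverse the top k pancakes (stack[0] is the top of the stack)."""
    return stack[k - 1::-1] + stack[k:]

def pancake_sort(stack):
    """Greedy prefix-reversal sort: for each position from the bottom up,
    bring the largest remaining pancake to the top, then flip it into place."""
    stack = list(stack)
    flips = []
    for size in range(len(stack), 1, -1):
        # Index of the largest pancake among the top `size` pancakes.
        biggest = max(range(size), key=lambda i: stack[i])
        if biggest == size - 1:
            continue                            # already in place
        if biggest != 0:
            flips.append(biggest + 1)
            stack = flip(stack, biggest + 1)    # bring it to the top
        flips.append(size)
        stack = flip(stack, size)               # flip it down into place
    return stack, flips

sorted_stack, flips = pancake_sort([3, 1, 5, 2, 4])
print(sorted_stack, flips)   # [1, 2, 3, 4, 5] and the sequence of flip sizes
```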

2. Zero knowledge proofs
ZK proofs are great for this kind of setting: they are completely counter-intuitive (and so have the 'bizarre' factor), and are easily explained using popular metaphors. I used Moni Naor's Sudoku demo page, as well as the version of the proof that involves slicing and dicing the puzzle. This was done with class participation (I was the prover, and there were multiple verifiers).
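To give a flavor of what the class acted out, here's a toy, non-cryptographic Python simulation of one round of the cut-and-choose idea (a simplified sketch, not the actual protocol on Naor's demo page): the prover relabels the digits with a secret permutation and commits to every cell, and the verifier asks to see a single row, which must open to a permutation of 1..9 while revealing nothing about the underlying digits.

```python
import hashlib, os, random

def commit(value):
    """Hash commitment to a small integer: returns (commitment, nonce)."""
    nonce = os.urandom(16)
    return hashlib.sha256(nonce + bytes([value])).hexdigest(), nonce

def verify_opening(commitment, value, nonce):
    return commitment == hashlib.sha256(nonce + bytes([value])).hexdigest()

def one_round(solution, n=9):
    """One cut-and-choose round. A full protocol would also allow columns,
    boxes, and the pre-filled cells as challenges, and would repeat many
    rounds to drive down the probability of cheating."""
    # Prover: relabel the digits with a secret random permutation, commit to each cell.
    perm = list(range(1, n + 1))
    random.shuffle(perm)
    relabeled = [[perm[v - 1] for v in row] for row in solution]
    openings = [[commit(v) for v in row] for row in relabeled]
    commitments = [[c for c, _ in row] for row in openings]

    # Verifier: challenge a single random row.
    r = random.randrange(n)

    # Prover: open only that row.
    opened = [(relabeled[r][j], openings[r][j][1]) for j in range(n)]

    # Verifier: openings must match the commitments and form a permutation
    # of 1..n; the secret relabeling hides which digit is which.
    ok = all(verify_opening(commitments[r][j], v, nonce)
             for j, (v, nonce) in enumerate(opened))
    return ok and sorted(v for v, _ in opened) == list(range(1, n + 1))
```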

The second demo I ran was Yao's protocol for the Millionaires' problem (how do two millionaires determine who is richer, without either learning the other's worth?). Again, I did this interactively with class volunteers and an RSA applet to speed things along. More details here.

3. Quantum Computing
No demos for this one, but I gave a crude high-level view of what quantum computing is about (about one step up from the "do everything in parallel" explanation). The main goal here was to convey the basic ideas of what a qubit is, what a quantum circuit looks like, and how the Bell inequalities show that quantum computing is much more bizarre than classical randomness. Here, Dave Bacon's and Umesh Vazirani's lecture notes proved indispensable, coupled with the Nielsen-Chuang book and a recent blog post by Michael Nielsen.
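I couldn't run the quantum experiment in class, but the arithmetic behind the CHSH form of Bell's inequality fits on a slide: any local hidden-variable model keeps the CHSH combination of correlations at or below 2, while quantum mechanics predicts 2√2 for a singlet pair measured at the standard angles. A minimal Python check of the quantum prediction (the angles are the textbook choices; none of this comes from the lecture notes cited above):

```python
import math

def E(a, b):
    """Correlation of spin measurements at angles a and b on a singlet
    (maximally entangled) pair: quantum mechanics gives E(a, b) = -cos(a - b)."""
    return -math.cos(a - b)

# Standard CHSH measurement angles (in radians).
a, a_prime = 0.0, math.pi / 2
b, b_prime = math.pi / 4, 3 * math.pi / 4

S = E(a, b) - E(a, b_prime) + E(a_prime, b) + E(a_prime, b_prime)
print(abs(S))                    # ~2.828, i.e. 2*sqrt(2)
print("classical bound:", 2.0)   # local hidden variables force |S| <= 2
```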

4. Computational Origami
I haven't quite worked the kinks out of this one, but the basic idea is to demonstrate the principles behind computational origami (and where the 'computation' comes in) by looking at the question: what shapes can you make by folding a piece of paper and then making a single straight cut?

There's a cool backstory to this: essentially the first example of such an algorithm was how Betsy Ross designed the five-pointed star for the American flag, and it lends itself to a nice demo in class.

Secondly, it's a classic example of resource-bounded computation: limiting yourself to one cut. Thus, it makes for a good illustration of how computation appears in all kinds of problems.

Thirdly, computational origami actually shows up in many real-world problems, most notably one involving folding mirrors for a telescope to be launched into space. If we're trying to convey a sense of computational thinking, this is a great example.

The actual problem itself was solved in this paper by Demaine, Demaine and Lubiw. Unfortunately, I have yet to find a way of explaining it that will make sense to non-experts, and that's one weakness with this particular story: Erik's page has some nice examples to demo, but it's hard to convey the deeper algorithmics behind the problem.

Lessons learned:
  • As always, know your audience. What works for a graduate class doesn't work for sophomores, and certainly doesn't work for eighth-graders :)
  • Interactivity is key: if you allow people to participate, they're more involved, and will probably take away something positive
  • Keep it light: resist the urge to get too mathematical. There are many beautiful topics in theoryCS that can be explained without jargon, and many others for which with some effort, jargon can be removed.
  • A 'bizarro' factor helps: In general lectures that I've given, I find that presenting something completely counter to people's expectations catches their attention immediately, and leaves a lasting impression. ZK proofs have that property. Bell's inequality sort of does, but the impact would be more immediate if I could actually run a quantum experiment to demonstrate violation of Bell's inequality ;)
  • Relate it to the real world: again, it's not enough to cite applications: one should try to show them as far as possible. In this regard, the Betsy Ross story is great, because it relates to something everyone (in this country) knows.
My goal is to add one or two such lectures to my arsenal each time I teach the class. For next time, I'm seriously considering running a live auction of some kind to demonstrate some concept in algorithmic game theory (suggestions on how best to do this are always welcome). Are there other such topics that might lend themselves to an entertaining, edifying and educational experience ?

On run-chases

(ed. note: this is about cricket, not algorithms, or geometry, or computer science. you have been warned)

South Africa, the Netherlands of cricket, finally won a big game, beating Australia in Perth after a historic run-chase of 414. This of course follows India's famous run-chase, beating England in Chennai by chasing down 387. As Cricinfo points out, 9 of the top 25 run chases have come in the last 8 years (in a tally that dates back to 1902).

There's a detailed statistical analysis of these chases, but no speculation as to why they're becoming more frequent. The answer seems obvious to me though: the increasing scores in one-day cricket. A quick search of Statsguru indicates that of the 258 scores above 300 (by the first or second team) in one-day games, 179 happened after 2000. Even to a casual observer, it's clear that scores in one-day games have gone up (and don't get me started on Powerplays).

Frankly, when all this fuss was being made about India's run chase, I couldn't quite understand why, because if you think of this as a one-day game, it's not too hard (and in fact India's coaches thought the same way!).

All in all, two exciting Test matches (and how often have we been able to say that?)
