Wednesday, August 31, 2005

More numerology

The h-index and its cohorts are proposed measures of individual research impact, based on citation counts. More traditional measures include raw citation counts and the number of high-quality journal/conference publications. Derek Lowe at In the Pipeline has been blogging about the Impact Factor, the number du jour for measuring the "quality" of a journal. It is defined as the number of citations to articles in a journal divided by the number of papers in the journal (computed over a window of two years).

Apparently, it is now a popular game for journals to try to ramp up their IF (for example, a journal that publishes many review articles will generate very high citation counts). This has caused much angst among researchers because, like any "objective" system, the IF can be gamed to the point of meaninglessness.
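
As a rough sketch of the ratio described above, and of why review articles help, here it is in Python. The function name and all the citation counts are invented for illustration; they are not data from any real journal.

```python
def impact_factor(citations_per_paper):
    """Citations received by a journal's papers in the window, divided by
    the number of papers published in that window."""
    if not citations_per_paper:
        raise ValueError("need at least one paper")
    return sum(citations_per_paper) / len(citations_per_paper)


# Made-up citation counts for the papers in one two-year window.
research_only = [3, 0, 5, 2, 1, 4, 0, 1]
# The same journal after adding two heavily cited (hypothetical) review articles.
with_reviews = research_only + [60, 45]

print(impact_factor(research_only))  # 2.0
print(impact_factor(with_reviews))   # 12.1
```

Two papers out of ten are enough to multiply the number by six, which is the sense in which the IF can be gamed.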

It is probably no surprise that we search for "measurable" indicators of quality, whether they be paper ratings, conference acceptance rates, individual impact, journal quality, or what-have-you. On the other hand, I doubt there is anyone who actually takes such ratings seriously (clearly we pay attention to the numbers, but always with the disclaimer that "true quality can't be measured in numbers"). It must be peculiarly frustrating to people in the "hard" sciences (and I will take the liberty of including theoryCS in this group) that we attempt to find objective truths, and yet our attempts to evaluate the quality of our work are so fuzzy and ... (this is hard for me to say) ... subjective.

I wonder if our brethren (sistren? siblingen?) in the "not-so-hard" sciences are better able to come to terms with this.

Tuesday, August 30, 2005

Fun with numbers

In all the recent discussion of the h-index (my GSH is 11), it's worth noting that we need some kind of normalization by community. As anyone who's ever written a paper in databases will tell you, it's the best way of getting a HUGE citation count. Why ? Because the field is a lot bigger, and the odds are that many people will end up citing your work. For example, my most cited paper is an old piece of work for the WWW conference.

Maybe you could normalize the index by dividing by the maximum h-index within the community.
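
For concreteness, here is a minimal sketch of that idea in Python. The h-index itself is the standard definition (the largest h such that h papers have at least h citations each); reading "the maximum" as the largest h-index in the author's own community is my assumption, and the function names and numbers are made up.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def normalized_h(citations, community_max_h):
    """h-index as a fraction of the largest h-index in the same community."""
    if community_max_h <= 0:
        raise ValueError("community_max_h must be positive")
    return h_index(citations) / community_max_h


# Invented citation counts for one author, and an invented community maximum.
papers = [25, 8, 5, 3, 3, 1, 0]
print(h_index(papers))           # 3
print(normalized_h(papers, 30))  # 0.1
```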

Monday, August 29, 2005

funny...

Via Chris Leonard, the FETCSWCHG:
If you know the difference between P and NP, and the difference between offsides and icing; if you can properly pronounce Dijkstra and Jagr, Euler and Roy -- this is the page for you.

With sufficient encouragement, we will acquire ice time for the First Ever Theoretical Computer Science World Cup Hockey Game. Please fill out the form below to help us organize this unprecedented and quite possibly unique event.


League Commissioner (but she really wants to coach): Catherine McGeoch, Amherst College ccm@cs.amherst.edu
President of the Players Union: Cliff Stein, Columbia University cliff@ieor.columbia.edu
President of the Players Intersection: Micah Adler, University of Massachusetts micah@cs.umass.edu
We might even get to see the STOC 2006 PC Chair pad up :).

This has never happened to me

but I wish it had :):

Wednesday, August 24, 2005

arxiv and trackbacks.

By now, everyone is probably aware of the fact that the arxiv supports trackbacks. In theoryCS (excluding the quantum folks), we don't use the arxiv that much (with some notable exceptions), so this might not affect us greatly, but the physics bloggers are rather happy. It was interesting to read this at Cosmic Variance:
Most people these days post to the arxiv before they even send their paper to a journal, and some have stopped submitting to journals altogether. (I wish they all would, it would cut down on that annoying refereeing we all have to do.) And nobody actually reads the journals — they serve exclusively as ways to verify that your work has passed peer review.
Now that would be a neat model to adopt.

P.S. I personally am not that enamored of trackbacks. Blogger doesn't support them, trackback spam is rampant, and they can be cumbersome to use. Technorati "who links to me" pages are often more effective. But they do provide a (semi-)automatic comment mechanism that allows for discussions of papers to be carried out in the blogosphere, so it will be interesting to see how effective this is.

Saturday, August 20, 2005

Data Mining and the GPU

I was planning to be at KDD tomorrow to co-present a tutorial on data mining and graphics hardware, but as I am rapidly learning, man proposes and baby disposes ! In any case, if you are in the area, my very able co-conspirators Shankar Krishnan and Sudipto Guha will be pulling extra weight for me.

Mattress flipping

I am told that sleep is something that people do from time to time. I am further told that it is possible to sleep flat, on a structure called a mattress. If sleeping on mattresses is something you do on a regular basis, you might be interested in an amusing reflection on the group theory of mattress flipping.

(Via Ars Mathematica)
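
For anyone who wants to poke at the group without getting out of bed, here is a small sketch in Python. It treats each flip as a 180-degree rotation about one of the mattress's three axes, which gives the Klein four-group; the flip names and the sign-vector encoding are my own choices for illustration, not taken from the linked article.

```python
from itertools import product

# Each flip is a 180-degree rotation about one axis of the mattress.
# A rotation keeps its own axis fixed (+1) and reverses the other two (-1).
FLIPS = {
    "identity": (1, 1, 1),
    "roll": (1, -1, -1),   # about the long axis
    "pitch": (-1, 1, -1),  # about the short horizontal axis
    "yaw": (-1, -1, 1),    # about the vertical axis
}
NAMES = {signs: name for name, signs in FLIPS.items()}


def compose(a, b):
    """Do flip a, then flip b; the result is again one of the four flips."""
    return NAMES[tuple(x * y for x, y in zip(FLIPS[a], FLIPS[b]))]


# Every flip undoes itself, so repeating one flip only ever visits two positions.
assert all(compose(f, f) == "identity" for f in FLIPS)
# Any two distinct non-trivial flips compose to the third.
assert compose("roll", "pitch") == "yaw"
# The group is commutative.
assert all(compose(a, b) == compose(b, a) for a, b in product(FLIPS, repeat=2))
```

Since every flip is its own inverse, no single move repeated over and over can cycle through all four positions.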

Friday, August 19, 2005

I'm back..

A somewhat forced vacation, caused by a 50% increase in family size (sometimes, constants DO matter !). While I recover, look over this set of recommendations on giving scientific talks, by Robert Geroch.

Whenever I see suggestions on how to use PowerPoint effectively, I see these inane comments like "never use more than 6 words per slide" and what-not. For high signal-to-noise ratio presentations like those at conferences, this hardly makes sense. The suggestions given above are remarkably intelligent, and are directed towards scientific talks. The fact that this was written in 1973 only goes to show that the principles of good public speaking are timeless.

Friday, August 05, 2005

DOI and BibTeX

For the last few hours, for reasons that I will not get into, I have been trying to track down BibTeX entries for papers. Usually if the paper has an ACM DL entry, there is a BibTeX entry that one can web-scrape, but for many papers (especially IEEE publications), this doesn't work because IEEE doesn't have BibTeX entries on their website (and it's harder to web-scrape them).

Most of the complication comes from the fact that often I have a title, and need to match it to an actual citation of some kind. Google Scholar is quite helpful in this regard, allowing me to search for a title and more often than not returning the ACM DL link to the paper (and BibTeX entry).

But the ACM doesn't have everything, and this is where DOI numbers come in. The Digital Object Identifier is a unique identifier that maps to a document entity, analogous to the URL for a web page. As with a web page, the actual location of the document can be hidden from the user, and changed easily by the publisher, allowing for both portability and the ability to integrate a variety of sources. There is even a proxy server that you can supply a DOI number to; it returns the web page of the publisher that currently maintains that document.

What would be very cool would be a DOI to BibTeX converter. Note that a BibTeX entry maps to a single document, like a DOI. DOIs of course address a smaller space, since they govern only published work. If publishers exported some standard format (XML?), then it would be a trivial matter to write such a thing. Right now, all you get is the web page, from which you either have to scrape a BibTeX entry, or construct one by hand. Neither option scales or is particularly appealing.
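
To show how trivial the converter would be if such an export existed, here is a sketch in Python. Everything about the XML is hypothetical: the element names, the `doi` attribute, and the sample record are all made up, and no publisher is known to export this schema.

```python
import xml.etree.ElementTree as ET

# A made-up record in a made-up schema, with placeholder metadata.
SAMPLE_RECORD = """
<record doi="10.0000/example.doi">
  <type>article</type>
  <author>A. Author</author>
  <author>B. Author</author>
  <title>An Example Title</title>
  <journal>Journal of Examples</journal>
  <year>2005</year>
</record>
"""


def record_to_bibtex(xml_text, key):
    """Turn one hypothetical XML metadata record into a BibTeX entry."""
    record = ET.fromstring(xml_text.strip())
    fields = {
        "author": " and ".join(a.text.strip() for a in record.findall("author")),
        "title": record.findtext("title", default="").strip(),
        "journal": record.findtext("journal", default="").strip(),
        "year": record.findtext("year", default="").strip(),
        "doi": record.get("doi", ""),
    }
    entry_type = record.findtext("type", default="misc").strip()
    lines = [f"@{entry_type}{{{key},"]
    # Skip fields the record did not supply rather than emitting empty braces.
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items() if value]
    lines.append("}")
    return "\n".join(lines)


print(record_to_bibtex(SAMPLE_RECORD, "author2005example"))
```

The hard part is not this code but getting publishers to agree on the record format in the first place.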

Thursday, August 04, 2005

geeks who eat...

Anyone who's ever sat through any theoryCS conference lunch/dinner knows that as the meal wears on, the probability that someone will make some silly joke about dining philosophers tends to one.

I am therefore gratified to see the possible emergence of another dining problem in a different community, in this case statistics:
2 groups of statisticians want to lunch together, but have managed to travel to 2 different restaurants. There is a third similar restaurant nearby. In fact the 3 restaurants are equidistant; it takes exactly 5 minutes to move from one to another. The statisticians have dashed to lunch without their cellphones, and don't know the phone numbers for any of the restaurants, so they can't communicate. Having a shocking faith in randomness, they have devised a technique for joining up. Each group will wait a random time--exponentially distributed with a mean of 5 minutes--and then move to another restaurant, selected randomly (of course) with a coin toss. They either meet, or repeat the process. What is the probability that one group will meet the other after moving only once?
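
The puzzle quoted above leaves some room for interpretation, so here is a Monte Carlo sketch in Python of just one way the groups can meet: the group that leaves first walks to the other group's restaurant and arrives before that group has left. This is not a full solution (it ignores, for instance, both groups heading to the third restaurant), and the function names and the reading of the rules are my own assumptions.

```python
import math
import random


def first_mover_finds_other(rng, mean_wait=5.0, walk=5.0):
    """One trial: does the group that leaves first walk to the other group's
    restaurant and arrive before that group has left?"""
    leave_a = rng.expovariate(1.0 / mean_wait)
    leave_b = rng.expovariate(1.0 / mean_wait)
    first, second = min(leave_a, leave_b), max(leave_a, leave_b)
    # Coin toss between the two restaurants the first mover is not in.
    picks_other_group = rng.random() < 0.5
    return picks_other_group and first + walk < second


def estimate(trials=200_000, seed=0):
    """Fraction of trials in which the first mover finds the other group."""
    rng = random.Random(seed)
    hits = sum(first_mover_finds_other(rng) for _ in range(trials))
    return hits / trials


# By memorylessness the gap between the two departures is itself exponential
# with mean 5, so this event has probability (1/2) * exp(-1).
print(estimate(), 0.5 * math.exp(-1))
```

The simulated frequency should land close to the closed-form value, which is a useful sanity check before tackling the other cases.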

Wednesday, August 03, 2005

What you need is a degree in acting, not a Ph.D...

The Pentagon's new goal:
Fewer and fewer students are pursuing science and engineering. While immigrants are taking up the slack in many areas, defense laboratories and industries generally require American citizenship or permanent residency. So a crisis is looming, unless careers in science and engineering suddenly become hugely popular, said Robert J. Barker, an Air Force program manager who approved the grant. And what better way to get a lot of young people interested in science than by producing movies and television shows that depict scientists in flattering ways?
What better way ? Hmm, maybe you could (cough) (cough) give scientists some (cough) grant money ?
Tucked away in the Hollywood hills, an elite group of scientists from across the country and from a grab bag of disciplines - rocket science, nanotechnology, genetics, even veterinary medicine - has gathered this week to plot a solution to what officials call one of the nation's most vexing long-term national security problems.

Their work is being financed by the Air Force and the Army, but the Manhattan Project it ain't: the 15 scientists are being taught how to write and sell screenplays.
Oh I see. So instead of encouraging scientists to write grants, we'll get them to write $%$%$% screenplays. Admittedly, grant writing is an exercise in creative fiction, but to go this far ? And don't you read blogs ?
There aren't many stirring stories of heroic derring-do in which the protagonist saves the world and/or gets laid thanks to the well-timed development of a more efficient word processor file format translator. Sure, I've heard a few, but not many, and I doubt you'd want to hear them.
Oh and did I mention that Numb3rs will be back on the air Sep 9 ?

Frame this on your wall...

Because it's the most eloquent eulogy for computer science that I have ever read:
We see power and beauty in computer science, even while we rage against the limitations of the technology that grows out of it. We see our field not (just) as a way to make boxes that beep, but as a fundamentally new way of thinking about the world. We are craftsmen, taking great satisfaction in the structures we build. We drag abstractions kicking and screaming from Plato's cave, and we make them real. We are explorers, proud of our hard-won discoveries but humbled by the depth of our ignorance. We have changed the world, utterly and irreversably. Our influence on your daily life may be less immediate than the influence of doctors, lawyers, politicians, bankers, and soldiers, but it is no less profound. And we are just barely getting started.

Tuesday, August 02, 2005

Now this is how a business meeting should be :)

Excerpts from Dennis Overbye's summary of the panel discussion at Strings 2005:
  • The panel discussion, titled "The Next Superstring Revolution," took place at Strings05. [...] That it was a somewhat unusual occasion was not lost on anyone. No other field of science, said Jacques Distler, from the University of Texas, would be presumptuous enough to have a meeting about its next revolution.
  • Dr. Shenker [...] had recruited his panel with an accent on youth, on the ground that the progenitors of the previous revolutions were unlikely to be the makers of the next.
  • At the end Dr. Shenker invoked his executive privileges. He asked the audience members for a vote on whether, by the year 3000, say, the value of the cosmological constant would be explained by the anthropic principle or by fundamental physics. The panel split 4 to 4, with abstentions, but the audience voted overwhelmingly for the latter possibility. "Wow," exhaled one of the panel members amid other exclamations too colorful to print here.
  • Dr. Shenker concluded, "We have made some progress in sharing our feelings."
But was there beer ?

Monday, August 01, 2005

0xDE

David Eppstein, who already maintains the amazing Geometry Junkyard, and has the best publications web archives I have ever seen, has now "joined the ranks of the blogged". Looks like geometry bloggers are now dominating the algorithms community :).

David also has the distinction of being the only (one of the only?) theoretician to be portrayed in a comic strip.
