Sunday, May 28, 2006

On scores for papers: A postscript

The SIGACT reviewing system asks reviewers to assign scores when reviewing papers. It goes something like this (lightly edited):
  • 9-10: An enthusiastic yes. I will fight strongly for this paper.
  • 8-8.99: A strong vote for acceptance. A solid contribution. This paper should be in the top third of the papers in the conference.
  • 7-7.99: A vote for acceptance. Not a stellar result, but clearly worth accepting.
  • 6-6.99: A weak vote for acceptance. A reasonable contribution to an interesting problem - or maybe the contribution is good but the authors don't seem to understand what it is and/or express it well - or maybe it's a good paper, but the subject area is marginal for the conference.
  • 5.0-5.99: Ambivalent. I might support accepting this paper, but I don't advocate accepting it. Probably publishable as a journal paper, but a bit too specialized or too incremental or perhaps it has nice ideas but is too preliminary, or too poorly written.
  • 4.0-4.99: A competent paper, but not of sufficient interest/depth.
  • 2.5-3.99: A solid vote for rejection. It is very unlikely that I could be convinced to support this paper.
  • 1-2.49: A strong vote for rejection. A poor paper, unsuitable for any journal.
  • 0.01-.99: Absolute reject. Completely trivial and/or non-novel and/or incorrect and/or completely out of scope.
More often than not, these scores are not revealed to the authors. And that's a good thing. The scores themselves are mostly placeholders for more intangible sentiments. Since the ultimate selections are to a large degree relative, absolute numbers don't mean much. Also, as anyone who's ever reviewed papers knows, a high score doesn't guarantee acceptance, and a low score doesn't guarantee rejection.

I think it's a good idea that scores are kept merely as internal markers, and only provide light hints as to the final fate of a paper. I've reviewed papers where each reviewer is given choices like 'strong accept, weak accept, weak reject, strong reject', and the categorical nature (as well as the implied decision) of these verdicts makes it harder IMO for a paper to survive an early dissing by a reviewer. Moreover, if you get a paper rejected with a set of reviews containing three weak accepts, it can be really puzzling to figure out what happened.

More importantly, not revealing scores keeps people from the mistaken belief that paper reviewing is a deterministic, objective process; it most certainly isn't. However, at least in my experience, paper discussions can be extremely nuanced and sophisticated, far more so than a mere "objective" score can capture.

