Monday, January 22, 2018

Double blind review: continuing the discussion

My first two posts on double blind review triggered good discussion by Michael Mitzenmacher and Boaz Barak (see the comments on these posts for more).  I thought I'd try to synthesize what I took away from the posts and how my own thinking has developed.

First up, I think it's gratifying to see that the basic premise — "single blind review has the potential for bias, especially with respect to institutional status, gender and other signifiers of in/out groups" — is granted at this point. There was a time in the not-so-distant past when I wouldn't have been able even to establish this baseline in conversations.

The argument has therefore moved to one of tradeoffs: does adopting double blind (DB) review introduce other kinds of harm while mitigating the harm due to bias?

Here are some of the main arguments that have come up:

Author identity carries valuable signal to evaluate the work. 

This argument manifested itself in comments (and I've heard it made in the past). One specific version of it that James Lee articulates is that all reviewing happens in a resource-limited setting (the resource here being time) and so signals like author identity, while not necessary to evaluate the correctness of a proof, provide a prior that can help focus one's attention. 

My instinctive reaction to this is "you've just defined bias". But on reflection I think James (and other people who've said this) are pointing out that abandoning author identity does not come for free. I think that's a fair point to make. But I'd be hard pressed to see why this increase in effort negates the fairness benefits of double blind review (and I'm in general a little uncomfortable with this purely utilitarian calculus when it comes to bias).

As a side note, I think that focusing on paper correctness is a mistake. As Boaz points out, this is not the main issue with most decisions on papers. What matters much more is "interestingness", which is very subjective and much more easily bound up with prior reactions to author identity. 

Some reviewers may be aware of author identity and others might not be. This inconsistency could be a source of error in reviewing.

Boaz makes this point in his argument against DB review. It's an interesting argument, but I think it also falls into the trap of absolutism: i.e., the assumption that imperfections in the process will cause catastrophic failure. This point was made far more eloquently in a comment on a blog post about ACL's double blind policy (emphasis mine).

I think this kind of all-or-nothing position fails to consider one of the advantages of blind review. Blind review is not only about preventing positive bias when you see a paper from an elite university, it’s also about the opposite: preventing negative bias when you see a paper from someone totally unknown. Being a PhD student from a small group in a little known university, the first time I submitted a paper to an ACL conference I felt quite reassured by knowing that the reviewers wouldn’t know who I was. 
In other words, under an arXiv-permissive policy like the current one, authors still have the *right* to be reviewed blindly, even if it’s no longer an obligation because they can make their identity known indirectly via arXiv+Twitter and the like. I think that right is important. So the dilemma is not a matter of “either we totally forbid dissemination of the papers before acceptance in order to have pure blind review (by the way, 100% pure blind review doesn’t exist anyway because one often has a hint of whom the authors may be, and this is true especially of well-known authors) or we throw the baby out with the bathwater and dispense with blind review altogether”. I think blind review should be preserved at least as a right for the author (as it is now), and the question is whether it should also be an obligation or not.

Prepublication on the arXiv is a desirable goal to foster open access and the speedy dissemination of information. Double blind review is irrevocably in conflict with non-anonymous pre-print dissemination.

This is perhaps the most compelling challenge to implementing double blind review. The arXiv as currently constructed is not designed to handle, for example, anonymous submissions that are progressively de-anonymized. The post that the comment above came from has an extensive discussion of this point, and rather than try to rehash it all here, I'd recommend that you read the post and the comments.

But the comments also question the premise head on: specifically, "does it really slow things down?" and "so what?". Interestingly, Hal Daumé made an attempt to answer the "really?" question. He looked at arXiv uploads in 2014-2015 and correlated them with NIPS papers. The question he was trying to ask was: is there evidence that more papers were uploaded to the arXiv before submission to NIPS in the interest of getting feedback from the community? His conclusion was that there was little evidence to support the idea that the arXiv had radically changed the normal submit-revise cycle of conferences. I'd actually think that theoryCS might be a little better in this regard, but I'd also be dubious of such claims without seeing data.

In the comments, even the question of "so what?" is addressed, and again this boils down to tradeoffs. While I'm not advocating that we ban people from putting their work on the arXiv, ACL has done precisely this, asserting that the relatively short delay between submission and decision is worth it to preserve the ability to run double blind review.

Summary

I'm glad we're continuing to have this discussion, and talking about the details of implementation is important. Nothing I've heard has convinced me that the logistical hurdles associated with double blind review are insurmountable, or even more than inconveniences that arise out of habit, but I think there are ways in which we can fine-tune the process to make sense for the theory community.
