Wednesday, March 04, 2009

DBR: moving forward

I have no doubt that people are sick of this subject now, so I'll try not to rehash or re-argue points that many have brought up. It seems to me (as a supporter of DBR) that the objections to DBR can be categorized as
  • self-aware: "we are not biased. period. everything is good".
  • conservative: "DBR could cause other problems: why replace one flawed system with another"
  • logistic: "authors could slyly reveal info, how do we handle self-citations etc"
  • irritated: "why are you people rabble rousing: leave us alone"
In this regard, I think people should really try to read Kathryn McKinley's essay on this topic (and the related links). There's much there for all: for us utopian DBR devotees, she points out that the most effective kind appears to be a staged unblinding approach, rather than a straight DB approach. For those who think that we can check our own biases, she provides references and evidence for why this goes against what we know about human psychology. For people concerned about logistical issues, she discusses many of the common problems (and also references other disciplines that have made their own attempts to solve them).

A comment I read somewhere (union of Sorelle, Lance, Michael and myself) made what I thought was an excellent point: if people are really committed to trying out DBR, it might be good to experiment in a conference outside the big ones, so we can acquire some level of familiarity with the system (or realize that it's totally useless). As I had mentioned to Sorelle, this ultimately boils down to having a PC chair who wants to experiment in this way: Michael tried doing serious conflict-of-interest (CoI) management at the STOC PC meeting and received no small amount of flak for doing so, but at least he tried, and it convinced him that it should happen even more.

More on the dynamics of peer review (this at the level of funding panels) comes from a new book reviewed in IHE. The review reveals some of the key findings: nothing entirely surprising, but a clear indication that many "non-technical" factors go into a proposal getting funded (even things like who has a plane to catch and when).
It's reading things like this that makes me impatient with people who claim that bias doesn't exist.


  1. Quoting from the McKinley essay that you cite:

    Review committee. In addition to program committee reviews, many SIGPLAN conferences obtain ad hoc outside reviewers on a per-submission basis. The goal of the outside review is to generate a thoroughly expert review. With single-blind reviewing, the process of selecting ad hoc reviewers can be distributed among committee members. With double-blind reviewing, the same process is very error prone. For PLDI 2007, PLDI 2008, and ASPLOS 2006, the program chair took on this task, which consumed an enormous amount of time and email bandwidth.

    I recommend instead a formal review committee to solve this problem, as pioneered at ISMM 2008 by Steve Blackburn. The program chair selects the review committee to complement and extend the expertise of the program committee.

    What this says is that the current process of having the PC members individually send out papers for sub-reviews doesn't work with DBR.

    How do you reconcile this with the fact that this is the dominant modus operandi of all large theory conference PCs? Are you suggesting that we abandon this idea?

  2. Kathryn McKinley's essay is not very convincing. If this is your strongest case, you might want to reconsider. Utopia is a weak motivation to change a system that is the worst system, except for all the others...

  3. It seems like in order for this to happen there needs to be a consensus or broad majority of support in the community (unlikely) or a PC chair needs to take it on, as you mentioned. I'm much more in favor of making consensus-style changes (perhaps because I went to a Quaker college...), so I'm skeptical about a PC chair unilaterally deciding to do this. Perhaps there's a compromise in that a PC chair could decide to do this before recruiting a PC and so all the members could be on board from the beginning?

  4. sorelle: in truth, even the PC chair shouldn't do anything without a 'sense of the community'. I'd propose bringing up a discussion item at a business meeting, with the idea that the PC chair gets a sense of where people stand, and is given leeway to make a decision if there's sufficient support.

    Anon1: Indeed, which is why even though I support DBR, I am aware of the logistical issues involved, and am not jumping to change things on a dime.

    Anon2: can you be specific about what you found unconvincing? It's not helpful to merely dismiss an entire chain of reasoning without pointing out why.

  5. So far, the only argument in favour of DBR is that it helps to eliminate bias. However, no one has provided any evidence that, in theory conferences, bias exists except for this kind:

    A) Authors with a history of doing strong, important research, are more likely to get a mediocre paper accepted.

    B) Authors with a history of doing bad research (or no history at all), are less likely to get a mediocre paper accepted.

    Maybe you could replace the word "Authors" with the phrase "Authors from institutions with" in the above two points, but I think even that is a stretch.

    From the comments, most people don't have a problem with that kind of bias. The papers that it affects are the ones that are mostly interchangeable anyway. A fairer (and easier to implement) solution might be just to accept a random selection of the borderline papers.

  6. Pat: I think we can take it as a given that bias exists, regardless of whether we see it ourselves.

    The question to me is not "would double blind reviewing reduce bias" (of course it would) but rather what are the full costs and benefits of the system?

    McKinley mentions two benefits: reduced bias, and better decisions as measured by impact of the accepted papers. But we shouldn't forget that DBR has costs, too: slowed progress due to less-open exchange of information, elimination of some potentially desirable biases (e.g. being more favorable towards student authors), difficulty of handling DBR within the theory PC-subreviewer system.

    It would be good if we could somehow quantify these costs and benefits rather than fixating only on unfair bias as if it were the only variable changed by DBR.

  7. Pat: I think we can take it as a given that bias exists, regardless of whether we see it ourselves.

    I agree that there's no chance the bias is literally zero, but it can be hard to estimate even which direction the bias goes. For example, which effect dominates: people dismissing the work of unknown students, or people going out of their way to encourage them? I don't know, and I doubt anybody else does either.

    There are other forms of bias I'm more confident of. For example, I'd be very surprised if there's a bias in favor of women, so we can at least predict the direction of that one. However, I suspect the magnitude is small.

    The question to me is not "would double blind reviewing reduce bias" (of course it would)

    Well, personal or institutional bias wouldn't likely increase, but it's not clear how much it would decrease, even aside from widespread breaches of anonymity from talks and preprints.

    For example, what causes bias against women? One obvious factor is seeing a woman's name at the top of the paper. No decent person would take this into account consciously, but unconscious influences can be strong.

    On the other hand, bias could enter in other ways as well. For example, female researchers are stereotypically underconfident and reluctant to engage in self-promotion. This can (and presumably does) work against their submissions, regardless of whether a female name is attached, and I'd guess it accounts for a larger fraction of the total bias than seeing the name does. Furthermore, seeing the name might actually help. There are two explanations for an understated introduction: it could be the author's personality (in which case a rational reviewer won't care) or it could be because the results genuinely aren't that impressive. Seeing that the author belongs to a group that is less likely to engage in self-promotion may lead the reviewer to judge the paper more fairly.

    I don't want to suggest SBR actually helps women on average; presumably it doesn't. However, the dynamics are complicated and I don't believe DBR systematically eliminates bias across the board.

    It would be good if we could somehow quantify these costs and benefits rather than fixating only on unfair bias as if it were the only variable changed by DBR.

    Part of the profound difficulty is that some of the costs and benefits are almost unquantifiable. For example, if DBR broadens the appeal of theory and decreases the perception of bias, then that's a benefit even if the actual bias remains completely unchanged. If DBR upsets people or adds overhead, then that's a cost regardless of how much it decreases bias (or whether you think people ought to be upset). In fact, I believe the emotional costs and benefits are almost certainly much greater than the substantive effects, but I have no idea how to balance them.

  8. "For example, I'd be very surprised if there's a bias in favor of women, so we can at least predict the direction of that one."

    Given how desperate computer science is to attract women, I'm not sure about that. Yes, I know there have been studies in other fields, but I don't know that any field is comparable to CS in its severe gender disparity.

  9. My complaint is that Kathryn McKinley's essay has mostly circumstantial evidence that does not necessarily support the author's argument. For example, conferences with DBR might have a higher citation index because these conferences are so large and popular that such a system makes sense (my claim is that both theory/CG are too small to require DBR). Also, a large fraction of her examples are from the DB community, which is way bigger, so you have to normalize for size, etc.

    As other comments mentioned, the existence of bias is undoubtable. What is more doubtful is its bad impact. Show the damage in the current system, argue why DBR would fix it, and show that changing things would really be an improvement. Frankly, despite the extensive discussion, I am not sure these things were demonstrated.

    I am curious whether people think that DBR would improve the probability of their papers being accepted. I somehow feel it would have zero impact on my research, since I would have all the papers available online on my webpage anyway...

  10. It's generally a good idea to define your acronyms. Google says it's "Distributed Bragg reflector".

  11. What about going the other way and making the reviewers names visible to the authors? For this not to increase the bias, the reviews can be made public. E.g., each paper on arxiv could have a little blog of signed comments/reviews.

    Indeed, it makes a difference to the reviewer who the author is. But it also makes a difference to the author who the reviewer is.

  12. The usual argument against making reviewers' identities public is that they will be more likely to be forthright if their identities are secret. The countervailing view is that if their identities are public, then reviewers are likely to try to be more careful and accurate.

    If the second view is correct, then applying it to submissions in DBR, you'd expect that anonymous authors would be much more ready to speculatively submit half-baked papers with the hope that they might just get in, since they'll have time to fix them up and make them presentable if they are accepted. If they needed to identify themselves, then they would use much more self-selection.

    I have seen precisely this behavior in submissions to the major AI conferences that use DBR.

  13. Two comments:

    1) With respect to biasing in favor of student papers, I actually think this is unfair. On the most recent PC I was on, we gave "affirmative action" at the end to student-only papers, which of course happened to come from major universities. What about two-author (student + one other person) papers from universities in which there are not very many students? In other words, this policy is ad hoc, carried out without much thought, and is not necessarily fair. Also, many other students got their papers in during the first rounds, and these student-only papers from big institutions also get in at the end, since that is where the majority of student-only papers come from. Not really fair. So I don't think that that is a benefit to SBR.

    2) With respect to self-citations, a big problem is actually that people submit papers without adequately explaining the relationship of the paper to their own previous work. For example, an author has a previous paper on essentially the same problem, and in the current paper they modify/generalize their technique. Without explaining the relation to previous work, the paper is often at an *advantage* if it does not receive a thorough review (which is often), because it will appear new, etc. So in many cases, people are not sufficiently using self-citations anyway, even with SBR.

  14. "Given how desperate computer science is to attract women, I'm not sure about that. Yes, I know there have been studies in other fields, but I don't know that any field is comparable to CS in its severe gender disparity."

    Yes, CS attracts women to grad school, and then what? I've never seen action on a PC to accept papers by women (and I'm not saying this should occur). But aside from admission to grad school, I do not see attempts that would constitute bias *in favor* of women.

  15. One argument that I have heard in person but not in these loud pro-DBR blog posts is that "I do not want my work to be distributed without my name on it." This definitely applies to me. I think that a substantial amount of reputation, both improvement and degradation, is established by what papers you submit. DBR encourages poor submissions because they don't hurt the author's reputation, and removes the sense of pride from good submissions. It also makes it much more likely, in my opinion, for people to not correctly remember who did what, and wrongly attribute work to people. ("Oh, it must have been so-and-so who always works on that problem" becomes likely instead of the possibility of "Wow, I didn't know so-and-so was working on those problems too".)

    From the other side, much of my exposure to the literature is through refereeing, either on PCs or for journals. I definitely would not want to see results without knowing who did them; otherwise I'd never know who is up to what. I know that reviewed papers are kept secret, but I simply don't have time to read many papers in other contexts, so with DBR, I won't know who to look up later if I encounter something related. I would definitely refuse to be on a PC with DBR. And if there are people like me who learn about the literature through reviewing, then conversely as an author I don't want to submit my papers to places where I don't get that publicity.

    Suresh's previous post asked "what's the harm to authors". In my mind, these are the main harms, to both authors and PC members.

  16. "DBR encourages poor submissions because they don't hurt the author's reputation, and removes the sense of pride from good submissions."

    The staged unblinding that Kathryn McKinley proposes gets around this problem as the authors know that eventually the PC will know who they are.

    Now, initially I thought the staged unblinding made DBR pointless. Having just served on a PC where it was used, I realized that the goal of DBR is to get the reviewers to _evaluate_ the papers based on the _text_ in the 10 pages, free of any other preconceived biases that (invariably) affix themselves to one's mind upon learning the authors' names. As each paper was discussed and the authors unblinded, I couldn't help thinking that I would have written a more biased review had I known who the authors were beforehand.

    -Ranjit Jhala.
