Thursday, November 03, 2011

Life in a crowd-sourced research world

(Sung to the tune of "War", and with a Jackie Chan accent for bonus points)

Jo-ur-nals !
What are they good for !
Absolutely nothing !

There are popular tropes in current discussions on crowd-sourcing research. There's the "scientists as Mean Girls" view of the current state of affairs. There's the utopian "Let a thousand papers bloom in the open research garden". There's the anti-capitalist "Down with evil money-grubbing publishers", and there's of course the always popular "Everything tastes better with crowd-sourced reputation points and achievement badges". 

But have we really thought through the implications of doing away with the current frameworks for "dissemination, verification and attention management" ?

Here's a tl;dr a la Cosma Shalizi: 

A more open research environment, where all work is published before review, and anyone is free to comment on any work in public without repercussions, is both valuable and more chaotic and unpleasant than we might be ready for.

Consider a pure "publish-then-filter" world, in which you dumped your paper in a public repository that had commenting, reviewing, reputation features, achievement badges and whatever other technological goodies you wanted to throw in. 

You'd be in a world not unlike the world that writers and musicians live in today. Since big music/book publishers (read "journals") take a big cut of the royalty revenues ("journal subscriptions") in exchange for promotion/marketing (read "stamps of authenticity"), many authors and musicians have developed smaller but successful brands by going on the road themselves, doing online promotions, cultivating their fan base with special material, downloads, T-shirts, event tickets and what not, and relying on underground word-of-mouth to establish a presence.
Are you ready to do the same ?
It's naive to think that merely putting papers on a repository and waiting for attention to appear will actually work to disseminate your work. Attention is probably the most valuable resource available to us in this connected era, and the one most fiercely fought over by everyone. No one is going to even be able to pay attention to your work unless you promote it extensively, OR unless there are external ways of signalling value. 

If you think that reputation mechanisms will help, I will merely ask you to look at the attention garnered by the latest Ke$ha single compared to the attention given to <insert name of your favorite underground-not-selling-out-obscure-indie-band-that-will-set-the-world-on-fire here>.

Secondly, I think as researchers, we would cringe at the kind of explicit promotion that authors/musicians have to indulge in. Would you really want to sell tickets for the "1.46 approximation to graphic TSP paper tour?". How would you afford it ? 

There's a third aspect to living in a crowd-sourced research world: a loss of mental space. While it should be clear to anyone who follows my blog/tweets/G+/comments on cstheory that I enjoy the modern networked world, it's also clear to me that actual research requires some distance.

In Anathem, Neal Stephenson describes a monastery of mathematics, where monks do their own research, and at regular intervals (1/10/100/1000 years) open their doors to the "seculars" to reveal their discoveries to the outside world.

Even with collaborations, Skype, shared documents and GitHub, you still need time (and space) to think. And in a completely open research environment where everything you post can be commented on by anyone, I can assure you that you'll spend most of your time dealing with comment threads and slashdotting/redditting/HNing (if you're lucky). Are you ready to deploy a 24 hour rapid-response team to deal with the flaming your papers will get ?

Let me be very clear about something. I think there are many academic institutions (journals and conferences especially) that are in desperate need of overhauls and the Internet makes much of this possible. I think it's possible (but I'm less convinced) that we are on the cusp of a new paradigm for doing research, and debates like the ones we are having are extremely important to shape this new paradigm if it comes into being. In that context, I think that what Timothy Gowers (here and now here) and Noam Nisan (here and here) are trying to do is very constructive: not just complain about the current state of affairs or defend the status quo, but try to identify the key things that are good AND bad about our current system AND find a path to a desired destination.

But human nature doesn't change that quickly. New ways of disseminating, valuing and verifying research will definitely change what's valued and what's not, and can help open up the research enterprise to those who feel the current system isn't working (i.e. most of us). But when you replace one evaluation system by another, don't be too sure that the new system is fairer - it might merely change what gets valued (i.e. the peaks might change but not the distribution itself).


  1. I cannot predict the future for sure, but we may be able to avoid a future in which a lot of advertisement of papers is necessary. We already have models for publishing that are intermediate between the current for-profit publishers and reddit-style free-for-alls. PLOS and JMLR are examples here. Throw the bathwater out (restricted access), but not the baby (peer reviewing).

    On this topic, I just finished Michael Nielsen's recent book on "Networked Science"; it's a very well written and argued book, always reasonable in tone and very far from being a polemic.

    Your last sentence is very intriguing. I was immediately reminded of Goodhart's law:

  2. Your music analogy doesn't quite fly: There is no analogue today of radio stations for academic publishing. Though it has completely changed how people acquire music, the internet hasn't fundamentally changed how most people find out about music.

    I agree with your sentiment about people not clearly thinking things through; however, things will certainly not stay the same. The era of print journals as cash cows is over. (For example, print journals are now big money-losers for IEEE.) Journals are now sold in big online bundles and it isn't clear that the economics of any one journal actually matters to a publisher.

    Assuming that journals survive what will be the right model? Once you don't have to print a journal, what limits its size? What are the incentives for journals to make choices of this sort?

  3. @Venu I definitely want to read Nielsen's book (in fact he'll be talking at Utah in the spring). I think he's really thought through the dynamics of networked science well, and I especially like his analysis of when crowdsourcing works and when it doesn't.

    @Paul Pandora (for internet music discovery) ? And I've always thought it would be neat to do an 'arxiv' podcast :)

  4. Oh! Plllleeeeassse!

    You don't want to get dirty with marketing...?

    Man! Have you internalized the culture so much that you don't even see the marketing anymore?

    A good half if not more of what we do is marketing. Every single time you talk at a conference, that's marketing. You are there, in front of a crowd, marketing your work. When you spend time rewording that sentence so that your work will sound more interesting, that's marketing! When you make sure that your paper "looks nice", that's marketing. When you post your papers online, that's marketing. When you tell a colleague "oh! I've got a paper on this..." That's marketing. When you organize a tutorial or a workshop relevant to your work, that's marketing.

    Professors are professional marketers.

    No matter how the system evolves, you'll be marketing your work in the future, probably increasingly so.

    The days when people could write a paper, and then throw it over the wall and forget about it are gone. You write your paper, then you promote it.

    It used to be that the journal would itself be the "promoting agent" but that won't cut it anymore.

    But marketing is not dirty. It is a productive activity for everyone. Good marketing creates value.

    If everyone did good work, but there was no good marketing, then it would sound boring and nobody would be interested in anything.

    I want you to sell your work to me. I want you to get me excited about what you have published.

    I cannot stand boring researchers... who are just glad they published something, anything.

    I want to see creative outbursts, I want to feel the deep insights... I want you to point them out to me.

  5. Daniel, it's a matter of degree and kind. Music is subjective and personal. Research has a subjective component (does this topic excite me) but also a more objective component (is this research correct, and how does it push the knowledge in the field forward). While it's legitimate to express one's views on both the "excitement" and the perceived importance of a piece of research, it's important to establish correctness outside the 'hype machine' so to speak. Consider the whole NASA-arsenic-bacteria matter as a case study in the perils of untimely marketing of science.

    Further, I view the pursuit of science as something with a more objective basis than (say) the pursuit of an agenda. Allowing the marketing balance to shift runs the risk of converting the scientific view into "just an opinion". We can see what's happening with perceptions of evolution and climate change as an example of this.

  6. I like the music analogy and I agree that dealing with HN/reddit style comments is an appalling prospect. Techniques for enhancing the S/N of comment threads are needed.

    I don't remotely buy "cusp of a new paradigm for doing research". Good research has always been, and will always be, done by individuals or fairly small groups of intense, committed individuals. Of course these people can now find each other much more easily but that doesn't feel like a paradigm shift.

  7. @Paul: Somewhat tangential, but actually quite a number of musicians these days become popular via the Internet, promoting themselves on YouTube/MySpace/etc. A big example of this is Justin Bieber (YouTube), but Googling reveals many more.

  8. There is the additional risk that in certain cases, these mechanisms are "attacked". Consider the possibility of climate science papers that Rush Limbaugh can ask his listeners to comment on. If my ability to get grants and/or tenure depends significantly on the filtering mechanism, I may feel compelled to work in a non-controversial area.

  9. Coming in late to the discussion... Why assume that journals and peer review would be abolished altogether, as opposed to being a layer of filtering applied after publication to arxiv, or something like it? Paul Ginsparg sketched out such a system quite some time ago, and while I wouldn't want to endorse it as the One True Way, it seems a step in the right direction. As Venu says above, something like JMLR is already quite far along this direction.

    I also think the analogy to musicians is very imperfect, because scientific researchers have institutional supports and structures (e.g., universities) which musicians do not. This makes the problem of connecting producers to audiences very different, and much easier.

    Finally, I learned about "attention conservation notices" from Bruce Sterling.

