Wednesday, December 30, 2009

Reading papers.

It's been a bad few months for blogging, while I've been doing (shudder) "real work". Whenever people ask me how I make time for blogging, I always say that it takes less time than one thinks. This is generally true, but the last few months have been a blizzard of paper and proposal deadlines, and I just never got the time to sit down and pen coherent thoughts. I've also noticed that my tweeting is bleeding thoughts away from the blog: random throw-away comments that might have been assembled into a blog post end up merely getting tweeted.

For the first time in a long time, I took a "true" vacation, where not even a laptop came with me (ok I took my iPhone, but that's not the same :)). It was only a week, but I appreciated the complete downtime, especially coming after the frenzy of deadlines. It's been hard to get back into the swing of things, but it was a really good way to recharge.

But you don't want to hear about any of that. What you really want to hear about is..... my lament on how to organize paper reading.

When I was a grad student, life seemed simple. STOC/FOCS/SODA (and then SoCG) would come around, we'd peruse the paper list, chatter on about them, and figure out if someone had scooped us or not. Now, things seem more complicated. Especially since I'm the sole algorithms person at the U, I feel the need to catch up on the latest (and not-so-latest) hot topics in theoryCS, at least to keep abreast of things and share the latest news with my students. I also work in a number of more 'applied' areas, which means that there's a SIGMOD/VLDB/PODS/ICDE timeline to keep track of, and more recently a NIPS/ICML/COLT timeline, not to mention even more applied areas like ICCV/CVPR/MICCAI (more on that later).

There's a large and complicated taxonomy of paper reading. Some of the main items are:
  • Papers I'm reading for technical material when working on a problem - this is the easiest kind to manage, because you have to read it RIGHT NOW.
  • Papers that seem related to the problem I'm working on, and probably need to be cited, but are not in the critical path. This isn't too hard either - things like Mendeley allow me to save (and tag) papers for later use with just a few clicks. It's not a perfect system (what if the paper's on someone's home page), but it mostly works.
Then come the more complicated categories:
  • Papers related to a problem that I'm not quite working on right now, but seem relevant. I can sock them away, but I have to remember to read them when I return to the other problem.
  • Papers that everyone's talking about at the latest-greatest conference, but might have nothing to do with my specific suite of problems (like for example the Moser LLL proof).
  • Papers that might help me catch up on an area that is very hot, but which I wasn't following from the beginning (cough cough AGT cough cough)
  • Papers that were just announced in the latest conference/arxiv/eccc issue, that sound worthy of perusal.
There are many technological solutions to squirrel stuff away: I even use Dave Bacon's arxiview app for the iPhone (and btw, I've found the only use for 2-column format - reading papers on the iPhone). I've also experimented with private wordpress installations to allow me to bookmark interesting papers.

But the real problem, that I have yet to crack, is how to systematically plow through the mountains of reading that I must do to stay abreast of the fields I'm interested in. I've tried "read one paper a day", or "read papers over the weekend" and things like that, but nothing ever seems to stick, and I'm curious about what techniques people have used that actually work. I'm not excluding technological solutions, but I think the problem goes beyond that.

So what say you all ?


  1. I think the answer's pretty simple, really: Read fewer papers!

    Trying to read every paper that you absolutely must read to keep up with three (four? five? six?) different fields is an impossible task; you'll be reading papers 40 hours a day. But the time you spend teaching, writing, thinking, sleeping, eating, playing, and all that other non-reading stuff is at least as valuable.

    (Nice to meet you, Mr. Pot; my name is Mr. Kettle.)

    "I feel the need to ... keep abreast of things and share the latest news with my students."

    You might try reversing that relationship: Get each of your students to keep abreast of one or two areas, and then share the latest news with you (and each other).

  2. something I've experimented with, though not frequently, is to pipe things I need to read to a voice synthesizer and make mp3s that I can listen to while driving/exercising/doing chores/etc. If you have a mobile device that does voice synthesis, it may be even easier

    of course, I don't know how feasible of an idea this is for math notation, LaTeX, or PDFs in general, but for some stuff it works fantastically. I used to do this with some of Rajeev's lecture notes he posts to catch up on lectures I missed.

    i read an article saying t.v. raman does this, but more interestingly, has his voice synthesizer sped up, so he can go through text faster. it took him a little bit to get used to it i guess, but i guess the brain adapts.

  3. Not universally applicable, but my solution is to not read papers :). There are 2-year-old papers in the area closest to me (data structure lower bounds) that I haven't read yet. I'm sure one day I'll get a flash of understanding of how the proof goes, while thinking about an entirely different topic... That's also more fun, no? :)

  4. One solution that works well for me is to organize a reading group. That way, you get to have papers that you wanted to read explained to you over the span of an hour. If you can involve enough people, it is not that much work for any one person.

  5. Was not aware of Mendeley before. How does it compare to zotero and/or CiteULike?

  6. You should be able to manually download a paper and then upload it to Mendeley (in cases where the paper is available on someone's homepage, etc).

  7. There's a good comparison chart of reference managers at Nature Network.

    To address the larger question, yes, it's not possible to keep up with all the literature in your field. You can either do what some people do, which is read the first paper that comes up in a search when they find they have time to read (not recommended), or only read your friends' papers (also not recommended), or you can use your social network to filter out the really important papers. Tools like Mendeley help you find what other people are reading, and this data is then used to feed a recommendation algo to suggest things for you to read.

    If anyone has issues with Mendeley (or other ref managers, I've used them all & now work for Mendeley) please let me know.

