Friday, February 12, 2010

Papers and SVN

Way back when, I had promised to do a brief post on the use of SVN (or other versioning systems) for paper writing. Writing this post reminds me of all the times I've sniggered at mathematicians unaware of (or just discovering) BibTeX: I suspect all my more 'systemsy' friends are going to snigger at me for this.

For those of you not familiar with the (cvs, svn, git, ...) family of software, these are versioning systems that (generally speaking) maintain a repository of your files, allow you to check files out, make local changes, and check them back in, simultaneously with others who might be editing other files in the same directory, or even the same file itself.

This is perfect come paper-writing time. Rather than passing around tokens, or copies of tex files (or worse, zip files containing images etc.), you just check the relevant files into a repository and your collaborator(s) can check them out at leisure. SVN is particularly good at merging files and identifying conflicts, making it easy to fix things.

My setup for SVN works like this: each research project has a directory containing four subdirectories. Two are easy to explain: one is a "trunk" directory where all the draft documents go, and another is an "unversioned" directory for storing all relevant papers (I keep these separate so that when you're checking out the trunk, you don't need to keep downloading the papers that get added in).

The other two come in handy for maintaining multiple versions of the current paper. The 'branches' directory is what I use when it comes close to submission deadline time, and the only changes that need to be made are format-specific, or relate to shrinking text etc. The 'tags' directory is a place to store frozen versions of a paper (i.e., post-submission, post-final version, arxiv version, journal version, etc.).
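
For the curious, here is a minimal Python sketch of that layout. It is only an illustration, not my actual scripts: the helper names `create_project_layout` and `tag_command` are made up for this example, and the repository URL you pass in is whatever yours happens to be. The first helper creates the four directories locally; the second only builds (and does not run) the standard `svn copy` command that freezes the trunk as a tag.

```python
from pathlib import Path

# The four subdirectories described above.
LAYOUT = ("trunk", "branches", "tags", "unversioned")


def create_project_layout(project_root):
    """Create the four project subdirectories (hypothetical helper)."""
    root = Path(project_root)
    created = []
    for name in LAYOUT:
        directory = root / name
        # exist_ok=True makes the call safe to repeat on an existing project.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created


def tag_command(repo_url, tag_name, message):
    """Build, but do not run, the command that copies trunk into tags/."""
    base = repo_url.rstrip("/")
    return [
        "svn", "copy",
        f"{base}/trunk",
        f"{base}/tags/{tag_name}",
        "-m", message,
    ]
```

Passing the resulting list to `subprocess.run` would perform the tagging, assuming the `svn` client is installed and the URL points at a real repository.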

It seems complicated, but it works quite well. The basic workflow near deadline time is simply "check out trunk; make changes, check in trunk; repeat...". A couple of things make the process even smoother:
  • Providing detailed log messages when checking in a version: helps to record what exactly has changed from version to version - helpful when a collaborator needs to know what edits were made.
  • Configuring SVN to send email to the participants in a project whenever changes are committed. Apart from the subtle social engineering ("Oh no! They're editing, I need to work on the paper as well now!"), it helps keep everyone in sync, so you know when updates have been made, and who made them.
  • Having a separate directory containing all the relevant tex style files. Makes it easy to add styles, conference specific class files etc.
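
To make the email idea concrete, here is a rough Python sketch of how a commit notification could be put together. This is an assumption-laden illustration rather than the setup I actually use: `build_commit_email` is a hypothetical helper, and the subject format is invented. On the server side, Subversion runs a post-commit hook with the repository path and revision number, and `svnlook author`, `svnlook log` and `svnlook changed` can supply the values passed in below; the sketch only formats the message and leaves sending it (for example via `smtplib`) to you.

```python
from email.message import EmailMessage


def build_commit_email(author, revision, log_message, changed_paths, recipients):
    """Format a plain-text notification for one commit (hypothetical helper)."""
    msg = EmailMessage()
    # The subject layout here is an arbitrary choice for illustration.
    msg["Subject"] = f"[paper] r{revision} committed by {author}"
    msg["To"] = ", ".join(recipients)

    body_lines = [
        f"Author:   {author}",
        f"Revision: {revision}",
        "",
        "Log message:",
        # Nudge people who commit without describing their edits.
        log_message.strip() or "(no log message given)",
        "",
        "Changed paths:",
    ]
    body_lines.extend(f"  {path}" for path in changed_paths)
    msg.set_content("\n".join(body_lines))
    return msg
```

The detailed log message is what makes such an email useful: it is the one part of the notification a collaborator actually reads.
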
I can't imagine going back to the old ways now that I have SVN. It's made writing papers with others tremendously streamlined.

One caveat: SVN isn't ideal for collaborations across institutions. Much of my current work is with local folks, so this isn't a big problem, but it can be. Versioning software like git works better for distributed sharing, from what I understand.
