Thursday, December 09, 2004

Differing standards for sharing experimental support structures

I've been writing some simple code to help my wife with some data analysis (she's a 'wetwork' biologist). According to her, if she writes a paper and uses analysis based on the code I write, I am required to supply the code to anyone who might request it. Of course, I could charge money for it: the commercial aspect is not really the point. The crucial point is that there is a common understanding in her field that all data and materials used in the course of a paper must be made available on demand, no matter how long they took to make (like reagents) or how precious the resource is.

Of course, in practice people have ways of getting around this. But the significant aspect of this understanding is that it is normative; people are expected to follow this baseline behaviour model, and deviation is viewed as inappropriate.

Now in experimental areas in computer science I find that we are far from such a baseline expectation. It seems to me that someone is doing me a favor if they provide me with the code they used in their paper, and even conferences that explicitly focus on experimental work (as opposed to conferences that deal with applied topics) don't expect authors to provide code.

There are logistical difficulties with releasing code if you work at a research lab like I do, but for researchers working in academia, this is clearly not a problem. So is this lower level of expectation appropriate for our community: is the analogy with biology (and probably other experimental sciences) inapt? Or are we really slacking off on our scientific responsibilities?

I should add that I often hide behind my corporate wall: I don't have publicly available code either, and haven't really tried to push through the myriad pieces of paper that go into getting approval for a release. The point really is the different baseline that we have; people adapt their behaviour accordingly.
