
Data Sharing and the Research Parasites

Since I cultivate an interest in all the weaknesses of today's scientific process, one thing I often ask myself is why the scientific community isn't fixing things faster. There are many obvious improvements that are ignored by many scientists, problems that are not tackled, and sometimes there is outright resistance against changes that should hardly be controversial.

A recent editorial in the New England Journal of Medicine (NEJM) by Dan Longo and Jeffrey Drazen (who is the editor of the NEJM) is a very extreme example of this. It is pretty much a vocal defense of bad research practice.

In most cases data sharing should be a no-brainer. Scientific studies that rely on raw data should not only publish their results, they should, whenever possible, also make their raw data available. There are some instances where data sharing can be problematic and needs to be done carefully, for example if privacy issues are involved, but these aren't the concerns that the writers of this NEJM editorial have. Data sharing is valuable for two reasons: First, it allows other people to check and potentially criticize scientific results. Second, it allows others to find additional research results that may be hidden in a raw data set.

Longo and Drazen recognize the second benefit and propose that data should be shared, but only in collaboration with, and with co-authorship of, the people who created the original data set. There is of course nothing wrong with collaborating with the original authors if it makes sense, but this proposal completely fails to address the first and foremost reason for data sharing: to allow others to reinvestigate and criticize scientific results.

In fact Longo and Drazen see a problem in this. They list it as a concern against data sharing that others may “use the data to try to disprove what the original investigators had posited.” The implicit assumption in this statement is that the original authors of a study are always right and if someone else reinterprets their result they are automatically wrong – which is of course total nonsense.

The editorial goes on with concerns that data sharing may create “research parasites”. I find it hard to understand what that should actually mean. Taking someone else's data and using it to create new scientific results seems like a good thing. Especially in medicine – remember this editorial was published in one of the leading medical journals – it is almost an ethical obligation to use existing data as much as possible to foster scientific progress.

Longo and Drazen say that they propose a symbiotic instead of a parasitic use of data sharing. As said above, this is fine in many situations, but imagine this: If a data set is supportive of a scientific result that goes against the theories and beliefs of the people who collected that data – should that new scientific result stay hidden? I don't think so.

The editorial has caused a bit of an outrage, and on Twitter the hashtag #researchparasites gained some popularity. In a certain sense I see this editorial as an opportunity. It's rare to have such an honest account of why some people reject improvements in science: to shield themselves from criticism and scientific rigor. But it's shocking that one of the leading medical journals is supporting that.

Also worth reading: Translation to plain English of selected portions of Longo and Drazen's editorial on data sharing (Jonathan Peelle)

