
How not to do replications

Recently there was an article on the Science website about replication problems in artificial intelligence research. The article mainly highlights the fact that many studies in the field fail to provide supplemental code and data. But it also mentions an example of how replication efforts can go wrong.

The article mentions the project and journal ReScience, which describes itself as "a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research is reproducible."

Replicating studies is generally considered a good thing and a step towards improving science and highlighting problems. Yet the Science article notes that, so far, all replication attempts published at ReScience have been positive. This is highly implausible, but a possible explanation is offered: scientists don't want to criticize their peers, so failed replications aren't published, particularly because replications are often done by young researchers who don't feel confident criticizing their senior colleagues.

If this is true, we have a pretty dire situation, and one that isn't helpful at all: people try to replicate other people's work, but only publish the result if it's positive. One could very well argue that this makes things worse, not better, because it increases publication bias.

This very problem was highlighted in 2015 by a group of psychology researchers in a paper titled The Replication Paradox. There's a good summary at Retraction Watch.

While they call it a paradox, the effect is actually not so surprising. If you replicate studies but have publication bias, meaning you publish only the successful replications and not the failed ones, you can end up creating the impression that an effect is even stronger than the original, potentially flawed research indicated. The public scientific record gets worse, not better.
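To see why, consider a small, entirely made-up Monte Carlo sketch (not taken from the paper): simulate many replication attempts of a weak true effect, then "publish" only the replications that reach statistical significance. The numbers (effect size, sample size, study count) are arbitrary assumptions chosen purely for illustration.

```python
import random
import statistics

# Illustrative simulation: a small true effect, many replication
# attempts, but only "successful" (significant, positive) replications
# get published. All parameters below are made up for illustration.

random.seed(0)

TRUE_EFFECT = 0.1    # small true effect (standardized units)
N_PER_STUDY = 30     # participants per replication
N_STUDIES = 10_000   # number of replication attempts

def run_study():
    """Simulate one replication: return (estimated effect, significant?)."""
    samples = [random.gauss(TRUE_EFFECT, 1.0) for _ in range(N_PER_STUDY)]
    mean = statistics.mean(samples)
    se = statistics.stdev(samples) / N_PER_STUDY ** 0.5
    significant = mean / se > 1.96   # rough two-sided 5% threshold, positive direction
    return mean, significant

results = [run_study() for _ in range(N_STUDIES)]

all_effects = [m for m, _ in results]
published = [m for m, sig in results if sig]   # publication bias: only significant ones

print(f"True effect:               {TRUE_EFFECT:.2f}")
print(f"Mean of all replications:  {statistics.mean(all_effects):.2f}")
print(f"Mean of 'published' ones:  {statistics.mean(published):.2f}")
```

With these assumptions, the replications that clear the significance bar report an average effect several times larger than the true one. That is the paradox in a nutshell: selectively published replications inflate the apparent effect instead of correcting it.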

This shows that it is crucial for replication efforts to also counter publication bias. One way of doing this is study preregistration: registering the intent to do a study in a public registry before any data collection or experiments take place. Other replication efforts, like the Center for Open Science's Reproducibility Project: Psychology, included preregistration by default.