SCIENTIFIC REGRESS by William A. Wilson
The problem with science is that so much of
it simply isn’t. Last summer, the Open Science Collaboration announced that it
had tried to replicate one hundred published psychology experiments sampled
from three of the most prestigious journals in the field. Scientific claims
rest on the idea that experiments repeated under nearly identical conditions
ought to yield approximately the same results, but until very recently, very
few had bothered to check in a systematic way whether this was actually the
case.
The OSC was the biggest attempt yet to check a field's results, and the most shocking. In many cases, the replicators used the original experimental materials, and sometimes even performed the experiments under the guidance of the original researchers. Of the studies that had originally reported positive results, an astonishing 65 percent failed to show statistical significance on replication, and many of the remainder showed greatly reduced effect sizes.
Their findings made the news, and quickly
became a club with which to bash the social sciences. But the problem isn’t
just with psychology. There’s an unspoken rule in the pharmaceutical industry
that half of all academic biomedical research will ultimately prove false, and
in 2011 a group of researchers at Bayer decided to test it. Looking at
sixty-seven recent drug discovery projects based on preclinical cancer biology
research, they found that in more than 75 percent of cases the published data
did not match up with their in-house replication attempts.
These were not
studies published in fly-by-night oncology journals, but blockbuster research
featured in Science, Nature, Cell, and the like. The Bayer researchers were
drowning in bad studies, and it was to this, in part, that they attributed the
mysteriously declining yields of drug pipelines. Perhaps so many of these new
drugs fail to have an effect because the basic research on which their
development was based isn’t valid.
When a study fails to replicate, there are
two possible interpretations. The first is that, unbeknownst to the
investigators, there was a real difference in experimental setup between the
original investigation and the failed replication. These are colloquially
referred to as “wallpaper effects,” the joke being that the experiment was
affected by the color of the wallpaper in the room. This is the happiest
possible explanation for failure to reproduce: It means that both experiments
have revealed facts about the universe, and we now have the opportunity to
learn what the difference was between them and to incorporate a new and subtler
distinction into our theories.
The other interpretation is that the
original finding was false. Unfortunately, an ingenious statistical argument
shows that this second interpretation is far more likely.
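The argument the essay points toward is, presumably, the familiar base-rate calculation popularized by John Ioannidis's 2005 paper "Why Most Published Research Findings Are False": if only a small fraction of the hypotheses a field tests are actually true, then even well-powered studies at conventional significance thresholds will produce a literature dominated by false positives. A minimal sketch in Python, with purely illustrative values assumed for statistical power, the significance threshold, and the prior fraction of true hypotheses:

    # A sketch of the base-rate argument (illustrative numbers, not the
    # article's own). PPV = P(hypothesis is true | result is significant).
    def share_of_findings_that_are_true(prior, power=0.8, alpha=0.05):
        """prior -- assumed fraction of tested hypotheses that are true
        power -- chance a true effect yields a significant result
        alpha -- chance a null effect yields a significant result"""
        true_positives = prior * power          # true effects detected
        false_positives = (1 - prior) * alpha   # null effects "detected"
        return true_positives / (true_positives + false_positives)

    # Even with respectable power and a 0.05 threshold, a low prior
    # means most significant findings are false.
    for prior in (0.5, 0.1, 0.01):
        ppv = share_of_findings_that_are_true(prior)
        print(f"prior={prior:5.2f} -> share of findings that are true: {ppv:.2f}")

On these illustrative numbers, if half of all tested hypotheses are true, about 94 percent of significant results are real; if only one in a hundred is true, fewer than one in seven is. That is the sense in which a failed replication is more plausibly read as a false original finding than as a wallpaper effect.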