psi, the psi researcher might want to consider more drastic measures to ensure that the experiment was completely confirmatory:

  5. The psi researcher may make stimulus materials, computer code, and raw data files publicly available online. The psi researcher may also make the decisions made with respect to guidelines 1-4 publicly available online, and do so before the confirmatory experiment is carried out.

  6. The psi researcher may engage in an adversarial collaboration, that is, a collaboration with a true skeptic, and preferably more than one (Price, 1955; Wiseman & Schlitz, 1997). This echoes the advice of Diaconis (1991, p. 386), who stated that the studies on psi reviewed by Utts (1991) were “crucially flawed (...) Since the field has so far failed to produce a replicable phenomena, it seems to me that any trial that asks us to take its findings seriously should include full participation by qualified skeptics.”

The psi researcher who also follows the last two guidelines makes an effort that is slightly higher than usual; we believe this is a small price to pay for a large increase in credibility. It should after all be straightforward to document the intended analyses, and in most universities a qualified skeptic is sitting in the office next door.

Concluding Comment

In eight out of nine studies, Bem reported evidence in favor of precognition. As we have argued above, this evidence may well be illusory; in several experiments it is evident that exploration should have resulted in a correction of the statistical results. Also, we have provided an alternative, Bayesian reanalysis of Bem’s experiments; this alternative analysis demonstrated that the statistical evidence was, if anything, slightly in favor of the null hypothesis. One can argue about the relative merits of classical t-tests versus Bayesian t-tests, but this is not our goal; instead, we want to point out that the two tests yield very different conclusions, something that casts doubt on the conclusiveness of the statistical findings.
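The divergence between the two kinds of test can be illustrated with a rough sketch. The code below is not the default Bayesian t-test used in the reanalysis; it swaps in the simpler BIC approximation to the Bayes factor for a one-sample t-test, and it approximates the p-value with a normal distribution so that only the Python standard library is needed. The values of `t_stat` and `n` are hypothetical and are not taken from Bem's data.

```python
import math
from statistics import NormalDist


def two_sided_p_normal(t: float) -> float:
    """Two-sided p-value for a test statistic t.

    Assumption: the t distribution is approximated by a standard normal,
    which is reasonable only for fairly large samples.
    """
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def bic_bf01(t: float, n: int) -> float:
    """BIC approximation to the Bayes factor BF01 (evidence for H0 over H1)
    for a one-sample t-test with n observations.

    Assumption: this is a stand-in for the default Bayesian t-test,
    not the same computation.
    """
    return math.sqrt(n) * (1.0 + t * t / (n - 1)) ** (-n / 2.0)


# Hypothetical inputs chosen for illustration only.
t_stat, n = 2.2, 100

p = two_sided_p_normal(t_stat)
bf01 = bic_bf01(t_stat, n)

print(f"approximate two-sided p-value: {p:.3f}")
print(f"approximate BF01:              {bf01:.2f}")
```

With these made-up inputs the p-value falls below .05, whereas the approximate Bayes factor is close to 1, meaning that the data hardly discriminate between the two hypotheses.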

In this article, we have assessed the evidential impact of Bem’s experiments in isolation. It is certainly possible to combine the information across experiments, for instance by means of a meta-analysis (Storm, Tressoldi, & Di Risio, 2010; Utts, 1991). We are ambivalent about the merits of meta-analyses in the context of psi: one may obtain a significant result by combining the data from many experiments, but this may simply reflect the fact that some proportion of these experiments suffer from experimenter bias and excess exploration. When examining different answers to criticism against research on psi, Price (1955, p. 367) concluded “But the only answer that will impress me is an adequate experiment. Not 1000 experiments with 10 million trials and by 100 separate investigators giving total odds against chance of 10^1000 to 1—but just one good experiment.”
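The concern about meta-analysis can be made concrete with a small simulation. The sketch below assumes that the null hypothesis is true in every experiment, that a fixed proportion of experiments are "exploratory" in the sense that the experimenter examines several independent outcome measures and reports only the most favorable one, and that the experiments are combined with Stouffer's unweighted z method. All names and parameter values (`n_experiments`, `n_measures`, `prop_biased`, and so on) are hypothetical choices for illustration and do not describe any published meta-analysis.

```python
import math
import random


def simulate_z(biased: bool, n_measures: int, rng: random.Random) -> float:
    """Reported z statistic of one experiment when H0 is true.

    An unbiased experiment reports a single standard normal z. A biased
    experiment (assumption) reports the largest of n_measures independent z's.
    """
    if biased:
        return max(rng.gauss(0.0, 1.0) for _ in range(n_measures))
    return rng.gauss(0.0, 1.0)


def stouffer_z(zs: list) -> float:
    """Combine z statistics with Stouffer's unweighted method."""
    return sum(zs) / math.sqrt(len(zs))


def significant_rate(prop_biased: float, n_experiments: int = 50,
                     n_measures: int = 5, n_meta: int = 2000,
                     seed: int = 1) -> float:
    """Fraction of simulated meta-analyses that are significant
    (one-sided, alpha = .05) even though there is no true effect."""
    rng = random.Random(seed)
    n_biased = round(prop_biased * n_experiments)
    hits = 0
    for _ in range(n_meta):
        zs = [simulate_z(i < n_biased, n_measures, rng)
              for i in range(n_experiments)]
        if stouffer_z(zs) > 1.645:
            hits += 1
    return hits / n_meta


honest_rate = significant_rate(prop_biased=0.0)
biased_rate = significant_rate(prop_biased=0.3)

print(f"all experiments confirmatory: {honest_rate:.3f}")
print(f"30% exploratory experiments:  {biased_rate:.3f}")
```

Under these assumptions the combined test should stay near its nominal error rate when every experiment is confirmatory, and it should reject the null hypothesis far more often once a minority of experiments report selectively, although no effect exists in any of them.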

Although the Bem experiments themselves do not provide evidence for precognition, they do suggest that our academic standards of evidence may currently be set at a level that is too low (see also Wetzels et al., to appear). It is easy to blame Bem for presenting results that were obtained in part by exploration; it is also easy to blame Bem for possibly overestimating the evidence in favor of H1 because he used p-values instead of a test that

