Table 2: The results of 10 crucial tests for the experiments reported in Bem (in press), reanalyzed using the default Bayesian t-test.

| Exp | df | \|t\| | p | BF_{01} | Evidence category (in favor of H.) |
|-----|-----|------|-------|------|-------------------|
| 1 | 99 | 2.51 | 0.01 | 0.61 | Anecdotal (H_{1}) |
| 2 | 149 | 2.39 | 0.009 | 0.95 | Anecdotal (H_{1}) |
| 3 | 96 | 2.55 | 0.006 | 0.55 | Anecdotal (H_{1}) |
| 4 | 98 | 2.03 | 0.023 | 1.71 | Anecdotal (H_{0}) |
| 5 | 99 | 2.23 | 0.014 | 1.14 | Anecdotal (H_{0}) |
| 6 | 149 | 1.80 | 0.037 | 3.14 | Substantial (H_{0}) |
| 6 | 149 | 1.74 | 0.041 | 3.49 | Substantial (H_{0}) |
| 7 | 199 | 1.31 | 0.096 | 7.61 | Substantial (H_{0}) |
| 8 | 99 | 1.92 | 0.029 | 2.11 | Anecdotal (H_{0}) |
| 9 | 49 | 2.96 | 0.002 | 0.17 | Substantial (H_{1}) |

Using the Bayesian t-test web applet provided by Dr. Rouder^{6} it is straightforward to compute the Bayes factor for the Bem experiments: all that is needed is the t-value and the degrees of freedom (Rouder et al., 2009). Table 2 shows the results. Out of the 10 critical tests, only one yields “substantial” evidence for H_{1}, whereas three yield “substantial” evidence in favor of H_{0}. The results of the remaining six tests provide evidence that is only “anecdotal” or “worth no more than a bare mention” (Jeffreys, 1961).
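To make the computation concrete, the following is a minimal sketch of the one-sample JZS Bayes factor described by Rouder et al. (2009), written with the Python standard library only. It is not the applet's code. It assumes a one-sample design (so that N = df + 1), a Cauchy prior on effect size, and a simple trapezoidal rule on a log-transformed integration variable. The function name `jzs_bf01`, the scale argument `r` (set to 1 here, which is an assumption about the default), and the integration limits are illustrative choices rather than anything taken from the source.

```python
import math


def jzs_bf01(t, n, r=1.0, lo=-30.0, hi=40.0, steps=20000):
    """Approximate BF01 for a one-sample t-test with a Cauchy(0, r) prior on effect size.

    t : observed t-value; n : sample size (assumed to be df + 1).
    r, lo, hi, steps are illustrative settings, not values from the source.
    """
    nu = n - 1
    # Marginal likelihood under H0 (up to a constant shared with H1).
    null_term = (1.0 + t * t / nu) ** (-(nu + 1) / 2.0)

    def integrand(u):
        g = math.exp(u)  # substitute g = exp(u) to tame the heavy right tail
        a = 1.0 + n * g
        like = a ** -0.5 * (1.0 + t * t / (a * nu)) ** (-(nu + 1) / 2.0)
        # Inverse-gamma(1/2, r^2/2) prior on g, which yields a Cauchy(0, r) prior on effect size.
        prior = r / math.sqrt(2.0 * math.pi) * g ** -1.5 * math.exp(-r * r / (2.0 * g))
        return like * prior * g  # the trailing g is the Jacobian dg = g du

    # Trapezoidal rule over u in [lo, hi].
    h = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    alt_term = total * h

    return null_term / alt_term


if __name__ == "__main__":
    # Experiment 1 in Table 2: |t| = 2.51, df = 99, so n = 100 under the one-sample assumption.
    print(round(jzs_bf01(2.51, 100), 2))
```

Under these assumptions the value for Experiment 1 should land close to the 0.61 listed in Table 2; small differences from the applet would reflect the crude numerical integration used here.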

In sum, a default Bayesian test confirms the intuition that, for large sample sizes, one-sided p-values higher than .01 are not compelling (see also Wetzels et al., to appear^{7}). Overall, the Bayesian t-test indicates that the data of Bem do not support the hypothesis of precognition. This is despite the fact that multiple hypotheses were tested, something that warrants a correction (for a Bayesian correction see Scott & Berger, 2010; Stephens & Balding, 2009).

Note that, even though our analysis is Bayesian, we did not select priors to obtain a desired result: the Bayes factors that were calculated are independent of the prior model odds, and depend only on the prior distribution for effect size; for this distribution, we used the default option. We also examined other options, however, and found that our conclusions are robust: for a wide range of different, non-default prior distributions on effect size the evidence for precognition is either non-existent or negligible.^{8}

At this point, one may wonder whether it is feasible to use the Bayesian t-test and eventually obtain enough evidence against the null hypothesis to overcome the prior skepticism outlined in the previous section. Indeed, this is feasible: based on the mean and sample standard deviations reported in Bem’s Experiment 1, it is straightforward to calculate that around 2000 participants are sufficient to generate an extreme Bayes factor BF_{01} of about 10^{−24}; when this extreme evidence is combined with the skeptical prior, the end

^{6} See http://pcl.missouri.edu/bayesfactor.

^{7} A preprint is available at http://www.ruudwetzels.com/.

^{8} This robustness analysis is reported in an online appendix available on the first author’s website, http://www.ejwagenmakers.com/papers.html.