“It isn’t that Liberals are ignorant. It’s just that they know so much that isn’t so.”
-- Ronald Reagan
I do not mean to pick on Liberals in this post, but when the Obama administration came into office a great deal was made, by Liberals, about how decisions would now be based on science rather than politics. Further, although both Republicans and Democrats have used shaky science to support their political positions, it seems that many of the most important public policy proposals (to combat global warming, for example) and political ad hominem attacks (Conservatives are more rigid than Liberals, for example) have been heavily promoted by Liberals and their colleagues in the media as being based on science. Whether we are discussing Democrats or Republicans, Liberals or Conservatives, it is worth recognizing when science is being misused and understanding the limitations of certain types of scientific inquiry.
A recent paper, submitted for publication by Eric-Jan Wagenmakers, Ruud Wetzels, Denny Borsboom, & Han van der Maas at the University of Amsterdam (available on Ruud Wetzels' web site), raises questions about how Psychologists analyze their data. The authors look specifically at recent studies purporting to show evidence for Psi, in particular precognition. Here is the abstract: [HT: Next Big Future]
Why Psychologists Must Change the Way They Analyze Their Data: The Case of Psi
Does psi exist? In a recent article, Dr. Bem conducted nine studies with over a thousand participants in an attempt to demonstrate that future events retroactively affect people’s responses. Here we discuss several limitations of Bem’s experiments on psi; in particular, we show that the data analysis was partly exploratory, and that one-sided p-values may overstate the statistical evidence against the null hypothesis. We reanalyze Bem’s data using a default Bayesian t-test and show that the evidence for psi is weak to nonexistent. We argue that in order to convince a skeptical audience of a controversial claim, one needs to conduct strictly confirmatory studies and analyze the results with statistical tests that are conservative rather than liberal. We conclude that Bem’s p-values do not indicate evidence in favor of precognition; instead, they indicate that experimental psychologists need to change the way they conduct their experiments and analyze their data.
You don't need to know much about statistics to understand the objections to the way Bem's experimental results were evaluated, or to see how the same problems might arise in other reports of scientific research. Here are the key points:
The most important flaws in the Bem experiments, discussed below in detail, are the following: (1) confusion between exploratory and confirmatory studies, brought about by what we have termed the Bem Exploration Method (BEM); (2) insufficient attention to the fact that the probability of the data given the hypothesis does not equal the probability of the hypothesis given the data (i.e., the fallacy of the transposed conditional); (3) application of a test that overstates the evidence against the null hypothesis, an unfortunate tendency that is exacerbated as the number of participants grows large.
What does it mean? As you read this, consider as a couple of examples the claims of the proponents of Anthropogenic Climate Change (née Anthropogenic Global Warming) or the "Science" of Intelligent Design. They cherry-pick their data and use their science to confirm preconceived ideas of causality; they then use those conclusions to propose extreme measures which would, coincidentally I'm sure, increase their own power, influence, and wealth.
Consider the first point, "confusion between exploratory and confirmatory studies". What this means is that the researchers collected their data first and then searched it for patterns that confirmed their pre-existing bias, rather than testing hypotheses they had committed to in advance; not only did they neglect alternative explanations of the data, they did not even look for them (a simple simulation following the quoted passage below shows how easily this inflates apparent significance):
These explorative elements are clear from Bem’s discussion of the empirical data. The problem with Bem’s BEM runs deeper, however, because we simply do not know how many other factors were taken into consideration only to come up short. We can never know how many other hypotheses were in fact tested and discarded; some indication is given above and in Bem’s section "The File Drawer". At any rate, the foregoing suggests that strict confirmatory experiments were not conducted.
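To make the danger of exploratory analysis concrete, here is a small simulation of my own (it is an illustration of the general point, not an analysis from the Wagenmakers et al. paper). It generates pure-noise data, "explores" ten outcome measures per study, and reports how often at least one of them comes out "significant" at the conventional .05 level:

```python
# Illustration only (not from Wagenmakers et al.): exploring many outcome
# measures on pure noise routinely produces a reportable "significant" effect.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies = 10_000   # simulated studies
n_outcomes = 10      # outcome measures "explored" in each study
n_subjects = 50      # participants per study

false_positive_studies = 0
for _ in range(n_studies):
    # Every outcome is pure noise: the null hypothesis is true by construction.
    data = rng.normal(loc=0.0, scale=1.0, size=(n_outcomes, n_subjects))
    # One-sample t-test of each outcome measure against zero.
    result = stats.ttest_1samp(data, popmean=0.0, axis=1)
    if result.pvalue.min() < 0.05:   # report only the "best" outcome
        false_positive_studies += 1

print(f"At least one p < .05 in {false_positive_studies / n_studies:.1%} of studies")
# Roughly 40% of pure-noise studies yield a "finding", versus the nominal 5%
# for a single, pre-specified (confirmatory) test.
```

The hypothesis chosen after peeking at the data always looks better than it deserves to; only a strictly confirmatory replication can tell you whether it survives.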
What is the fallacy of the transposed conditional?
The interpretation of statistical significance tests is liable to a misconception known as the fallacy of the transposed conditional. In this fallacy, the probability of the data given a hypothesis (e.g., p(D|H), such as the probability of someone being dead given that they were lynched, a probability that is close to 1) is confused with the probability of the hypothesis given the data (e.g., p(H|D), such as the probability that someone was lynched given that they are dead, a probability that is close to zero). This distinction provides the mathematical basis for Laplace’s Principle that extraordinary claims require extraordinary evidence. This principle holds that even compelling data may not make a rational agent believe that goldfish can talk, that the earth will perish in 2012, and that psi exists (see also Price, 1955). Thus, the prior probability attached to a given hypothesis affects the strength of evidence required to make a rational agent change his or her mind.
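The lynching example in the quoted passage can be made concrete with Bayes' rule. The numbers below are invented purely for illustration; the only point is that a conditional probability near 1 in one direction can coexist with a probability near 0 in the other:

```python
# Bayes' rule illustration of the transposed conditional.
# All numbers are hypothetical, chosen only to make the asymmetry obvious.
p_dead_given_lynched = 0.999   # P(D|H): probability of being dead, given lynched
p_lynched = 1e-7               # P(H): base rate of having been lynched
p_dead = 0.10                  # P(D): base rate of being dead in the population considered

# P(H|D) = P(D|H) * P(H) / P(D)
p_lynched_given_dead = p_dead_given_lynched * p_lynched / p_dead

print(f"P(dead | lynched)  = {p_dead_given_lynched}")       # close to 1
print(f"P(lynched | dead)  = {p_lynched_given_dead:.2e}")   # about 1e-6, close to 0
```

A p-value is a statement of the first kind, about the data given the null hypothesis; treating it as a statement of the second kind, about the hypothesis given the data, is exactly the fallacy being described.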
The third comment is a little more technical. It has to do with the type of statistical test used and the need to evaluate the data under more than one statistical approach in order to ensure that the conclusions are warranted by the data:
Consider a data set for which p = .001, indicating a low probability of encountering a test statistic that is at least as extreme as the one that was actually observed, given that the null hypothesis H0 is true. Should we proceed to reject H0? Well, this depends at least in part on how likely the data are under H1. Suppose, for instance, that H1 represents a very small effect—then it may be that the observed value of the test statistic is almost as unlikely under H0 as under H1. What is going on here?
The underlying problem is that evidence is a relative concept, and it is not insightful to consider the probability of the data under just a single hypothesis. For instance, if you win the state lottery you might be accused of cheating; after all, the probability of winning the state lottery is rather small. This may be true, but this low probability in itself does not constitute evidence—the evidence is assessed only when this low probability is pitted against the much lower probability that you could somehow have obtained the winning number by acquiring advance knowledge on how to buy the winning ticket.
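The lottery example can be restated as a simple likelihood-ratio calculation. The sketch below uses made-up probabilities of my own choosing: the win is very unlikely under honest play, but the evidence for cheating depends on how much more likely the win is under the cheating hypothesis, weighed against the prior probability that anyone cheats at all (Laplace's Principle again):

```python
# Evidence is relative: compare the same data under BOTH hypotheses.
# All probabilities are hypothetical, for illustration only.
p_win_given_honest = 1 / 10_000_000   # P(D | H0): a fair ticket wins the jackpot
p_win_given_cheat = 1 / 100           # P(D | H1): an insider scheme usually pays off
prior_cheat = 1 / 1_000_000           # P(H1): prior that any given winner cheated

# Likelihood ratio: how much better H1 predicts the win than H0 does.
likelihood_ratio = p_win_given_cheat / p_win_given_honest   # 100,000 to 1

# Posterior odds = likelihood ratio * prior odds.
prior_odds = prior_cheat / (1 - prior_cheat)
posterior_odds = likelihood_ratio * prior_odds
posterior_prob_cheat = posterior_odds / (1 + posterior_odds)

print(f"Likelihood ratio (cheat vs. honest): {likelihood_ratio:,.0f}")
print(f"Posterior probability of cheating:   {posterior_prob_cheat:.1%}")   # about 9%
```

Even a likelihood ratio of 100,000 to 1 leaves the posterior probability of cheating at roughly 9% under this prior: the improbability of the win by itself proves nothing, and an extraordinary hypothesis still needs extraordinary evidence. This is the same logic behind the default Bayesian t-test the authors use to reanalyze Bem's data.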
Even the most well-meaning Scientists can all too easily fool themselves. Using statistics and the most carefully designed studies, they are still fully capable of "discovering" nonsense. One might think that, after so many failures of expert opinion in the last few hundred (thousand?) years, some modesty would be forthcoming from those who believe they have the answers. Alas, there is no indication that those who desire to control how other people live have recognized, let alone assimilated, the natural boundaries of certainty.