Some myths concerning statistical hypothesis testing

Sturla Molden sturla at molden_dot_net.invalid
Thu Nov 7 05:48:05 EST 2002

Glen M. Sizemore wrote:

> 1.) Tests of statistical significance do not provide a quantitative
> estimate of the reliability of the result.

They don't. Do neuroscientists generally believe this?

> 2.) Tests of statistical significance do not estimate the probability that
> the results were due to chance.

They estimate the probability of getting results at least
this impressive by fluke, assuming the null hypothesis is true.
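To make that concrete, here is a hypothetical example (the numbers are mine, not from the post): suppose 100 coin flips yield 60 heads. A Monte Carlo sketch of the one-sided p-value simply asks how often a fair coin does at least that well:

```python
import random

random.seed(42)

# Hypothetical data: 100 flips, 60 heads. The p-value is NOT the
# probability that the result "was due to chance"; it is the probability
# of a result at least this extreme IF chance (a fair coin) were the
# only thing operating.
n_flips, observed_heads = 100, 60

def simulate_heads(n):
    """Number of heads in n flips of a fair coin (the null hypothesis)."""
    return sum(random.random() < 0.5 for _ in range(n))

trials = 10_000
at_least_as_extreme = sum(simulate_heads(n_flips) >= observed_heads
                          for _ in range(trials))
p_value = at_least_as_extreme / trials
print(f"one-sided p-value ~ {p_value:.3f}")
```

Note that nothing in the computation refers to the probability that the coin is actually fair; that question needs a prior, which is exactly the Bayesian point.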

> 3.) Tests of statistical significance usually do not answer a question to
> which the answer is unknown.

Also true.

Statisticians with knowledge of Bayesian theory have been making
these points ever since Pearson, Neyman and Fisher began publishing
their work in the early 20th century. This is not new. Unfortunately,
the majority of contemporary researchers didn't learn about Bayesian
statistics in their applied statistics class. And what actually
matters seems to be the probability of getting a paper accepted,
not the reliability of the result. Since this usually means
convincing a referee who knows nothing about statistics, one has
to speak a language the referee knows. And that means using
"classical" tests. There are generally two penalties for using
proper statistical tests:

- "Classical" tests overestimate the true significance,
sometimes by an order of magnitude. This is good for getting
papers accepted. And nobody gets fired for using "classical"
statistics.

- Referees don't know Bayesian statistics and will ask
for the flawed "classical" tests anyhow.
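A minimal sketch of the first point, on a hypothetical data set (60 heads in 100 flips, assumed for illustration): the classical p-value clears the 5% bar, while a simple Bayes factor comparing the fair coin against a uniform prior on the bias comes out near even odds.

```python
from math import comb

# Hypothetical data: 60 heads in 100 flips of a possibly biased coin.
n, k = 100, 60

# Classical one-sided p-value: P(at least k heads | fair coin).
p_value = sum(comb(n, i) for i in range(k, n + 1)) * 0.5**n

# Bayes factor comparing the null (fair coin) against an alternative
# that puts a uniform prior on the unknown bias. The marginal
# likelihood under that prior integrates to 1/(n + 1) for any k.
like_null = comb(n, k) * 0.5**n
like_alt = 1.0 / (n + 1)
bf01 = like_null / like_alt

print(f"p-value = {p_value:.3f}")                  # below 0.05
print(f"Bayes factor (null vs alt) = {bf01:.2f}")  # roughly even odds
```

Under these assumptions the "significant" p-value corresponds to evidence that barely moves the needle either way, which is the sense in which the classical test overstates the case.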

As long as science is about getting papers published and
researchers aren't educated about Bayesian statistics,
this will not change. Scientists generally don't know
what the tests imply, and don't care either.

There is a way out of this dilemma, one that anyone can
(and probably should) use:

There is a third school of statisticians: those who rely on
graphing data. The idea is that one should know enough
about plotting data, and the many ways to plot data, to
evaluate the results visually (this is usually possible).
If the effect doesn't show on a properly selected graph,
it is not interesting regardless of its "significance",
as the effect size must be negligible. But if the graph is
convincing, classical statistics still doesn't matter.
Investigations have shown that visual evaluation is superior
to "tests of significance" on properly graphed data sets,
and that the correspondence between visual judgement
and exact Bayesian tests is usually high.
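The effect-size point can be sketched numerically (hypothetical numbers, assuming nothing beyond the standard normal model): with a large enough sample, a mean shift far too small to show on any graph still reaches an impressive p-value.

```python
import math
import random

random.seed(0)

# Hypothetical example: a true mean shift of 0.02 standard deviations --
# an effect no sensible graph would reveal -- tested on a large sample.
n = 100_000
true_shift = 0.02
sample = [random.gauss(true_shift, 1.0) for _ in range(n)]

mean = sum(sample) / n
se = 1.0 / math.sqrt(n)     # known sigma = 1, so a simple z-test applies
z = mean / se
# One-sided p-value from the standard normal survival function.
p_value = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

print(f"effect size = {mean:.3f} sd, z = {z:.1f}, p = {p_value:.1e}")
```

The p-value is tiny, yet two histograms of the groups would overlap almost perfectly; a graph makes the negligible effect size obvious while the test alone hides it.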

There are almost only good aspects to this approach. It
evaluates the effect size as well as the "significance".
Publishers like papers with nice figures. And one can still
include classical statistics to convince undereducated
referees.

William Cleveland's books "Visualizing Data" and "The Elements
of Graphing Data" are good places to start.

Sturla Molden
