KH test and SH test

Kerstin Hoef-Emden kerstin.hoef-emden at
Thu May 17 16:15:45 EST 2001

Hi to all,

I am at the moment busy learning about the proper use of Kishino-Hasegawa
tests and Shimodaira-Hasegawa tests. Until now it has been standard in our
working group to use KH tests on treefiles consisting of the maximum
likelihood tree and a posteriori user-defined trees. But now the SH test has
been implemented in PAUP and we want to get an idea of how to use it.

In the KH test, the result for each comparison of a user-defined tree with
the best tree is independent of the number of trees in the treefile and of
the topologies of the other trees, because each tree is always tested against
the best tree only. The SH test is designed to behave differently. As far as
I understand it, the results take into account the number of trees in the
treefile as well as their topologies to generate the distribution from which
the significance limits are inferred. The resulting "bubble gum" behaviour of
the test troubles us.
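
To make the difference concrete, here is a minimal sketch of both tests
using RELL resampling of per-site log-likelihoods. It is not the PAUP
implementation: the function names (centered_rell_totals, kh_pvalue,
sh_pvalue) are made up for illustration, the KH variant shown is the
one-sided bootstrap form, and the per-site log-likelihoods are random
numbers rather than values from real trees.

```python
import numpy as np

def centered_rell_totals(site_lnl, n_boot, rng):
    """Resample sites with replacement (RELL) and return centered
    bootstrap log-likelihood totals, shape (n_boot, n_trees)."""
    n_sites = site_lnl.shape[0]
    idx = rng.integers(0, n_sites, size=(n_boot, n_sites))
    totals = site_lnl[idx].sum(axis=1)
    # Centering each tree's column imposes the null hypothesis.
    return totals - totals.mean(axis=0)

def kh_pvalue(obs, boot, x):
    """One-sided KH-style p-value: tree x against the best tree only."""
    best = int(np.argmax(obs))
    t = obs[best] - obs[x]
    d = boot[:, best] - boot[:, x]
    return float(np.mean(d >= t))

def sh_pvalue(obs, boot, x):
    """SH-style p-value: tree x against the maximum over ALL trees."""
    t = obs.max() - obs[x]
    s = boot.max(axis=1) - boot[:, x]
    return float(np.mean(s >= t))

# Synthetic per-site log-likelihoods (assumption: made-up data, 4 trees).
rng = np.random.default_rng(1)
n_sites = 300
site_lnl = rng.normal(-5.0, 1.0, size=(n_sites, 1)) \
    + rng.normal(0.0, 0.3, size=(n_sites, 4))
# Reorder columns so that column 0 is the best (highest total) tree.
site_lnl = site_lnl[:, np.argsort(-site_lnl.sum(axis=0))]

obs = site_lnl.sum(axis=0)
boot = centered_rell_totals(site_lnl, 1000, rng)

# Test tree 1 in a two-tree file and in a four-tree file,
# reusing the same bootstrap resamples for both.
for cols in ([0, 1], [0, 1, 2, 3]):
    print(len(cols), "trees:",
          "KH p =", kh_pvalue(obs[cols], boot[:, cols], 1),
          "SH p =", sh_pvalue(obs[cols], boot[:, cols], 1))
```

With the same resamples, the KH p-value for a given tree is identical in
both treefiles, and with only two trees the SH p-value coincides with it.
Adding further trees that are no better than the best tree can only leave
the SH p-value unchanged or raise it, because the maximum is taken over a
larger set.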

We are a phycology group and work predominantly with organisms without a
fossil record. So we are often confronted with groups of organisms with
little or no auxiliary information about the history or value of their
morphological characters.

1.) The KH test was designed to compare two a priori defined trees with each
other. It is said that using it to compare the maximum likelihood tree
with a user-defined tree is wrong. First of all, because when the best tree
is involved, a one-tailed approach has to be used. Since no better trees can
be expected, this argument is understandable to me. But it is also said to
be wrong to include a posteriori defined trees in the test. This raises a
question which may be a little bit provocative:

What is the practical value of comparing one tree based solely on
speculations with another tree based likewise on speculations about the
evolutionary history of the group in question? Most probably both trees are
wrong, perhaps miles apart from the maximum likelihood, maximum parsimony or
distance tree.

2.) The dependence of the SH test on the treefile fed in has led to
discussions in our group. We don't feel very certain when using it. The
user-defined trees in the tests are usually minor modifications of the
maximum likelihood tree. We don't expect to find a better tree; we want to
know "how much worse" a tree is when one clade or taxon of specific
interest is transferred to this or that position. Usually a certain
correlation between bootstrap support and these transfer experiments can be
observed. Sometimes we compare the results of tests performed under
different conditions (altered taxon sampling, deleted outgroup, etc.) or
different gene phylogenies with each other. This seems to be impossible with
the non-independent results of the SH test, since the treefiles have to be
different, e.g. when two SH tests are performed on a phylogeny, one with the
complete dataset and one with a dataset from which one or more clades have
been deleted.

Quite possibly KHT and SHT are wrong for what we intend and perhaps some
statisticians already suffer from a heart attack when reading what we do to
their tests. What are your suggestions/comments on this item?

Thanks in advance,


Kerstin Hoef-Emden


kerstin.hoef-emden at

