Testing Models against a tree (help)

Joe Felsenstein joe at evolution.genetics.washington.edu
Tue Jul 25 23:39:50 EST 1995

In article <1995Jul25.213229.40081 at ac.dal.ca>,  <aroger at ac.dal.ca> wrote:
>In article <3umno4$fb9 at nntp3.u.washington.edu>, joe at evolution.genetics.washington.edu (Joe Felsenstein) writes:
[in response to my assertions that model B will always do at least as well]
>Is it really inevitable if I fix the ancestral states (0 for all
>characters for model B, whereas 1 is fixed for all characters
>under model A)?  I can see your point if there were no conditions
>imposed upon the ancestral states.

You are correct.  I was assuming you allowed the method to choose whichever
state it wanted as the ancestral state.  I am not sure why you want to
use these two choices, but yes, you're right: it is not obvious which will
be better under those circumstances.

>Is the Templeton paired sites test the same as the widely discussed
>"Templeton test" of whether one tree is significantly better than another
>given the data?  If not could you suggest a reference that I might
>look to?

Templeton's original test appeared in Evolution in 1983.  It was rather
complicated, dividing restriction sites up by enzyme and then doing a
Wilcoxon signed-ranks test.  Basically one pairs up the sites in the
two trees, and does some kind of paired nonparametric or parametric
test of whether one is on average bigger than the other.
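
As a rough illustration of the paired nonparametric idea, here is a minimal
Python sketch of the Wilcoxon signed-rank sums computed from per-site
differences.  The function name and the input format (a list holding, for
each site, the number of steps on tree 1 minus the number on tree 2) are
assumptions made for this example; it does not reproduce Templeton's
enzyme-by-enzyme procedure, and it stops at the rank sums without computing
a p-value.

```python
def signed_rank_sums(diffs):
    """Return (W+, W-) for a list of per-site differences.

    Zero differences are dropped; tied absolute values get average ranks.
    """
    ordered = sorted((d for d in diffs if d != 0), key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of entries sharing the same absolute value.
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical differences, for illustration only.
print(signed_rank_sums([1, -2, 3, 0]))
```

If the two trees fit equally well, W+ and W- should be of similar size; the
smaller of the two is then compared against a signed-rank table.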

Many people have proposed variants.  Allan Wilson suggested a simple
sign test (evaluated a bit in a paper in Syst. Zool. by me in 1986).
In my program package I do a simple "z test" assuming normality, which
is too much of a simplification.  The best suggestion is one I made to
Kishino and Hasegawa, that they bootstrap from among the pairs to get
the bootstrap distribution of the sum of differences.
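
The three variants above can be sketched side by side in Python.  This is
only an illustration under stated assumptions: the function names, the
input format (one difference per site, tree 1 minus tree 2), and the
example data are hypothetical, and the z statistic here is a plain normal
approximation rather than the code from any program package.

```python
import math
import random


def sign_test(diffs):
    """Two-sided exact sign test p-value on the nonzero differences."""
    pos = sum(1 for d in diffs if d > 0)
    neg = sum(1 for d in diffs if d < 0)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def z_statistic(diffs):
    """Sum of differences divided by its estimated standard error."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        raise ValueError("all differences are equal; z is undefined")
    return sum(diffs) / math.sqrt(n * var)


def bootstrap_sums(diffs, replicates=1000, seed=0):
    """Resample sites with replacement; return the sum for each replicate."""
    rng = random.Random(seed)
    n = len(diffs)
    return [sum(rng.choices(diffs, k=n)) for _ in range(replicates)]


# Hypothetical per-site differences, for illustration only.
diffs = [1, 0, 0, 1, -1, 0, 2, 0, 1, 0, 0, 1]
sums = sorted(bootstrap_sums(diffs))
lower = sums[int(0.025 * len(sums))]
upper = sums[int(0.975 * len(sums)) - 1]
print("sign test p:", sign_test(diffs))
print("z:", z_statistic(diffs))
print("bootstrap 95% interval for the sum:", (lower, upper))
```

If the bootstrap interval for the sum of differences excludes zero, that
suggests one tree fits better than the other beyond what site-to-site
sampling variation would explain.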

>I have another test in mind.  Perhaps someone could tell me
>if it is sound.
>Under model A I could construct a tree from the data and measure the
>tree to tree distance from the "known" tree of the organisms.  I could
>then construct a tree under model B from the data and derive the
>tree to tree distance of this tree compared to the known tree.  If
>model B yields a tree closer to the true tree (smaller tree to tree
>distance) then I suggest that this model is more realistic.  The
>only problem I can see is whether the tree under model B is 
>significantly closer to the true tree than the tree under model A--
>i.e., I need a statistic attached to the tree to tree distance measures.
>Does this sound reasonable?

I defer to Mary Kuhner who has already answered this, but basically
just looking to see which is closer is fine but not statistical.  You
can say "A looks closer" but you can't say "A looks significantly closer"
unless you do some statistics, such as the bootstrap test she suggests.

Joe Felsenstein         joe at genetics.washington.edu     (IP No.
 Dept. of Genetics, Univ. of Washington, Box 357360, Seattle, WA 98195-7360
