bootstrapping evolutionary trees
joe at evolution.genetics.washington.edu
Sat May 13 01:43:57 EST 1995
In article <D841DF.8s2 at info.swan.ac.uk>,
David Skibinski <D.O.F.Skibinski at Swansea.AC.UK> wrote:
>With three species, A, B, and C there
>are three rooted trees I (AB) C II (AC) B and III A (BC). My a priori
>expectation is that these trees can be equally probable,
>thus p=0.33 for tree I. I get some data, make a tree and do
>bootstrapping and find that tree I is supported in a
>proportion of 0.99 of a large sample of bootstrap reps.
>My question is, can I use a likelihood ratio test to compare
>0.99 with 0.33 ? If so what is the test ? Is it
>G = 2ln((1-0.33)/(1-0.99)) with 1 (or 2) df or something else ?
It would be an exact binomial test, computing the probability of
getting 0.99 for one outcome when all three are in fact equiprobable.
And one would have to correct by a factor of 3 for the fact that there
were three trees that each could have been accidentally supported by
this much. BUT ... this smacks of tests on tests on tests on ... and I
would be wary.
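
As a rough illustration of the calculation described above, here is a minimal Python sketch of the exact binomial tail probability with the factor-of-3 correction. The replicate count (1000) and the number supporting tree I (990) are made-up values, since the question only says "a large sample", and the function names are hypothetical. The sketch also treats the replicates as independent trials under equiprobable trees, which is an assumption of the calculation and not something established here.

```python
from fractions import Fraction
from math import comb, log10


def binomial_upper_tail(n_reps, n_support, p_null=Fraction(1, 3)):
    """Exact P(X >= n_support) for X ~ Binomial(n_reps, p_null)."""
    q_null = 1 - p_null
    return sum(
        comb(n_reps, i) * p_null**i * q_null ** (n_reps - i)
        for i in range(n_support, n_reps + 1)
    )


def corrected_p_value(n_reps, n_support, n_trees=3):
    """Tail probability multiplied by the number of candidate trees, capped at 1."""
    p_null = Fraction(1, n_trees)
    tail = binomial_upper_tail(n_reps, n_support, p_null)
    return min(Fraction(1), n_trees * tail)


def log10_fraction(x):
    """log10 of a positive Fraction, safe for values too small for a float."""
    return log10(x.numerator) - log10(x.denominator)


# Assumed values for illustration: 1000 replicates, 990 supporting tree I.
p = corrected_p_value(1000, 990)
print(f"log10(corrected p) = {log10_fraction(p):.1f}")
```

Exact fractions are used because the tail probability for numbers like these is far below the smallest positive float, so the result is reported on a log10 scale.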
>Then if I get two other data sets and find bootstrap probabilities
>of 0.95 and 0.75 for tree I, how can these be combined with 0.33
>in a single test, and what would be the degrees of freedom ? I
>am puzzled by the fact that in the `combining probabilities test`
>(as described in Biometry by Sokal and Rohlf) which appears to be a
>likelihood ratio test in which the alternative hypotheses are p=1,
>the df = 2k where k is number of probabilities. Yet in my example
>there are only three data sets, so should sum to 3 df.
Don't. You would probably be better off taking the bootstrapped
data sets from each sampling, appending them end-to-end as if the
sequences were as long as the sum of their lengths, and running
that as a set of bootstrap samples.
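
A minimal Python sketch of that suggestion, assuming each data set is an alignment stored as a dict mapping taxon name to an aligned sequence string (a representation chosen here for illustration, as are the function names). Each replicate resamples the sites of every data set separately and then appends the results end-to-end. Tree inference on the replicates is left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the sites (columns) of one alignment with replacement."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[c] for c in cols) for taxon, seq in alignment.items()}


def concatenate(alignments):
    """Append alignments end-to-end, taxon by taxon."""
    taxa = list(alignments[0].keys())
    return {t: "".join(a[t] for a in alignments) for t in taxa}


def combined_bootstrap_replicates(alignments, n_reps, seed=0):
    """Build n_reps replicates, each the concatenation of one bootstrap
    sample drawn from every data set."""
    rng = random.Random(seed)
    return [
        concatenate([bootstrap_alignment(a, rng) for a in alignments])
        for _ in range(n_reps)
    ]


# Toy data sets (made up) for species A, B and C.
set1 = {"A": "ACGTAC", "B": "ACGTTC", "C": "ATGTTC"}
set2 = {"A": "GGCA", "B": "GGCA", "C": "GACA"}
replicates = combined_bootstrap_replicates([set1, set2], n_reps=100)
```

Each replicate has length equal to the sum of the data set lengths, and each can then be passed to whatever tree-building method was used on the original data.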
Joe Felsenstein joe at genetics.washington.edu (IP No. 18.104.22.168)
Dept. of Genetics, Univ. of Washington, Box 357360, Seattle, WA 98195-7360