Making alignments
Andrew Rambaut
andrew.rambaut at zoology.oxford.ac.uk
Fri Jan 23 11:09:03 EST 1998
>This seems to me to be an overly optimistic view of the value of maximum
>likelihood approaches. In general, maximum likelihood is vulnerable to
>errors in the chosen model of evolution and in the parameter estimates
>that are used. Granted, there are ways to test which of a limited set of
>models best explains the data and parameter values can be estimated (with
>error) from the data. However, it is a certainty that the perfectly
>"correct" model and parameter values will never be tested. The
>magnitude of the resulting errors in phylogeny estimation remains unknown
>in particular cases. While it is possible that a thorough ML analysis
>might usually yield a tree that is close to the correct one, I worry that
>overly optimistic presentations of the power of ML will lead those less
>familiar than Dr. Goldman with the limitations of ML to develop too much
>confidence in those results.
Take a tree like this (a classic MP Felsenstein Zone case, I believe):
tip1 \
      \
       \
        \   tip2
         \ /
          |
         / \
        /   tip3
       /
      /
     /
tip4
I then set the long branches to 1.0 subst/site and the short branches
(including the internal one) to 0.1 subst/site. I also created another
tree where the long branches are 5.0 subst/site (fairly well saturated).
I generated sequences using my program SeqGen (HKY model, ts/tv 2.0,
equal base freqs). For each tree I generated data sets of 500 bp and of
5000 bp, 100 replicates each. I then calculated the likelihood of each
of the tree topologies and ran a Kishino-Hasegawa test on them. The
results were:
Tree 1 (long branches=1.0, short=0.1)
500 bp: 58 correct trees (0 significantly right), 42 wrong (0 significantly wrong)
5000 bp: 98 correct trees (47 significantly right), 2 wrong (0 significantly wrong)
Tree 2 (long branches=5.0, short=0.1)
500 bp: 56 correct trees (0 significantly right), 44 wrong (0 significantly wrong)
5000 bp: 61 correct trees (0 significantly right), 39 wrong (0 significantly wrong)
By significantly right, I mean that the wrong trees are rejected in
favour of the right one.
By significantly wrong, I mean that the right tree is rejected in favour
of one of the wrong ones.
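A Kishino-Hasegawa-style comparison and the right/wrong classification
used above can be sketched from per-site log-likelihoods. This is not
the code behind the numbers reported here. It uses the normal
approximation to the test, takes hypothetical per-site log-likelihood
lists as input, and the names kh_test and classify, as well as the 0.05
threshold, are assumptions made for illustration.

```python
import math


def kh_test(site_lnl_a, site_lnl_b):
    """Compare two trees from their per-site log-likelihoods.

    Returns (delta, z, p): the total log-likelihood difference (a - b),
    its standardised value, and a two-sided p-value from the normal
    approximation.
    """
    if len(site_lnl_a) != len(site_lnl_b):
        raise ValueError("both trees need one log-likelihood per site")
    n = len(site_lnl_a)
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    delta = sum(diffs)
    mean = delta / n
    # Variance of the summed difference, estimated from site-to-site variation.
    var_delta = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    if var_delta == 0.0:
        raise ValueError("per-site differences show no variation")
    z = delta / math.sqrt(var_delta)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return delta, z, p


def classify(site_lnl, true_tree, alpha=0.05):
    """Label one replicate using the definitions given in the post.

    site_lnl maps a tree name to its list of per-site log-likelihoods.
    """
    totals = {name: sum(values) for name, values in site_lnl.items()}
    best = max(totals, key=totals.get)
    if best == true_tree:
        # "Significantly right": every wrong tree is rejected.
        others = [name for name in site_lnl if name != best]
        rejected = all(
            kh_test(site_lnl[best], site_lnl[name])[2] < alpha
            for name in others
        )
        return "significantly right" if rejected else "right"
    # "Significantly wrong": the right tree is rejected against the best one.
    rejected = kh_test(site_lnl[best], site_lnl[true_tree])[2] < alpha
    return "significantly wrong" if rejected else "wrong"
```

The per-site log-likelihoods would come from whatever ML program
evaluated the fixed topologies; the sketch only covers the comparison
step.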
These are just some quick and dirty results, so don't quote me on them.
It goes to show that saturation causes uncertainty in ML phylogenetic
estimation. Although you can get wrong answers, you don't get
significantly wrong ones, which is what we want in a phylogenetic
method. Of course these simulations assume a model of substitution that
we know to be correct (I simulated it that way). My point really is that
if you assume a model and the assumptions are wrong, you will get
invalid results. The benefit of ML is that the model is explicit and the
assumptions testable (see Goldman, 1993 for example; actually both of
his 1993 papers are good).
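As a rough illustration of the simulation set-up and of why 5.0
subst/site counts as saturated, here is a minimal Python sketch (not
SeqGen itself). It assumes the tree is ((tip1,tip2),(tip3,tip4)) with
tip1 and tip4 on the long branches, uses the Kimura two-parameter
formulas (which is what HKY reduces to with equal base frequencies), and
assumes that "ts/tv 2.0" means the ratio of transition to transversion
events. All function names are made up for this example.

```python
import math
import random

BASES = "ACGT"
# Transition partner of each base (A<->G, C<->T).
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}


def k2p_probs(d, tstv=2.0):
    """Return (p_same, p_transition, p_each_transversion) for a branch.

    d is the branch length in expected substitutions per site; tstv is
    assumed to be alpha / (2 * beta), the ratio of transition to
    transversion events.
    """
    alpha_t = d * tstv / (tstv + 1.0)   # transition rate * time
    beta_t = d / (2.0 * (tstv + 1.0))   # per-transversion rate * time
    e1 = math.exp(-4.0 * beta_t)
    e2 = math.exp(-2.0 * (alpha_t + beta_t))
    p_same = 0.25 + 0.25 * e1 + 0.5 * e2
    p_ts = 0.25 + 0.25 * e1 - 0.5 * e2
    p_tv = 0.25 - 0.25 * e1
    return p_same, p_ts, p_tv


def expected_p_distance(d, tstv=2.0):
    """Expected proportion of differing sites across a path of length d."""
    return 1.0 - k2p_probs(d, tstv)[0]


def evolve(seq, d, rng, tstv=2.0):
    """Evolve a sequence along one branch of length d."""
    p_same, p_ts, _ = k2p_probs(d, tstv)
    out = []
    for base in seq:
        u = rng.random()
        if u < p_same:
            out.append(base)
        elif u < p_same + p_ts:
            out.append(TRANSITION[base])
        else:
            # The two transversions are equally likely.
            choices = [b for b in BASES if b not in (base, TRANSITION[base])]
            out.append(rng.choice(choices))
    return "".join(out)


def simulate(n_sites, long_len, short_len=0.1, seed=None):
    """Simulate one four-taxon data set on the assumed tree."""
    rng = random.Random(seed)
    # Start at the internal node joining tip1 and tip2; the model is
    # reversible, so the choice of starting node does not matter.
    node_a = "".join(rng.choice(BASES) for _ in range(n_sites))
    node_b = evolve(node_a, short_len, rng)  # short internal branch
    return {
        "tip1": evolve(node_a, long_len, rng),
        "tip2": evolve(node_a, short_len, rng),
        "tip3": evolve(node_b, short_len, rng),
        "tip4": evolve(node_b, long_len, rng),
    }
```

Under these assumptions, expected_p_distance(2.1) is roughly 0.67 for
the tip1-to-tip4 path on tree 1, while expected_p_distance(10.1) for
tree 2 sits within a fraction of a percent of the 0.75 ceiling, so the
long branches on tree 2 carry almost no signal.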
Andrew
===================================================================
Andrew Rambaut,                 EMAIL  andrew.rambaut at zoo.ox.ac.uk
Zoology Department,             WWW    http://evolve.zoo.ox.ac.uk/
University of Oxford,           TEL    +44 1865 271272
South Parks Road, Oxford, UK    FAX    +44 1865 271249
===================================================================
More information about the Molevol mailing list