In article <8uc404$2dr$1 at mercury.hgmp.mrc.ac.uk>, Guy Hoelzer
<hoelzer at unr.edu> wrote:
> In article <8uc0ke$jme$1 at mercury.hgmp.mrc.ac.uk>, Anders Gorm Pedersen
> <gorm at cbs.dtu.dk> wrote:
> > Mich Ard wrote:
> >
> > > I'm currently making a phylogenetic tree
> > > based on the protein sequences, using the
> > > programs of clustalW and phylip, and found
> > > the output is "unstable", it's always changing
> > > depending on the inputing order of these
> > > sequences;
> >
> > you should be aware that the apparent instability may in fact merely be
> > different graphical representations of the exact same tree. Remember that
> > even if you rotate a branch then it's still the same tree topology.
> >
> > I just experimented with using the differently ordered versions of one
> > distance matrix as input to neighbor (of the phylip package). This
> > results in trees that are topologically identical, but that are rendered
> > differently by drawtree (also phylip package). This is mainly due to a
> > different orientation of the entire tree, and also due to some branches
> > being rotated.
> Anders makes a good point, but it is still possible (even probable) that
> the topologies you discovered are truly different. Every heuristic
> algorithm is inherently entry-order sensitive. If you were using the
> maximum parsimony algorithm, and did an exhaustive search, then you would
> always get the same tree no matter what your entry-order was. If you used
> a heuristic search algorithm, the order would matter. Neighbor joining is
> a heuristic algorithm, and is therefore always prone to entry-order
> sensitivity. NJ results should always be checked by trying a variety of
> entry orders. The same is true of maximum likelihood analyses. It is
> almost always too computationally intensive to do exhaustive searches in
> the context of ML analyses, so heuristic searches are almost always used.
> This makes the result of ML analyses prone to entry-order sensitivity. As
> your question indicates, you are correct to be concerned about this. It
> certainly suggests uncertainty about which tree actually optimizes the
> tree picking criterion you chose when you decided which tree estimation
> algorithm to use.
And if you read carefully, Guy just gave the way to tell whether you are
getting the same tree or different trees: if they are all equally
parsimonious, equally likely, or an equally good fit (for parsimony, ML,
and FITCH respectively), then you have the same tree, just drawn in various
ways. If the trees have different parsimony scores (or likelihoods, or sums
of squares), you have a true input-order effect, and should run the analysis
multiple times with different random orderings, choosing the tree with the
best score. (This doesn't work for neighbor joining, which has no optimality
criterion. Maybe neighbor joining isn't the best idea anyway. A
neighbor-joining tree should always be tested by running it through a
least-squares best-fit algorithm such as FITCH.)
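If you want to automate the "variety of entry orders" part, here is a rough
Python sketch (not part of phylip; the jumble() helper and the output file
names are just placeholders I made up). It writes several randomly shuffled
copies of a sequential-format PHYLIP input file, assuming one taxon per line
after the header line; an interleaved file would need a real parser.

#!/usr/bin/env python
# Rough sketch: write several randomly jumbled copies of a sequential-format
# PHYLIP input file, so the same data can be run through the tree-building
# programs in several different entry orders.  Assumes one taxon per line
# after the header line; interleaved files need a real parser instead.
import random
import sys

def jumble(infile, n_copies=5, seed=1):
    with open(infile) as fh:
        header = fh.readline()   # first line: number of taxa (and sites)
        # keep non-blank taxon lines, making sure each ends with a newline
        taxa = [l if l.endswith("\n") else l + "\n" for l in fh if l.strip()]
    rng = random.Random(seed)
    for i in range(n_copies):
        shuffled = taxa[:]
        rng.shuffle(shuffled)    # new entry order, same data
        outname = "%s.jumble%d" % (infile, i + 1)
        with open(outname, "w") as out:
            out.write(header)
            out.writelines(shuffled)
        print("wrote", outname)

if __name__ == "__main__":
    jumble(sys.argv[1])

Rename each jumbled copy to "infile" before a run (or give its name when the
program asks for an input file), then compare the score reported in outfile
across runs (number of steps, ln likelihood, or sum of squares, depending on
the program). The same score every time means you are just seeing the same
tree redrawn; different scores mean a real entry-order effect, and you keep
the run with the best score. Many of the phylip tree-building programs also
have a J (jumble) option on their menus that randomizes the input order for
you, which may be simpler than shuffling the file yourself.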
--
*Note the obvious spam-defeating modification
to my address if you reply by email.