
Decay Indices - summary

Joe Felsenstein joe at evolution.genetics.washington.edu
Fri Apr 21 01:27:01 EST 1995

In article <patersoa.53.000BF4BA at lincoln.ac.nz>,
Paterson, Adrian Mark <patersoa at lincoln.ac.nz> wrote:
(an interesting summary of comments others sent him about decay
indices including these comments on bootstraps versus decay indices
(also called Support Indices) by Doug Eernisse):

>SI values vs. Bootstrap values:
>A high SI generally corresponds to a high bootstrap value (with some
>infrequent exceptions)

This seems true but little is yet known about statistical properties of
SI.  Perhaps Doug would argue that statistical inference is not the right
framework for thinking about this anyway (I believe that this is Farris's
and Kluge's current position ... I think ... and maybe his too).  In fact,
I'd appreciate Doug pointing me to any articles by Farris or by Kluge
which say clearly whether they prefer the statistical or the
logical-parsimony or the hypothetico-deductive frameworks as the bases of
inferring phylogenies.  I must have missed the article where they said that.

> but:
>1. SI calculations are based on exactly the same data set as your
>parsimony search (i.e., they are based on all available evidence).

So are bootstrap values.

>2. SI calculations have a more direct and intuitive relationship to
>the "robustness" of nodes.

My intuition works differently, I guess.

>3. SI calculations do not confound statistical support for the reality
>of a particular tree with a separate issue of how likely the same tree
>would be obtained if you had more characters from the same universe
>of characters, whether or not it is the "true" tree.

I am not sure how SI achieves this.  This seems to be a statement that
SI is not misled by inconsistency problems the way bootstraps will be.  But
see assertion 5 below.

>4. Some of the assumptions of a bootstrap analysis are frequently violated,
>for example, it depends on the i.i.d. assumptions identified by Felsenstein
>and others, which require that the characters be identically and
>independently distributed. Sanderson has separated these assumptions into
>two less restrictive assumptions, namely, that characters are independent,
>and that the observed character set is a "representative" sample of the
>"universe of characters" (paraphrased from PAUP 3.1 manual, p. 56).

I would alter these assertions by saying that the characters are
independent, and that they are randomly sampled from A UNIVERSE of
characters rather than THE UNIVERSE.  There is no assumption that they
randomly sample all possible characters (if they did the method would be
primarily of interest to angels on heads of pins) but only that they
are drawn randomly from some large set of possible characters.  Thus two
studies, one using osteological characters, and one using behavioral ones,
could both (separately) use the bootstrap without the difference in the
universes from which they draw creating a problem.  Independence is the
more problematic assumption, actually.
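
To make the sampling assumption concrete, here is a minimal sketch of one
bootstrap replicate: characters (columns) are drawn with replacement from
the observed set, and the taxa stay fixed. The function name and the toy
matrix are hypothetical illustrations, not taken from PHYLIP or PAUP.

```python
import random

def bootstrap_replicate(matrix, rng):
    """Draw characters (columns) with replacement; taxa (rows) are unchanged.

    matrix maps a taxon name to its list of character states.
    This treats the observed characters as a random sample from some
    large universe of characters, and treats them as independent.
    """
    n_chars = len(next(iter(matrix.values())))
    # Pick n_chars column indices at random, with replacement.
    cols = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: [states[c] for c in cols] for taxon, states in matrix.items()}

# Hypothetical toy data: 3 taxa, 5 binary characters.
matrix = {
    "A": [0, 0, 1, 1, 0],
    "B": [0, 1, 1, 0, 0],
    "C": [1, 1, 0, 0, 1],
}
rng = random.Random(0)
replicate = bootstrap_replicate(matrix, rng)
```

Each replicate would then be analyzed by the same tree-building method as
the original data, and the fraction of replicates containing a group is
its bootstrap value. Because whole columns are resampled as units, any
dependence between characters is ignored, which is why independence is
the assumption to worry about.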

>5. Whether or not SI values differ from bootstrap analyses in overcoming
>systematic biases such as "long branch effects" has not been exhaustively
>explored, but it might be at least advisable to try both, rather than
>limit your estimates of node robustness to bootstrap values alone (the
>most common practice).

Well, see my comment under 3 above.  I also wonder whether advocates of
SI/decay-indices would be willing to say that it might be advisable to
try bootstrapping too, rather than limit your estimates of node robustness
to SI alone?

I also note that SI can be done on many other methods of inferring phylogenies
(for example seeing how much increase there is in the sum of squares
in a distance method when one bans a given branch).  Only in the case of
likelihood does it have a direct connection to statistics.
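
As a rough sketch of that calculation, assume one already has a list of
candidate trees, each with a score to be minimized (parsimony steps, or
sum of squares for a distance method) and the set of groups it contains.
The SI for a group is then the best score among trees lacking the group
minus the best score overall. The function name, the data layout, and
the numbers are hypothetical; a real program would search under a
constraint rather than scan a fixed list.

```python
def support_index(scored_trees, group):
    """Return how much the best score worsens when `group` is banned.

    scored_trees: list of (score, groups) pairs, lower score = better fit,
    where groups is a set of frozensets of taxon names.
    Assumption: if the best tree already lacks the group, the result is 0.
    """
    best_overall = min(score for score, _ in scored_trees)
    without = [score for score, groups in scored_trees if group not in groups]
    if not without:
        raise ValueError("no candidate tree lacks the group")
    return min(without) - best_overall

# Hypothetical candidate trees with made-up scores.
trees = [
    (10, {frozenset("AB"), frozenset("ABC")}),
    (12, {frozenset("AC"), frozenset("ABC")}),
    (13, {frozenset("BC"), frozenset("BCD")}),
]
si_ab = support_index(trees, frozenset("AB"))    # 12 - 10 = 2
si_abc = support_index(trees, frozenset("ABC"))  # 13 - 10 = 3
```

With log-likelihoods in place of these scores (and the sign reversed,
since likelihood is maximized), the same difference is the quantity that
has a direct statistical interpretation.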

I have not put SI/decay into PHYLIP, not because it isn't worth doing but
just because we do not yet have constraints for/against a given group
built in, for technical and organizational reasons.

Sorry about the theological warfare here, but the literature on things
like foundations of inferring phylogenies and criticisms of bootstraps
is in a funny state right now, with a lot of oral tradition and not
many clear treatments in journal articles.

Joe Felsenstein         joe at genetics.washington.edu     (IP No.
 Dept. of Genetics, Univ. of Washington, Box 357360, Seattle, WA 98195-7360

More information about the Mol-evol mailing list
