Decay Indices - summary
Paterson, Adrian Mark
patersoa at lincoln.ac.nz
Fri Apr 21 11:57:17 EST 1995
Recently I asked about Decay Indices. I received some interesting
replies including a program for calculating these values (from
Torsten.Eriksson at botan.su.se).
Below are some of the responses (slightly edited) for those that are
interested in Decay Indices.
> Decay indices are really simple to perform in PAUP, I've never run them
> on any other program so I don't know the specifics. In general you run
> multiple runs keeping longer and longer trees than the most parsimonious,
> and make a strict cons. tree each time. How many extra steps it takes to
> collapse the branch is the decay index for that branch-- keep going until
> you get a comb.
> First run: Get most parsimonious tree: 1 tree @ 100 steps long
> 2nd: set up to keep all trees up to 1 step longer (100 and 101), in PAUP
> under Search/Heuristic/options select "all trees<length ( )" and insert
> "101". You will get, say 5 trees, 1 at 100 and 4 at 101 steps. Make a Strict
> consensus tree and note which branches are not resolved compared to the
> shortest tree. They will have a decay index of 1.
> 3rd: Repeat this, resetting the tree length until you run out of memory
> (number of trees increases exponentially) or until you collapse all of
> the branches. Branches that collapse at 102 will have an index of 2 etc.
> This gives very similar results to Bootstrap analysis, but can often
> bolster support for clades that have moderate to low bootstrap values.
> Some people who have philosophical objections to Bootstrapping will
> sometimes approve of decay indices. We run both routinely now.
> I'm sorry, I can't get to my references right now, but if you want the
> citations for the original proposal and some tests of decay versus
> bootstrap etc let me know, or contact David Hillis's lab in the Zoology
> department here at UT, I believe one of his students has published on
> Alice Hempel hempel at bongo.cc.utexas.edu
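The bookkeeping in the procedure above can be sketched in a few lines of Python. This is only an illustration, not part of PAUP: it assumes the trees kept by the searches have already been exported, and that each tree is reduced to its length plus the set of clades it contains. The names decay_indices, strict_consensus and example_trees, and the example lengths, are made up for the sketch. It follows the convention of the reply above, where a branch that first collapses in the consensus of trees up to k steps longer gets an index of k.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Clade = FrozenSet[str]          # a clade is the set of taxa it contains
Tree = Tuple[int, Set[Clade]]   # (length in steps, clades found on the tree)


def strict_consensus(trees: List[Tree]) -> Set[Clade]:
    """Return the clades that appear on every tree in the list."""
    clade_sets = [clades for _, clades in trees]
    return set(clade_sets[0]).intersection(*clade_sets[1:])


def decay_indices(trees: List[Tree], max_extra: int) -> Tuple[Dict[Clade, int], Set[Clade]]:
    """Decay index per clade, plus the clades still standing at max_extra.

    Assumes `trees` holds ALL trees up to max_extra steps longer than the
    shortest one (the searching itself is not done here).
    """
    shortest = min(length for length, _ in trees)
    surviving = strict_consensus([t for t in trees if t[0] == shortest])
    indices: Dict[Clade, int] = {}
    for extra in range(1, max_extra + 1):
        kept = [t for t in trees if t[0] <= shortest + extra]
        consensus = strict_consensus(kept)
        for clade in surviving - consensus:
            indices[clade] = extra      # first collapsed at +extra steps
        surviving &= consensus
    return indices, surviving


# Hypothetical five-taxon example: one tree each at 100, 101 and 102 steps.
example_trees: List[Tree] = [
    (100, {frozenset("AB"), frozenset("DE"), frozenset("CDE")}),
    (101, {frozenset("AB"), frozenset("DE"), frozenset("ABC")}),
    (102, {frozenset("AC"), frozenset("DE"), frozenset("ABC")}),
]

indices, surviving = decay_indices(example_trees, max_extra=3)
for clade, index in sorted(indices.items(), key=lambda item: item[1]):
    print("".join(sorted(clade)), index)
```

In this made-up example the clade (C,D,E) collapses at 101 steps and gets an index of 1, (A,B) collapses at 102 and gets 2, and (D,E) is still present at +3, so all that can be said is that its index is greater than 3.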
A decay index is the number of additional steps it takes to break down a
node in the tree, or put another way, how many steps longer is an
alternative tree that does not recover that node? Finding decay indices
is a rather labor-intensive procedure. Here's what you would do in PAUP
to find the decay index for a single node (this must be repeated for each node):
(1) Load a constraint-tree defining only the clade of interest. For
example, if you have ten taxa (A to J) and want the decay index for the node
defining the clade (D,E,F) then load the tree (A,B,C,(D,E,F),G,H,I,J).
(2) Search for the shortest trees that do NOT contain this clade, being
sure to use a search strategy as intensive as your original tree search.
(3) Once you have the length of the shortest tree that does not contain the
clade (D,E,F), subtract from this the number of steps in your most
parsimonious tree with no constraints - this is the decay index for the
node subtending (D,E,F).
Department of Entomology
University of Maryland
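Steps (1) and (3) above are easy to script; the searches in step (2) still have to be run in PAUP. The following Python sketch is an illustration only. The function names constraint_tree and decay_index are hypothetical, and the tree lengths in the usage example (100 and 103 steps) are made up.

```python
from typing import Iterable, Sequence


def constraint_tree(taxa: Sequence[str], clade: Iterable[str]) -> str:
    """Build a constraint tree that resolves only the clade of interest.

    The clade is written as one group at the position of its first member;
    all other taxa are left in an unresolved polytomy.
    """
    members = set(clade)
    unknown = members - set(taxa)
    if unknown:
        raise ValueError(f"taxa not in data set: {sorted(unknown)}")
    parts = []
    inserted = False
    for taxon in taxa:
        if taxon in members:
            if not inserted:
                group = ",".join(t for t in taxa if t in members)
                parts.append("(" + group + ")")
                inserted = True
        else:
            parts.append(taxon)
    return "(" + ",".join(parts) + ")"


def decay_index(unconstrained_length: int, shortest_without_clade: int) -> int:
    """Extra steps needed before the clade is lost (step 3 above)."""
    if shortest_without_clade < unconstrained_length:
        raise ValueError("constrained search found a shorter tree; "
                         "the unconstrained search was not thorough enough")
    return shortest_without_clade - unconstrained_length


taxa = list("ABCDEFGHIJ")
print(constraint_tree(taxa, ["D", "E", "F"]))   # (A,B,C,(D,E,F),G,H,I,J)
print(decay_index(100, 103))                    # 3
```

The check in decay_index reflects the warning in step (2): if the search without the clade is less intensive than the original search, the difference in lengths is not a reliable index.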
Basically, what you do to find decay indices is to find trees with +1, +2,
+3, etc., steps longer than your shortest tree(s). A strict consensus
tree is calculated for each of these. So a branch that was present on the
+2 strict consensus tree but collapsed on the +3 would be reported as
supported at the +2 level. The number of extra steps you want to test is
up to you, but the number of trees increases exponentially with steps, so
computer time is a consideration.
Jim Manhart, Dept. of Biology, Texas A&M University
J-Manhart at tamu.edu
It is my understanding that the upcoming PAUP 4.0 (not yet available!)
will calculate them directly. Steve Farris also has a program called "RNA"
that will do these calculations (U.S. $30 from the author, I believe).
You can also use DNA Translator, a freeware HyperCard stack written
by me, to simplify their calculation with PAUP. I call them "Support
Indices" (SI) because it makes more sense to speak of a high SI value
being associated with high support for a node, as Källersjö et al. first
pointed out (they call them Bremer Support Indices). If you want to
use DNA Translator, you can get it from:
where xx is version number or I will send you a more current copy
if you would like, but I'm about to head for Baja Calif. for a week 8^).
There are instructions in the "Specialized Instructions" folder,
inside "Sample Input/Output" but I have modified the documentation since
the last version was posted, so please excuse the following rather lengthy
and specialized text. Also, perhaps consult Eernisse & Kluge, MBE 10:
1170-1195 (1993) for an example. If you want, I will post an updated
version of the stack package to you.
-----------From current help file------------------------------
The "Support Indices..." option from the "N-out" field menu
of DNA Translator's utility cards is a _power user_ feature for
automating the calculation of "support index" (SI) values (for
an example, see Eernisse, D. J., and A. G. Kluge. 1993. Taxonomic
congruence versus total evidence, and the phylogeny of amniotes
inferred from fossils, molecules and morphology. Molecular Biology
and Evolution 10: 1170-1195; also referred to by Donoghue and others
as "decay index" or by Farris and collaborators as "Bremer Support").
SI values vs. Bootstrap values:
A high SI generally corresponds to a high bootstrap value (with some
infrequent exceptions) but:
1. SI calculations are based on exactly the same data set as your
parsimony search (i.e., they are based on all available evidence).
2. SI calculations have a more direct and intuitive relationship to
the "robustness" of nodes.
3. SI calculations do not confound statistical support for the reality
of a particular tree with a separate issue of how likely the same tree
would be obtained if you had more characters from the same universe
of characters, whether or not it is the "true" tree.
4. Some of the assumptions of a bootstrap analysis are frequently violated,
for example, it depends on the i.i.d. assumptions identified by Felsenstein
and others, which require that the characters be identically and
independently distributed. Sanderson has separated these assumptions into
two less restrictive assumptions, namely, that characters are independent,
and that the observed character set is a "representative" sample of the
"universe of characters" (paraphrased from PAUP 3.1 manual, p. 56).
5. Whether or not SI values differ from bootstrap analyses in overcoming
systematic biases such as "long branch effects" has not been exhaustively
explored, but it might be at least advisable to try both, rather than
limit your estimates of node robustness to bootstrap values alone (the
most common practice).