Tree reliability

higgins at embl-heidelberg.de
Thu Nov 4 11:23:40 EST 1993

Just a quick practical note on using the bootstrap with data sets 
that include some very closely related sequences:

Make sure to use the JUMBLE option if you use PHYLIP.  This will randomise
the input order of the sequences in each bootstrap sample data set.  If you
do not do this, then some groupings in the trees may be over-represented
in the final totals, depending only on the input order of the sequences.
Joe Felsenstein had the foresight to include this option in PHYLIP to avoid this
problem.   In the extreme case of identical or almost identical sequences,
you can end up with groupings that "appear" to get very high support in the
bootstrap analysis.  In reality, these groupings will be separated from
nearby sequences by short or zero-length branches, i.e. they are not real groups.
However, they can artefactually get very high bootstrap support.
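
For anyone rolling their own bootstrap code, the idea can be sketched roughly
as below.  This is only an illustration in Python, not PHYLIP's or Clustal V's
actual code: the function name jumbled_bootstrap and the list-of-(name,
sequence) input format are assumptions made up for the example.  Each
replicate resamples the alignment columns with replacement and then shuffles
the order of the sequences before they would be handed to the tree program.

```python
import random


def jumbled_bootstrap(alignment, n_replicates, seed=0):
    """Return bootstrap replicates with the sequence order randomised.

    alignment: list of (name, sequence) pairs, all sequences the same length.
    (Hypothetical helper for illustration; not taken from PHYLIP.)
    """
    rng = random.Random(seed)
    length = len(alignment[0][1])
    replicates = []
    for _ in range(n_replicates):
        # Bootstrap step: sample alignment columns with replacement.
        cols = [rng.randrange(length) for _ in range(length)]
        sample = [(name, "".join(seq[c] for c in cols))
                  for name, seq in alignment]
        # Jumble step: randomise the order in which sequences are presented,
        # so that ties between identical sequences are not always broken
        # the same way by an order-dependent tree-building method.
        rng.shuffle(sample)
        replicates.append(sample)
    return replicates


if __name__ == "__main__":
    # "A" and "B" are identical, the extreme case described above.
    aln = [("A", "ACGT"), ("B", "ACGT"), ("C", "ACGA"), ("D", "TCGA")]
    for rep in jumbled_bootstrap(aln, 3, seed=1):
        print([name for name, _ in rep])
```

Each replicate would then be passed to the tree-building step in its shuffled
order, and the groupings counted over all replicates as usual.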

I speak from bitter experience.  I did not include a "jumble" option in the
bootstrap code in Clustal V.   

Des Higgins
EMBL, Heidelberg, Germany.
