MrBayes - PAUP
David Hillis
dhillis at mail.utexas.edu
Tue Nov 5 16:03:12 EST 2002
Below Joe Felsenstein notes several hypotheses about the meaning of
Bayesian posterior probabilities. In a recent paper (Molecular
Phylogenetics and Evolution 25:261-271), my co-authors and I show
support for Joe's proposal number two: namely, Bayesian posterior
probabilities are much better estimates of phylogenetic accuracy than
are non-parametric bootstrap values. We presented simulations of the
same type that Jim Bull and I presented in 1993 to show that
bootstrap values were strongly biased (Syst. Biol. 42: 182-192). We
found that Bayesian posterior probabilities were still somewhat
conservative as estimates of phylogenetic accuracy (in other words,
they still underestimate phylogenetic accuracy), but that they were
much closer than are bootstrap values.
>>>We have used MrBayes software on about 400 full-length 18S rDNA sequences
>>>(1700 bp) with the GTR model of evolution. With MrBayes, the bootstrap support
>>>for deeper (and basically all) phylogenetic nodes is strikingly high
>>>(about 100%). Does anybody have the same experience with MrBayes? Does
>>>anybody know some relevant literature on this? (Our NJ trees with PAUP
>>>have remarkably low support at deeper nodes.)
>>Let me correct one bit of terminology. The MrBayes support levels are not
>>bootstrap support but posterior probabilities. Your observation is a common
>>one, and a bit of a mystery. Quite a few people are doing simulations
>>to try to figure out why the Bayesian posterior values out of MrBayes are
>>so much larger than bootstrap values. There seem to be four possible answers,
>>and as yet there is no consensus as to which is correct:
>> 1. Maybe these two numbers are fundamentally different quantities that are
>> not even to be compared, and not expected to be similar. In which case,
>> if they are different this is not a problem for either.
>> 2. Maybe this reflects the well-known downward bias of high bootstrap
>> P values (this was discovered in 1993 by Zharkikh and Li).
>> 3. Maybe there are some search problems in MrBayes such that it tends to
>> get "stuck" on some trees, so that the Markov Chain Monte Carlo
>> algorithm reports much higher posterior probabilities of trees than it
>> should. This seems unlikely, as Huelsenbeck has allowed for multiple
>> "heated" chains (the Metropolis-Coupled Markov Chain Monte Carlo, or
>> MC^3, method of Geyer), which should enable a reasonable search.
>> 4. Maybe the high MrBayes posterior probabilities are consequences of the
>> particular prior distributions assumed on trees, which assume that
>> branch lengths are drawn from a distribution in which they are frequently
>> allowed to be very long. This could then be remedied by altering the
>> prior that is used, and MrBayes does allow you some control over that.
>>I expect that over the next year or two the matter should be cleared up.
>>--
>>Joe Felsenstein joe at removethispart.gs.washington.edu
>> Department of Genome Sciences, University of Washington,
>> Box 357730, Seattle, WA 98195-7730 USA
