informative sites & bootstrap

Peter Schuchert schuchert at ubaclu.unibas.ch
Tue May 5 10:05:06 EST 1998


to my original question:

> for a bootstrap analysis, do you use all sites of a sequence
> alignment or only the informative ones?

I got the two conflicting answers given below. Are there more opinions?
Peter

1)
> Actually, that's a good question.  I presume you're referring to
> parsimony analyses, in which only the informative sites should be
> included both for bootstrap analysis and for finding the best (maximum
> parsimony) tree.  If you include sites with autapomorphies, you'll
> basically be adding a step per autapomorphous site to both the minimum
> and the observed tree length, so your consistency index will be
> inflated.  If using model-based
> approaches for your molecular data (e.g., maximum likelihood or
> distance) use all the sites that are relevant (i.e., not including
> those where you have alignment ambiguity and so forth).
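
To make the terms in answer 1 concrete, here is a minimal Python sketch. It assumes the usual definition of a parsimony-informative site (at least two character states, each present in at least two taxa) and the ensemble consistency index (summed minimum steps divided by summed observed steps). The toy alignment, the function names, and the hand-counted step numbers are all hypothetical, chosen only for illustration.

```python
from collections import Counter

def classify_site(column):
    """Classify one alignment column for parsimony purposes.

    Returns "constant", "informative", or "uninformative" (variable but
    not parsimony-informative, e.g. an autapomorphy in a single taxon).
    """
    counts = Counter(column)
    if len(counts) == 1:
        return "constant"
    # Informative: at least two states that each occur in two or more taxa.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "uninformative"

def ensemble_ci(min_steps, observed_steps):
    """Ensemble consistency index: sum of minimum steps / sum of observed steps."""
    return sum(min_steps) / sum(observed_steps)

# Hypothetical toy alignment: four taxa (t1..t4), one string per taxon.
alignment = ["AACGT", "AACGA", "AGTGA", "AGTCT"]
columns = ["".join(col) for col in zip(*alignment)]
kinds = [classify_site(c) for c in columns]

# Steps counted by hand on the assumed tree ((t1,t2),(t3,t4)):
# the three informative sites need 1, 1 and 2 steps (minimum 1 each).
ci_informative_only = ensemble_ci([1, 1, 1], [1, 1, 2])
# The autapomorphic site costs exactly 1 step on any tree (minimum 1).
ci_with_autapomorphy = ensemble_ci([1, 1, 1, 1], [1, 1, 2, 1])

print(kinds)
print(ci_informative_only, ci_with_autapomorphy)
```

In this toy arithmetic the autapomorphic site adds one step to both the numerator and the denominator, which moves the ensemble CI toward 1. It adds the same single step on every tree, so it cannot change which tree is shortest.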

2)
> I believe that you should use all sites during bootstrapping.
> Each bootstrap replicate is a pseudo-sample of the universe of sites.
> It should, therefore, not be constrained by only sampling the
> parsimony-informative sites.  This means that in some bootstrap
> replicates there will be _a lot_ of uninformative sites, and in some
> replicates there will be _few_ such sites.  In the subsequent
> analysis of each bootstrapped dataset, the uninformative sites will
> not contribute to the phylogeny, however, they should be included in
> the sampling process.
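
The sampling scheme described in answer 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the eight-site toy alignment, the function names, the seed, and the number of replicates are hypothetical, and no tree search is done. Each replicate draws as many columns as the original alignment has, with replacement, from all sites; the count of parsimony-informative columns then varies from replicate to replicate.

```python
import random
from collections import Counter

def is_informative(column):
    """True if at least two states each occur in two or more taxa."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def bootstrap_columns(columns, rng):
    """Draw len(columns) columns with replacement from ALL sites."""
    return [rng.choice(columns) for _ in columns]

# Hypothetical toy alignment: four taxa, eight sites
# (3 informative, 2 variable but uninformative, 3 constant).
alignment = ["AACGTACG", "AACGAACT", "AGTGAACG", "AGTCTACG"]
columns = ["".join(col) for col in zip(*alignment)]

rng = random.Random(1998)  # fixed seed so the run is repeatable
replicates = [bootstrap_columns(columns, rng) for _ in range(200)]

# Number of parsimony-informative columns in each pseudo-sample.
informative_counts = [sum(is_informative(c) for c in rep) for rep in replicates]

print("informative sites in original:", sum(is_informative(c) for c in columns))
print("range across replicates:", min(informative_counts), "to", max(informative_counts))
```

Restricting `columns` to the informative sites before resampling would instead fix the number of informative sites at the full alignment length in every replicate, which is the constraint answer 2 argues against.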



