informative sites & bootstrap

Warren Gallin wgallin at gpu.srv.ualberta.ca
Tue May 5 13:20:53 EST 1998


In Article <6inick$54t at net.bio.net>, "James O. McInerney"
<j.mcinerney at nhm.ac.uk> wrote:
>
>There are two very separate issues here.  Firstly, do you leave the
>parsimony-uninformative sites in the alignment when you are sampling and
>secondly, do you leave the parsimony-uninformative sites in the analysis when
>you are searching for a tree?
>
>In my opinion the answer to the first [sampling] question is yes.  You should
>leave the parsimony-uninformative sites in the alignment when you are
>generating the samples.  The reasons are given in my answer.
>
>To the second question, I think the answer is no.  For precisely the reasons
>that were given by the submitter of answer 1.  

James,
    I don't see the distinction.  Uninformative sites add the same length to
all trees, so omitting them from the tree search (using MP) has no effect on
the trees that you find, only on the deduced lengths of the terminal branches.
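
To make the "same length on all trees" point concrete, here is a minimal
sketch, assuming four taxa, unordered (Fitch) parsimony, and toy site patterns
made up for illustration.  The function names are hypothetical and not taken
from any phylogenetics package.

```python
from collections import Counter

def is_parsimony_informative(column):
    """True if at least two states each occur in at least two taxa."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def fitch_length(tree, column):
    """Fitch parsimony length of one site on a rooted binary tree.

    `tree` is a nested tuple of taxon indices, e.g. ((0, 1), (2, 3)).
    """
    steps = 0

    def state_set(node):
        nonlocal steps
        if isinstance(node, int):
            return {column[node]}          # leaf: observed state
        left, right = (state_set(child) for child in node)
        common = left & right
        if common:
            return common                  # no change needed at this node
        steps += 1                         # disjoint sets cost one change
        return left | right

    state_set(tree)
    return steps

# The three possible unrooted topologies for four taxa (rooting is arbitrary).
FOUR_TAXON_TREES = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]

def lengths_on_all_trees(column):
    """Length of one site on each of the three four-taxon topologies."""
    return [fitch_length(tree, column) for tree in FOUR_TAXON_TREES]
```

A singleton pattern such as "AAAC" costs the same on every topology, while a
pattern such as "AACC" separates the trees, which is why only the latter can
influence an MP search.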
    If you then include the uninformative sites in the bootstrap, you are
simply diluting the signal.  The result is two confounding factors in the
resulting bootstrap tree: 1) the thing that you want, an estimate of how well
the tree topology is supported by the informative sites, and 2) a thing that
you do not want, a variable number of informative sites in each bootstrap
replicate.
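
The second factor can be seen in a small simulation.  This is only a sketch
under stated assumptions: a made-up alignment of 10 informative and 30
constant columns, ordinary column resampling with replacement, and
hypothetical function names.

```python
import random
from collections import Counter

def is_parsimony_informative(column):
    """True if at least two states each occur in at least two taxa."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def informative_counts(columns, n_replicates, seed=0):
    """Number of informative columns in each bootstrap replicate."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_replicates):
        # Resample columns with replacement, keeping the alignment length.
        replicate = rng.choices(columns, k=len(columns))
        counts.append(sum(is_parsimony_informative(c) for c in replicate))
    return counts

# Toy alignment (assumed data): 10 informative and 30 constant columns.
columns = ["AACC"] * 10 + ["AAAA"] * 30

# Resampling the full alignment: the informative count varies by replicate.
full = informative_counts(columns, 200)

# Resampling informative columns only: the count is fixed at 10.
informative_only = [c for c in columns if is_parsimony_informative(c)]
filtered = informative_counts(informative_only, 200)

print(min(full), max(full), set(filtered))
```

With the full alignment the number of informative columns per replicate
fluctuates around 10; with the filtered alignment it is exactly 10 every
time.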
    If you are building the tree solely on the basis of the informative
sites (by definition), then it seems to me that only the informative sites
should be included in the bootstrapping procedure.
    By the way, if you use a stepmatrix for amino acid sequence data
matrices, then no site is, by definition, uninformative in all trees, so
the issue becomes moot.
Warren Gallin
Department of Biological Sciences
University of Alberta
Edmonton,  Alberta     T6G 2E9
Canada
wgallin at gpu.srv.ualberta.ca




More information about the Mol-evol mailing list