carmean at sfu.ca (Dave Carmean) wrote:
>Any suggestions for the best (and hopefully simplest) way to discover if
>the substitutions in my DNA alignment are saturated?
>
>I can use DNAdist (PHYLIP) to find the rates of substitution between taxa
>and Hillis's g1 test to indicate that the data set is not random. Neither
>of these directly address the question of saturation.
>
Getting back to the original question. You can estimate
a substitution pattern by comparing, on a given tree, the
pairwise numbers of steps inferred by parsimony with the
pairwise uncorrected distances. Herve Philippe has developed
some utilities in his MUST package (see Philippe, Nucl.
Acids Res. 21, 5264-5272) and has published several papers
with his "saturation plots" (see also Philippe et al.,
J. Evol. Biol. 7, 247-265). If you use several methods
for branchlength approximation in the parsimony analysis
you can get an idea of the minimum number of changes
between two taxa. Plotting these against the uncorrected
pairwise distances will show whether there is a
region where the same distance corresponds to widely
varying numbers of inferred changes. At those pairwise
distances you can claim there is saturation.
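To make the plot concrete, here is a minimal Python sketch of the
bookkeeping, not the MUST implementation. It assumes you already have
the inferred number of changes for each taxon pair from your parsimony
program; the `inferred` dictionary, the function names, and the
dict-of-strings alignment format are all hypothetical choices of mine.

```python
from itertools import combinations

def p_distance(a, b):
    """Uncorrected distance: fraction of differing sites.

    Sites where either sequence has a gap or ambiguity code are skipped.
    """
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared

def saturation_points(alignment, inferred):
    """Pair each uncorrected distance with the inferred number of changes.

    alignment: dict mapping taxon name -> aligned sequence (assumed format).
    inferred:  dict mapping (taxon1, taxon2), names in sorted order, to the
               number of changes inferred along the tree (hypothetical input
               taken from whatever parsimony program you use).
    Returns a sorted list of (p_distance, inferred_changes) tuples.
    """
    points = []
    for t1, t2 in combinations(sorted(alignment), 2):
        points.append((p_distance(alignment[t1], alignment[t2]),
                       inferred[(t1, t2)]))
    return sorted(points)
```

Plot the second value against the first with whatever you like. If the
uncorrected distance stops rising while the inferred changes keep
climbing, that plateau is the saturated region.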
Another method is to plot uncorrected distances
against maximum likelihood pairwise distances on a tree.
It turns out (some unpublished data of Herve's
and mine) that you get much the same results as
for the parsimony inferences of branchlengths.
It also turns out, annoyingly, that many distance corrections,
both simple and complex, do not adequately correct for
saturation (you still see the saturation pattern).
I can see how saturated
distances confound distance tree estimation. But it's
not clear that parsimony and likelihood are going to
be affected in the same way. Can anyone help me
with this?
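On the distance-correction point, a small Python sketch of why a
correction has little to work with near the plateau. I use the
Jukes-Cantor formula purely as a familiar example of a simple
correction; it is my choice for illustration, not one of the specific
corrections we tested, and the function names are my own.

```python
import math

def jc69_expected_p(d):
    """Expected uncorrected distance after d substitutions per site (JC69)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def jc69_distance(p):
    """Jukes-Cantor corrected distance from an uncorrected distance p."""
    if not 0.0 <= p < 0.75:
        # At or beyond p = 0.75 the logarithm is undefined: no estimate exists.
        raise ValueError("p must lie in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

if __name__ == "__main__":
    # The observed distance flattens toward 0.75 as true divergence grows.
    for d in (0.1, 0.5, 1.0, 2.0, 4.0):
        print(f"d = {d:3.1f}   expected p = {jc69_expected_p(d):.4f}")
```

As d grows, the expected uncorrected distance creeps toward 0.75, so
quite different amounts of change give nearly the same observed
distance, and small sampling error in p becomes large error in the
corrected value.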
Cheers
Andrew J. Roger
aroger at ac.dal.ca