In article <carmean-2911951219520001 at berbee.botany.ubc.ca>, carmean at sfu.ca (Dave Carmean) writes:
> Any suggestions for the best (and hopefully simplest) way to discover if
> the substitutions in my DNA alignment are saturated?
>
> I can use DNAdist (PHYLIP) to find the rates of substitution between taxa
> and Hillis's g1 test to indicate that the data set is not random. Neither
> of these directly address the question of saturation.
>
> Thank you in advance,
> Dave Carmean carmean at sfu.ca
> Biological Sciences
> Simon Fraser University
> Burnaby, BC CANADA V5A 1S6
A rule of thumb that I have seen, and which is a reasonable guide in practice,
is to check whether any CORRECTED distances exceed 1.5 estimated substitutions
per site (this equals 150% divergence, which sounds so weird that it is usually
not described like that). Use DNADIST with the Kimura 2-parameter correction,
for example, and see if any distances are more than 1.5. If only one or two
distances in a large dataset are that high, it is no big deal (maybe), but if
many of them are, there might be trouble.
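
If you want to run this check outside PHYLIP, here is a minimal sketch in
Python. It uses the closed-form Kimura 2-parameter formula
d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the observed
proportions of transitions and transversions. This is an assumption on my
part about how you might do it by hand; DNADIST estimates its distances
differently, so its numbers may not match exactly. The function names and
the dict-of-sequences input are made up for illustration.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Closed-form Kimura 2-parameter distance in substitutions per site.

    Sites where either sequence has a gap or ambiguity code are skipped.
    Returns math.inf when the pair is too divergent for the correction
    to be defined (a log argument is zero or negative).
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    if w1 <= 0.0 or w2 <= 0.0:
        return math.inf
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


def flag_saturated(alignment, threshold=1.5):
    """Return (name1, name2, distance) for every pair above the threshold.

    `alignment` is a dict mapping taxon name to aligned sequence.
    The 1.5 default is the heuristic cutoff, not a hard limit.
    """
    flagged = []
    for (n1, s1), (n2, s2) in combinations(alignment.items(), 2):
        d = k2p_distance(s1, s2)
        if d > threshold:
            flagged.append((n1, n2, d))
    return flagged
```

Call flag_saturated() on your alignment and compare the number of flagged
pairs with the total number of pairs. A pair reported as infinite is one
where the correction cannot be computed at all, which is itself a bad sign.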
This "rule" is purely heuristic and is just a guide. The reasoning is that
if you have more than 1.5 substitutions per site, then it becomes
extremely difficult to accurately estimate distances and any bias in the
substitution process will well and truly scramble things even further.
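
To put a rough number on "extremely difficult", here is a small sketch
using the Jukes-Cantor model. That model is my choice for illustration
only, because its formula is the simplest; it is not the Kimura correction
mentioned above. Under Jukes-Cantor the expected observed proportion of
differing sites is p = 3/4 (1 - exp(-4d/3)), which flattens out towards
0.75 as d grows, so a small sampling error in p turns into a large error
in the estimated d.

```python
import math


def jc69_expected_p(d):
    """Expected observed proportion of differing sites at true distance d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc69_error_magnification(d):
    """dd/dp at distance d: how much an error in p is inflated in d.

    Since dp/dd = exp(-4d/3), the inverse slope is exp(4d/3).
    """
    return math.exp(4.0 * d / 3.0)


for d in (0.1, 0.5, 1.0, 1.5, 2.0):
    print(f"d={d:.1f}  expected p={jc69_expected_p(d):.3f}  "
          f"error magnification={jc69_error_magnification(d):.1f}")
```

At d = 1.5 the expected p is roughly 0.65 out of a possible 0.75, and an
error in p is inflated about sevenfold in d, compared with barely any
inflation at d = 0.1.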
And before the flame war starts ........ this is not just a problem with
distance methods. Such data sets will be hard work for ANY method.