
DNA substitutions saturated?

higgins at ebi.ac.uk
Thu Nov 30 04:37:05 EST 1995

In article <carmean-2911951219520001 at berbee.botany.ubc.ca>, carmean at sfu.ca (Dave Carmean) writes:
> Any suggestions for the best (and hopefully simplest) way to discover if
> the substitutions in my DNA alignment are saturated?
> I can use DNAdist (PHYLIP) to find the rates of substitution between taxa
> and Hillis's g1 test to indicate that the data set is not random.  Neither
> of these directly address the question of saturation.
> Thank you in advance,
> Dave Carmean  carmean at sfu.ca
> Biological Sciences
> Simon Fraser University          
> Burnaby, BC  CANADA V5A 1S6

Hi Dave:  

A rule of thumb that I have seen, and which is a reasonable guide in practice,
is to check whether any CORRECTED distances exceed 1.5 estimated substitutions
per site (that equals 150% divergence, which sounds so weird that it is usually
not described that way).  For example, use DNADIST with the Kimura 2-parameter
correction and see if any distances are greater than 1.5.  If only one or two
are, in a large dataset, that is no big deal (maybe), but if many of them are,
there might be trouble.
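
For anyone who wants to try the check without PHYLIP, here is a rough Python
sketch.  It uses the standard closed-form Kimura 2-parameter estimate,
d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), with P and Q the observed proportions
of transitions and transversions.  The function names and the dictionary
layout of the alignment are my own assumptions for illustration, and DNADIST
may give somewhat different numbers because it does not necessarily compute
its distances this way.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Closed-form Kimura 2-parameter distance (substitutions per site).

    Returns math.inf when the log argument is not positive (the correction
    is undefined) and NaN when no sites can be compared.
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        return float("nan")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    x = 1.0 - 2.0 * p - q
    y = 1.0 - 2.0 * q
    if x <= 0 or y <= 0:
        return math.inf
    return -0.5 * math.log(x * math.sqrt(y))

def saturated_pairs(alignment, threshold=1.5):
    """Return (name1, name2, distance) for every pair above the threshold.

    `alignment` is assumed to be a dict mapping taxon name to aligned sequence.
    Pairs with no comparable sites (NaN) are never flagged.
    """
    flagged = []
    for (n1, s1), (n2, s2) in combinations(alignment.items(), 2):
        d = k2p_distance(s1, s2)
        if d > threshold:
            flagged.append((n1, n2, d))
    return flagged
```

Comparing len(saturated_pairs(alignment)) with the total number of pairs gives
a feel for whether it is "one or two" or "many".  Pairs that come back as
infinite are ones where the correction cannot be computed at all.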

This "rule" is purely heuristic and is just a guide.  The reasoning is that
once you have more than 1.5 substitutions per site, it becomes extremely
difficult to estimate distances accurately, and any bias in the substitution
process will well and truly scramble things even further.
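
One way to see why estimation gets hard is to look at how flat the curve
becomes.  The sketch below uses the simpler Jukes-Cantor model (not Kimura
2-parameter, purely to keep the arithmetic short) and is only an illustration
under that assumption; the function names are mine.  Under Jukes-Cantor the
expected proportion of differing sites is p = 3/4 (1 - exp(-4d/3)), so the
slope dd/dp is exp(4d/3): a small sampling error in p is magnified by that
factor in the corrected distance d.

```python
import math

def jc_expected_p(d):
    """Expected proportion of differing sites under Jukes-Cantor at true distance d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def jc_distance(p):
    """Jukes-Cantor corrected distance from observed proportion p.

    Returns math.inf once p reaches 0.75, where the correction is undefined.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def error_amplification(d):
    """Slope dd/dp at true distance d: how much an error in p is magnified in d."""
    return math.exp(4.0 * d / 3.0)

if __name__ == "__main__":
    for d in (0.1, 0.5, 1.0, 1.5, 2.0):
        print(f"d={d:.1f}  expected p={jc_expected_p(d):.3f}  "
              f"amplification={error_amplification(d):.1f}x")
```

At d = 1.5 the amplification factor is exp(2), a bit over 7, and it keeps
growing exponentially from there, which fits the idea that the estimates
become very noisy somewhere around that point.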

And before the flame war starts ........ this is not just a problem with
distance methods.  Such data sets will be hard work for ANY method.

Des Higgins

More information about the Mol-evol mailing list
