DNA substitutions saturated?

Doug Eernisse DEernisse at fullerton.edu
Thu Nov 30 02:18:27 EST 1995

In article <Pine.SV4.3.91.951129173041.16044B-100000 at uhunix4>,
palumbi at hawaii.edu (Steve Palumbi) wrote:

> Dave Carmean asks how to tell if mtDNA sequences are saturated. One way 
> is to graph the number of transversions versus the number of transitions. 
> If the graph levels off - i.e. number of transitions reaches some sort of 
> ceiling - then the sequences in this range are probably saturated for 
> transition substitutions. Note that this can occur at what seems a fairly 
> low divergence: sequences of COI that are 12% different may in fact be 
> saturated for transitions at silent sites. Note also that saturation can 
> be reached at lower divergence values if there is strong nucleotide bias 
> as in insect or crustacean mtDNAs.
> Steve Palumbi
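
The counts behind the plot described above can be sketched as follows. This is only a minimal illustration, not the program referred to in this thread: the function names are hypothetical, sequences are assumed to be already aligned, and any site with a gap or ambiguity code in either sequence is simply skipped.

```python
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq1, seq2):
    """Return (transitions, transversions) between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    ts = tv = 0
    for a, b in zip(s1, s2):
        # Skip gaps, ambiguity codes, and identical sites.
        if a not in "ACGT" or b not in "ACGT" or a == b:
            continue
        # A transition stays within purines or within pyrimidines.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return ts, tv


def ts_tv_table(named_seqs):
    """One (name1, name2, ts, tv) row per pair of sequences."""
    rows = []
    for (n1, s1), (n2, s2) in combinations(named_seqs.items(), 2):
        ts, tv = count_ts_tv(s1, s2)
        rows.append((n1, n2, ts, tv))
    return rows
```

Plotting the ts column against the tv column, one point per pair, gives the graph in question. Restricting the input to third codon positions would be one way to look at silent sites only, though that restriction is an assumption of this note and not part of the sketch.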

Hi Steve,

  Saw your post and was reminded of a few things.
  We have ordered the new "Molecular Systematics" to use in my
"Computer Lab in Molecular Systematics" course here Spring '95.
I understand you have a chapter in it on PCR methods. Guess I will wait
until it shows up to read it, but your chapter should be appropriate for
what I have planned.

  I remember that you had a program that did some of the above (resp. to
Carmean) sorts of calculations. I was just struggling with trying to understand
how to improve my implementation of Synonymous/nonsynonymous substitution 
matrices and am stuck wishing I had some sample output from another
program to make sure I am doing it correctly. In particular, I
implemented the unweighted pathway method of Nei and Gojobori (1986)
using the Jukes and Cantor (1969) 1-parameter model, including
estimation of the parameters
F(sd), F(nd), K(s) (± std. err.), and K(n) (± std. err.).
I followed the explanation given by Gojobori, T., E.N. Moriyama, and 
M. Kimura. 1990. Chapt. 33. Statistical methods for estimating sequence 
divergence. Meth. Enzymol. 183: 531-550 but unfortunately I am having a
great deal of trouble locating this article at the moment. I have no
reason for trying to sort this out other than my annoyance
at not having it working properly.
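
For reference, the Jukes and Cantor step of such a calculation can be sketched as below. It assumes that F(sd) and F(nd) denote the proportions of synonymous and nonsynonymous differences (differences divided by the corresponding number of sites), and it uses the large-sample variance usually quoted for the Jukes-Cantor distance. The function name and argument names are hypothetical.

```python
import math


def jukes_cantor(p, n_sites):
    """Jukes-Cantor distance and its approximate standard error.

    p        proportion of sites that differ (assumed: F(sd) or F(nd))
    n_sites  number of sites compared (synonymous or nonsynonymous sites)

    Returns (d, se). Both are NaN when p >= 0.75, because the
    argument of the logarithm is then zero or negative.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    x = 1.0 - 4.0 * p / 3.0
    if x <= 0.0:
        return math.nan, math.nan
    d = -0.75 * math.log(x)
    # Large-sample variance: p(1 - p) / ((1 - 4p/3)^2 * n_sites)
    se = math.sqrt(p * (1.0 - p) / (x * x * n_sites))
    return d, se
```

With p = 0.3 over 100 sites this gives a distance of about 0.383 with a standard error of about 0.076. The correction has no defined value once p reaches 3/4, which is easiest to hit for synonymous sites, where the number of sites is small and the proportion of differences can be large.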

I previously got it working with their simple example but am 
left with two problems. First, what is the best way to handle gaps in the 
input alignment? Provisionally, I have ignored any "codon" triplet of adjacent 
sites (corresponding to an amino acid site) where there is any gap or 
nonstandard (besides ACGTU) symbol because I figured it would be better 
for the sake of ensuring comparability between pairwise calculations, 
and I don't think the interpretation of sites with missing data, or
comparisons based on a different selection of sites, is especially 
straightforward. Perhaps it is too severe to ignore the remaining
pairwise comparisons at those sites that are unambiguous. Problem two
is the fact that I am getting undefined (?) values of K(s) for many
but not all of the pairwise comparisons I have attempted. I don't
know why. I thought it might be due to problems with the standard 
1-parameter model used for these calculations as explained by Kimura 
(1980), but I am totally befuddled at the thought of how to incorporate
a more general model into these calculations, i.e., considering
transitions and transversions as distinct at the same time I am
estimating whether pairwise comparisons are synonymous or nonsynonymous.
I am hoping that you have been through this and could possibly give
me some sample data with associated synonymous/nonsynonymous pairwise
calculations. Failing that, perhaps you can offer comments on the
above problems.
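
On the first problem, the two options for gaps can be compared side by side. The sketch below is only an illustration with hypothetical function names: it assumes the alignment is in frame from the first column, treats U as T, and counts a codon as usable only if all three symbols are A, C, G, or T.

```python
VALID = set("ACGT")


def clean(seq):
    """Upper-case the sequence and treat U as T."""
    return seq.upper().replace("U", "T")


def codon_ok(codon):
    """True if the triplet is complete and contains only A, C, G, T."""
    return len(codon) == 3 and set(codon) <= VALID


def complete_deletion_codons(seqs):
    """Codon indices that are unambiguous in every sequence."""
    seqs = [clean(s) for s in seqs]
    n_codons = min(len(s) for s in seqs) // 3
    return [
        i
        for i in range(n_codons)
        if all(codon_ok(s[3 * i:3 * i + 3]) for s in seqs)
    ]


def pairwise_deletion_codons(seq1, seq2):
    """Codon indices that are unambiguous in just these two sequences."""
    return complete_deletion_codons([seq1, seq2])
```

For the three sequences ATGAAACCC, ATG---CCC, and ATGAAACNC, complete deletion keeps only codon 0, while pairwise deletion keeps codons 0 and 2 for the first pair and codons 0 and 1 for the first and third. Complete deletion keeps every comparison on the same set of sites; pairwise deletion uses more data per comparison but each pair is then based on a different selection of sites.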



Doug Eernisse <DEernisse at fullerton.edu>
Dept. Biological Science MH282
California State University
Fullerton, CA 92634
