DNA substitutions saturated?
DEernisse at fullerton.edu
Thu Nov 30 02:18:27 EST 1995
In article <Pine.SV18.104.22.1681129173041.16044B-100000 at uhunix4>,
palumbi at hawaii.edu (Steve Palumbi) wrote:
> Dave Carmean asks how to tell if mtDNA sequences are saturated. One way
> is to graph the number of transversions versus the number of transitions.
> If the graph levels off - i.e. number of transitions reaches some sort of
> ceiling - then the sequences in this range are probably saturated for
> transition substitutions. Note that this can occur at what seems a fairly
> low divergence: sequences of COI that are 12% different may in fact be
> saturated for transitions at silent sites. Note also that saturation can
> be reached at lower divergence values if there is strong nucleotide bias
> as in insect or crustacean mtDNAs.
> Steve Palumbi
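
The counts behind the graph described above can be sketched in a few lines of Python. This is only a rough illustration, not the program mentioned later in this message: the function names and the dict-based alignment format are assumptions, and positions with gaps or ambiguous symbols are simply skipped.

```python
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def count_ts_tv(seq1, seq2):
    """Return (transitions, transversions) between two aligned sequences."""
    ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Treat RNA "U" as "T" so both alphabets are handled alike.
        a = "T" if a == "U" else a
        b = "T" if b == "U" else b
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical, gap, or ambiguous symbol: not counted
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    return ts, tv

def pairwise_ts_tv(alignment):
    """alignment: dict of name -> aligned sequence (hypothetical format).

    Returns a list of (name1, name2, transitions, transversions).
    """
    return [(n1, n2, *count_ts_tv(alignment[n1], alignment[n2]))
            for n1, n2 in combinations(alignment, 2)]
```

Plotting the transition counts against the transversion counts for all pairs gives the graph described above; a plateau in transitions while transversions keep rising is the pattern to look for.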
Saw your post and was reminded of a few things.
We have ordered the new "Molecular Systematics" to use in my
"Computer Lab in Molecular Systematics" course here Spring '95.
I understand you have a chapter in it on PCR methods. Guess I will wait
until it shows up to read it, but your chapter should be appropriate for
what I have planned.
I remember that you had a program that did some of the sorts of calculations
you describe above (in your response to Carmean). I have been struggling to
improve my implementation of synonymous/nonsynonymous substitution
matrices and am stuck wishing I had some sample output from another
program to make sure I am doing it correctly. In particular, I
implemented the unweighted pathway method of Nei and Gojobori (1986)
using the Jukes and Cantor (1969) 1-parameter model,
including estimation of the parameters
F(sd), F(nd), K(s) (± std. err.), and K(n) (± std. err.).
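
For reference, a minimal sketch of the Jukes and Cantor step, assuming F(sd) and F(nd) denote the proportions of synonymous and nonsynonymous differences (differences divided by the number of sites of that class). The function names are hypothetical. The correction is K = -(3/4) ln(1 - 4p/3), with large-sample variance p(1 - p) / [(1 - 4p/3)^2 n]. The logarithm has no real value once p reaches 3/4, so the sketch returns None in that case rather than a number.

```python
import math

def jukes_cantor(p, n_sites):
    """Jukes-Cantor corrected distance and its standard error.

    p       : observed proportion of differing sites (0 <= p <= 1)
    n_sites : number of sites compared (may be fractional, e.g. S or N)
    Returns (K, standard_error), or None when p >= 0.75, where the
    logarithm in the correction is undefined.
    """
    if n_sites <= 0 or p < 0:
        raise ValueError("need n_sites > 0 and p >= 0")
    if p >= 0.75:
        return None  # correction undefined: 1 - 4p/3 <= 0
    scale = 1.0 - 4.0 * p / 3.0
    k = -0.75 * math.log(scale)
    variance = p * (1.0 - p) / (scale ** 2 * n_sites)
    return k, math.sqrt(variance)

def ks_kn(syn_diffs, nonsyn_diffs, syn_sites, nonsyn_sites):
    """Return (K(s) result, K(n) result) from Nei-Gojobori style counts.

    Each result is (K, standard_error) or None if undefined.
    """
    p_s = syn_diffs / syn_sites
    p_n = nonsyn_diffs / nonsyn_sites
    return jukes_cantor(p_s, syn_sites), jukes_cantor(p_n, nonsyn_sites)
```

For example, ks_kn(9, 3, 10, 30) gives None for K(s), because 9 differences over 10 synonymous sites is a proportion of 0.9, while K(n) is still defined.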
I followed the explanation given by Gojobori, T., E.N. Moriyama, and
M. Kimura. 1990. Chapt. 33. Statistical methods for estimating sequence
divergence. Meth. Enzymol. 183: 531-550 but unfortunately I am having a
great deal of trouble locating this article at the moment. There is no
reason that I am trying to sort this out other than my annoyance
with not having it working properly.
I previously got it working with their simple example but am
left with two problems. First, what is the best way to handle gaps in the
input alignment? Provisionally, I have ignored any "codon" triplet of adjacent
sites (corresponding to an amino acid site) where there is any gap or
nonstandard symbol (anything besides ACGTU), because I figured that would
be better for ensuring comparability between pairwise calculations,
and I don't think the interpretation of sites with missing data, or
comparisons based on a different selection of sites, is especially
straightforward. Perhaps it is too severe to ignore the remaining
pairwise comparisons at those sites that are unambiguous. Problem two
is the fact that I am getting undefined (?) values of K(s) for many
but not all of the pairwise comparisons I have attempted. I don't
know why. I thought it might be due to problems with the standard
1-parameter model used for these calculations as explained by Kimura
(1980), but I am totally befuddled at the thought of how to incorporate
a more general model into these calculations, i.e., considering
transitions and transversions as distinct at the same time I am
estimating whether pairwise comparisons are synonymous or nonsynonymous.
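
To make the first problem concrete, here is a rough sketch of the two gap treatments being weighed above. It assumes the provisional rule means dropping a codon position for every sequence as soon as any one sequence has a gap or nonstandard symbol there; the alternative keeps a codon position for any pair in which both sequences are unambiguous. All names and the dict-based alignment format are hypothetical.

```python
VALID = set("ACGTU")

def codons(seq):
    """Split an aligned sequence into triplets, dropping a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def is_clean(codon):
    """True if the triplet contains only A, C, G, T, or U."""
    return len(codon) == 3 and set(codon) <= VALID

def complete_deletion_sites(alignment):
    """Codon indices that are clean in EVERY sequence.

    alignment: dict of name -> aligned sequence (assumed format).
    All pairwise comparisons then use the same set of sites.
    """
    split = [codons(s) for s in alignment.values()]
    return [i for i, column in enumerate(zip(*split))
            if all(is_clean(c) for c in column)]

def pairwise_deletion_sites(seq1, seq2):
    """Codon indices that are clean in BOTH sequences of one pair.

    Each pair may then be compared over a different set of sites.
    """
    return [i for i, (c1, c2) in enumerate(zip(codons(seq1), codons(seq2)))
            if is_clean(c1) and is_clean(c2)]
```

The first function keeps the comparisons on a common footing at the cost of discarding data; the second uses more data per pair but makes the pairwise values less directly comparable.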
I am hoping that you have been through this and could possibly give
me some sample data with associated synonymous/nonsynonymous pairwise
calculations. Failing that, perhaps you can offer comments on the
Doug Eernisse <DEernisse at fullerton.edu>
Dept. Biological Science MH282
California State University
Fullerton, CA 92634