Please read. Thanks

Daniel Weinreich dmw at MCZ.HARVARD.EDU
Thu Feb 6 15:29:55 EST 1997

Dear Xuhua,
Sorry it's taken me two weeks to get this reply together.  And
nevertheless it may be totally off the mark, but here goes. 

I like your idea that second position (nonsynonymous) mutations change the
amino acid more (in functional e.g. Grantham distance terms) than do
nonsynonymous first and third position mutations, and that that effect helps
explain the low frequency of second position substitutions.  My only quibble
is with the way you frame your null hypothesis.  But I suspect that a more
realistic expectation won't detract from your most impressive p-value!

> 	Of the 60 mitochondrial codons, there are 190 possible
> nonsynonymous codon pairs in which one codon can mutate into the other
> through a single nucleotide substitution (e.g., ACU-GCU). Of these 190
> pairs, 82 pairs differ at the first codon site (e.g., CCU and UCU), 84 at
> the second codon site (e.g., CCU and CGU) , 24 at the third codon site
> (e.g., CAU and CAA). Thus, when we compare two DNA sequences and count
> nonsynonymous substitutions, we expect 43.16% (=82/190) of the
> nonsynonymous substitutions to fall on the first codon site, 44.21% on
> the second codon site, and only 12.63% on the third codon site.

This H0 assumes all 60 codons appear with equal frequency, AND that all
190 nonsynonymous point mutations occur with equal likelihood.  Of course
neither of these assumptions holds (and certainly not in mtDNA).  It
seems to me that you must weight your expectations by observed codon (and
especially amino acid) usage frequencies, and observed point mutation
frequencies (somehow inferred from 3rd positions?).  Let me give extreme
examples of each effect: 

CODON BIAS:  If all serines were coded for by UCC then there are three
nonsynonymous second position point mutations possible, but if all serines
were coded for by UCA, only one nonsynonymous second position point
mutation is possible (two in metazoan mtDNA), since the other two (one)
go to stops. 
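
The serine counts above can be checked mechanically. The Python sketch below counts the second-position point mutations of a codon that change the amino acid without creating a stop. The function name and table construction are illustrative assumptions, and the "mito-like" variant models only the UGA -> Trp reassignment, ignoring every other mitochondrial difference.

```python
BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks a stop.
AAS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
       "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD = dict(zip(CODONS, AAS))

# Assumption: a simplified mitochondrial variant with UGA read as Trp only.
MITO_LIKE = dict(STANDARD, UGA="W")


def nonsyn_mutations(codon, position, code):
    """Map each mutant codon to its amino acid for single-base changes at
    `position` (0-based) that alter the amino acid and do not yield a stop."""
    result = {}
    for base in BASES:
        if base == codon[position]:
            continue
        mutant = codon[:position] + base + codon[position + 1:]
        aa = code[mutant]
        if aa != "*" and aa != code[codon]:
            result[mutant] = aa
    return result


for codon in ("UCC", "UCA"):
    for name, code in (("standard", STANDARD), ("mito-like", MITO_LIKE)):
        hits = nonsyn_mutations(codon, 1, code)
        print(codon, name, len(hits), sorted(hits.items()))
```

Passing position 0 or 2 gives the corresponding first- and third-position counts.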

AMINO ACID BIAS:  A peptide composed entirely of Val, Ala, Asp, Glu and
Gly (Gxx codons chosen for simplicity of analysis) has 99 possible
nonsynonymous point mutations, of which 47.5% are at the first position,
48.5% at the second and 4% at the third.  These expectations are
distinctly different from the ones you've calculated. 

MUTATION BIAS:  If all serines were coded for by UCC but transitions
outnumber transversions 100:1, then for all intents and purposes only 
one nonsynonymous second position mutation would be possible, even though 
the genetic code would seem to permit three.

I think you must account for these effects in setting up your null 
against which you test your observations.
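
One way such a weighted null might be set up is sketched below in Python. It is only an illustration under stated assumptions: the standard genetic code, directed mutations away from each observed codon, stops excluded, and a single transition:transversion weight `kappa`. The names `position_shares` and `kappa` are hypothetical. Because counting conventions differ (directed mutations versus unordered codon pairs, choice of genetic code, treatment of stops), its output need not match the percentages quoted in this message.

```python
BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks a stop.
AAS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
       "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODE = dict(zip(CODONS, AAS))
TRANSITIONS = {("U", "C"), ("C", "U"), ("A", "G"), ("G", "A")}


def position_shares(codon_freqs, code=CODE, kappa=1.0):
    """Expected share of nonsynonymous mutations at codon positions 1, 2, 3.

    codon_freqs: observed codon usage (codon -> count or frequency).
    kappa: assumed rate of a transition relative to a transversion.
    """
    totals = [0.0, 0.0, 0.0]
    for codon, freq in codon_freqs.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                # Skip synonymous changes and changes to a stop codon.
                if code[mutant] in ("*", code[codon]):
                    continue
                rate = kappa if (codon[pos], base) in TRANSITIONS else 1.0
                totals[pos] += freq * rate
    grand = sum(totals)
    return [t / grand for t in totals]


# Example: a peptide using only the Gxx codons, all equally often.
gxx = {c: 1.0 for c in CODONS if c.startswith("G")}
flat = position_shares(gxx)                 # all point mutations equally likely
biased = position_shares(gxx, kappa=100.0)  # transitions favoured 100:1
print([round(x, 3) for x in flat])
print([round(x, 3) for x in biased])
```

Replacing `gxx` with observed codon counts, and `kappa` with a rate inferred from the data, would give expectations against which observed substitution counts could be tested.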

> ... [data analysis and very low p-value deleted]...

> 	For the 82, 84 and 24 nonsynonymous codon pairs that differ at the
> first, second, and third codon sites, respectively, the mean Grantham's
> distance is 68.9, 100.5 and 68.3, respectively, with the first and third
> mean significantly smaller than the second mean. This confirms our
> hypothesis that nonsynonymous substitutions due to a nucleotide
> substitution at the second codon site involve amino acid replacement with
> more dramatic effect than those due to nucleotide substitutions at the
> first and second codon sites. 

Here too it seems to me that you should consider codon, amino acid and
mutation frequencies and calculate your mean Grantham's distances after
weighting for the likelihood of each nonsynonymous mutation occurring.

For example, if all serines were coded for by UCA, then its contribution
to the grand mean second position pairwise Grantham distance is 145 (Ser
<-> Leu), but if all serines were AGY, then their contribution to the mean
second position Grantham distance becomes 82 (mean of Ser <-> Asn [46],
Ser <-> Thr [58] and Ser <-> Ile [142]), and that's before weighting by
amino acid or mutation frequency.
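
The arithmetic, and the effect of weighting, can be sketched in a few lines of Python. The four Grantham distances are the ones quoted in this message. The rest is assumption: the serine codons are taken to be AGU/AGC (whose second-position neighbours under the standard code are Asn, Thr and Ile), and the 100:1 transition weight is borrowed from the hypothetical mutation-bias example purely for illustration.

```python
# Grantham distances as quoted in this message.
GRANTHAM = {
    ("Ser", "Leu"): 145,
    ("Ser", "Asn"): 46,
    ("Ser", "Thr"): 58,
    ("Ser", "Ile"): 142,
}


def weighted_mean(pairs_with_weights):
    """Mean Grantham distance over (pair, weight) items."""
    total_weight = sum(w for _, w in pairs_with_weights)
    return sum(GRANTHAM[p] * w for p, w in pairs_with_weights) / total_weight


# Unweighted: every second-position change counted equally.
unweighted = weighted_mean([
    (("Ser", "Asn"), 1),
    (("Ser", "Thr"), 1),
    (("Ser", "Ile"), 1),
])

# Assumed weighting: in AGY the G -> A change (Ser -> Asn) is a transition,
# given a hypothetical weight of 100; the two transversions get weight 1.
weighted = weighted_mean([
    (("Ser", "Asn"), 100),
    (("Ser", "Thr"), 1),
    (("Ser", "Ile"), 1),
])

print(unweighted)          # plain mean of 46, 58 and 142
print(round(weighted, 2))  # mean pulled toward the transition's distance
```

Under this assumed bias the contribution falls well below the unweighted mean, which is why the weighting step can matter in principle even if it does not change the conclusion here.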

In both your H0 and your Grantham-distance analysis I suspect these
modifications will not change your conclusions.

Or maybe I missed something basic?  ;-D


Daniel M. Weinreich			email: dmw at mcz.harvard.edu
Harvard University 			usmail: 26 Oxford Street
Museum of Comparative Zoology			Cambridge, MA 02138
voice: (617) 495-1954			fax: (617) 495-5846

More information about the Mol-evol mailing list
