Help! Nucleotide substitutions

Doug Eernisse Doug_Ee at um.cc.umich.edu
Mon Aug 30 17:51:25 EST 1993


Robert.Kuhelj at ijs.si recently wrote:
>  
>are there any FTPable computer programs to estimate (calculate) numbers of
>synonymous (Ks) and nonsynonymous nucleotide substitutions (Ka)?
>
>I have already found some articles describing such calculations (e.g. PNAS
>vol. 82, p.1741, and Meth. Enzymol. vol. 183, p.544) but I would prefer
>this job to be done by computer.
  
I've been meaning to implement this in one of my HyperCard stacks, DNA
Translator, which already calculates various pairwise measures of
transition/transversion differences, optionally calculated separately by
codon position, so I have gone ahead and programmed the above in the last
week or so. However, I don't have much experience with how people may
have improved these estimates in recent years. So far, I have used the
excellent description provided by Gojobori et al. (Meth. Enzymol.
vol. 183, p.531-550) to program the unweighted method of Nei and
Gojobori for estimating the numbers of synonymous and nonsynonymous
substitutions, and the one-parameter method of Jukes & Cantor for
calculating the K(s) and K(a) (or K(n)) estimates of the total numbers
of synonymous and nonsynonymous nucleotide substitutions. It doesn't
look like it would be much harder to implement the two-parameter method
of Kimura in place of the Jukes & Cantor method, and perhaps also to use
the weighted method of Miyata and Yasunaga in place of Nei & Gojobori's
unweighted method. Gojobori et al. give the
impression that, in practice, the improved estimates will likely be
rather similar to the simpler estimates. Is this impression generally
true, and when is it not true? The PNAS vol. 82, p.1741 weighted method
of Li et al. is another possibility, but this looked much trickier to
program. I also am not sure if there are any benefits to attempting to
figure out the 3-, 4-, or 6-parameter methods of Kimura discussed in
Gojobori et al. Can anyone offer advice? Does MEGA do all these and more?
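
For readers comparing the two corrections mentioned above, here is a
minimal Python sketch of the standard Jukes & Cantor one-parameter and
Kimura two-parameter distance formulas. It is not the HyperCard code; the
function names and argument names (p, P, Q) are my own assumptions. Here
p is the observed proportion of differing sites, P the proportion of
transitional differences, and Q the proportion of transversional
differences.

```python
import math


def jukes_cantor(p):
    """One-parameter correction: K = -(3/4) ln(1 - 4p/3).

    p is the observed proportion of sites that differ (hypothetical name).
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("correction is undefined unless 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(P, Q):
    """Two-parameter correction: K = -(1/2) ln((1 - 2P - Q) sqrt(1 - 2Q)).

    P = proportion of transitions, Q = proportion of transversions.
    """
    a = 1.0 - 2.0 * P - Q
    b = 1.0 - 2.0 * Q
    if a <= 0.0 or b <= 0.0:
        raise ValueError("correction is undefined for these P and Q")
    return -0.5 * math.log(a * math.sqrt(b))
```

When transitions make up exactly one third of the observed differences
(P = p/3, Q = 2p/3), the two formulas give the same value, which is one
way to see why the estimates tend to agree unless transitions are
strongly over-represented or the sequences are quite divergent.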
   
The way I have implemented these calculations, one can specify the
genetic code to employ, and the only limit on the length and number of
sequences compared is available RAM. I have my own, rather odd,
reasons for wanting to be able to perform these calculations, plus I
thought it would be a good way to explore their assumptions.
I probably won't distribute it quite yet, but do email me if you can't
find another program which does these calculations (which I doubt).
Performing the calculations by hand would be tedious, indeed.
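
To give a feel for what the hand calculation involves, here is a rough
Python sketch of unweighted Nei & Gojobori style counting with a
pluggable genetic code. It is an illustration only, not DNA Translator's
code. All names (STANDARD_CODE, syn_sites, count_differences,
ng_proportions) are hypothetical, and it assumes aligned, gap-free,
uppercase DNA of equal length with no stop codons. It also makes two
simplifying assumptions: changes to a stop codon are counted as
nonsynonymous, and pathways through a stop codon are discarded.

```python
from itertools import permutations, product

BASES = "TCAG"
# Standard genetic code in TCAG order; pass another dict for other codes.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon, code=STANDARD_CODE):
    """Number of synonymous sites in one codon (between 0 and 3)."""
    s = 0.0
    for i in range(3):
        for b in BASES:
            if b != codon[i]:
                mutant = codon[:i] + b + codon[i + 1:]
                # Assumption: a change to a stop codon counts as nonsynonymous.
                if code[mutant] == code[codon]:
                    s += 1.0 / 3.0
    return s


def count_differences(c1, c2, code=STANDARD_CODE):
    """Average (synonymous, nonsynonymous) changes over all shortest paths."""
    diff = [i for i in range(3) if c1[i] != c2[i]]
    totals, n_paths = [0, 0], 0
    for order in permutations(diff):
        cur, counts = c1, [0, 0]
        for i in order:
            nxt = cur[:i] + c2[i] + cur[i + 1:]
            if code[nxt] == "*":
                break  # assumption: discard pathways through a stop codon
            counts[0 if code[nxt] == code[cur] else 1] += 1
            cur = nxt
        else:
            totals[0] += counts[0]
            totals[1] += counts[1]
            n_paths += 1
    if n_paths == 0:
        raise ValueError("no stop-free pathway between %s and %s" % (c1, c2))
    return totals[0] / n_paths, totals[1] / n_paths


def ng_proportions(seq1, seq2, code=STANDARD_CODE):
    """Return (pS, pN): proportions of synonymous and nonsynonymous differences."""
    pairs = [(seq1[i:i + 3], seq2[i:i + 3]) for i in range(0, len(seq1), 3)]
    S = sum(syn_sites(a, code) + syn_sites(b, code) for a, b in pairs) / 2.0
    N = len(seq1) - S
    Sd = Nd = 0.0
    for a, b in pairs:
        sd, nd = count_differences(a, b, code)
        Sd += sd
        Nd += nd
    return Sd / S, Nd / N
```

The resulting pS and pN would then be passed through a one-parameter
correction of the Jukes & Cantor form to obtain K(s) and K(a). Taking the
genetic code as an ordinary dictionary argument is just one simple way to
make it selectable.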
  
Doug Eernisse


