Codon rate ML models
johnh at brahms.biology.rochester.edu
Thu Dec 10 20:53:05 EST 1998
I believe that DNAML has (or had) a feature allowing the user to assign
sites to different categories, each potentially with its own rate. Citing
the PHYLIP manual may be appropriate for that approach.
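To make the category idea concrete, here is a rough sketch of what assigning sites to codon-position categories amounts to. It is not DNAML's or PAUP*'s input format or code. The function names and the relative rates (0.5, 0.2, 2.3) are made up for illustration, and Jukes-Cantor is used only because its formula is short.

```python
import math

def codon_position_categories(n_sites, offset=0):
    """Return category 0, 1 or 2 (first, second, third codon position) per site."""
    return [(i + offset) % 3 for i in range(n_sites)]

def normalize_rates(rates, categories):
    """Rescale the category rates so that the average rate over sites is 1."""
    mean = sum(rates[c] for c in categories) / len(categories)
    return [r / mean for r in rates]

def jc69_p_diff(branch_length, rate):
    """Jukes-Cantor probability that a site differs across one branch,
    with the branch length scaled by the site's relative rate."""
    return 0.75 * (1.0 - math.exp(-4.0 / 3.0 * rate * branch_length))

# Hypothetical relative rates for first, second and third positions.
cats = codon_position_categories(9)
rel = normalize_rates([0.5, 0.2, 2.3], cats)
p_by_site = [jc69_p_diff(0.1, rel[c]) for c in cats]
```

Every site in a category shares one rate, so the tree and branch lengths are common to all sites and only the scaling differs by codon position.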
You may want to look at Goldman and Yang (MBE, 1994) and Muse and Gaut (MBE, 1994)
in which the model of DNA substitution is expanded around the codon. That
is, instead of modeling the substitution process on a site-by-site basis,
you model the process on a codon basis. This means that the instantaneous
rate matrix is 61 X 61 (one state per sense codon) instead of 4 X 4. The
nice thing here is that, when you have a synonymous/nonsynonymous rate
bias, the model mimics the pattern seen in real data sets, where the rate
at second positions is lowest, followed by first and then third positions.
Also, you can estimate
something of interest (i.e., the synonymous/nonsynonymous rate ratio) and
also relax, to some extent, the assumption of independence among sites;
instead of assuming that the substitution process at each site is
independent, you assume that substitutions at different codons are
independent. I believe that Ziheng Yang's program, PAML, implements
the codon model. My impression is that the practical utility of these
models is about where that of 4 X 4 models was 6 years ago: it is
possible to implement the method for relatively small data sets (say,
25 species) on fast machines with lots of memory.
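For anyone who wants to see the 61 X 61 matrix spelled out, here is a minimal sketch. It assumes a simplified parameterization with a transition/transversion ratio (kappa) and a nonsynonymous/synonymous ratio (omega), the standard genetic code, and equal codon frequencies by default. This is not necessarily the exact form used in either paper or in PAML, and all names are mine.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE = [c for c in CODE if CODE[c] != "*"]  # the 61 sense codons
TRANSITIONS = {frozenset("AG"), frozenset("CT")}

def codon_rate_matrix(kappa, omega, freqs=None):
    """Build an instantaneous rate matrix over sense codons (assumed form)."""
    n = len(SENSE)
    if freqs is None:
        freqs = [1.0 / n] * n
    Q = [[0.0] * n for _ in range(n)]
    for i, ci in enumerate(SENSE):
        for j, cj in enumerate(SENSE):
            diffs = [(a, b) for a, b in zip(ci, cj) if a != b]
            if len(diffs) != 1:
                continue  # identical codons or multi-nucleotide changes: rate 0
            rate = freqs[j]
            if frozenset(diffs[0]) in TRANSITIONS:
                rate *= kappa  # transition
            if CODE[ci] != CODE[cj]:
                rate *= omega  # nonsynonymous change
            Q[i][j] = rate
        Q[i][i] = -sum(Q[i])  # diagonal is still 0 here, so each row sums to 0
    return Q

def position_rates(Q, freqs=None):
    """Frequency-weighted rate of change at codon positions 1, 2 and 3."""
    n = len(SENSE)
    if freqs is None:
        freqs = [1.0 / n] * n
    rates = [0.0, 0.0, 0.0]
    for i, ci in enumerate(SENSE):
        for j, cj in enumerate(SENSE):
            if i != j and Q[i][j] > 0.0:
                pos = next(p for p in range(3) if ci[p] != cj[p])
                rates[pos] += freqs[i] * Q[i][j]
    return rates

Q = codon_rate_matrix(kappa=2.0, omega=0.2)
first, second, third = position_rates(Q)
```

With omega below 1, as in the call above, the second-position rate comes out lowest and the third-position rate highest. That is the pattern described above, and it falls out of the genetic code without any position-specific rate parameters.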
In article <74n23o$m88 at net.bio.net>, Thomas Buckley
<Thomas.Buckley at vuw.ac.nz> wrote:
>Can someone please give me references to any papers that discuss and/or
>use maximum likelihood models that assign relative rates to different
>codon positions as a method of modelling among-site rate variation.
>This method is discussed briefly in the Swofford et al. (1996) molecular
>systematics chapter, and implemented in PAUP*4.0, but I can't track down
>any other references.