
relative rate test

Joe Felsenstein joe at evolution.genetics.washington.edu
Wed Jul 17 18:34:26 EST 1996

(Spencer Muse has asked me to post this for him as he is afflicted by
mailer problems -- J. Felsenstein)

In article <4sgfu9$mcv at nntp3.u.washington.edu>,
joe at evolution.genetics.washington.edu (Joe Felsenstein) writes:
|> In article <v01510100ae0fe8aa18c8@[]>,
|> Michael Nedbal <dna_seq at FMPPR.FMNH.ORG> wrote:
|> >With regards to the LRT, how does one determine
|> >which taxon (or taxa) are responsible for the rate heterogeneity?  
|> Actually likelihood ratio tests can also be done with three species,
|> it is just that one of the key programs (DNAML) happens not to be 
|> able to cope with three species, for purely silly reasons.  We're 
|> working on fixing that in the next major version.  

In the meantime there are several options as far as available programs
go. I have a series of likelihood-based relative rate tests available by
anonymous ftp at kurtz.bio.psu.edu in the directory pub/ratetests. There
are tests based on the model of Hasegawa et al 1985 and also on the
codon-based model of Muse and Gaut 1994. The latter are the primary
methods of interest to me, since Tajima 1993 demonstrated that a simple
sign test construction of the rr test has power virtually identical to
the likelihood ratio tests of Muse and Weir 1992 (the ones based on the
HKY model), with less complexity and more relaxed assumptions. The
codon-based tests allow separate tests for synonymous and nonsynonymous
substitution rates, in a more rigorous manner than other options. The
other options include rr tests by Wu and Li 1985, Li et al 1987, and
several later modifications (mostly changes to the distance estimates for
synonymous and nonsynonymous rates). There is also the test of Li and
Bousquet 1992 which uses a rr framework to ask if _clades_ A and B have
evolved at equal rates, assuming equal rates within each clade. Takezaki
and Nei recently published what amounts to a clock test. Finally, I
believe that Ziheng Yang's PAML package will allow for relative rate
tests to be constructed, as well as the more general clock test that Joe
already mentioned. 
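
For readers who want to see how little machinery the Tajima 1993 test needs, here is a minimal sketch in Python. It assumes the commonly quoted chi-square form of the test: with sequences 1 and 2 and an outgroup, count the sites where only sequence 1 differs (m1) and the sites where only sequence 2 differs (m2), then compare (m1 - m2)^2 / (m1 + m2) to a chi-square distribution with 1 degree of freedom. The function name, the handling of gaps and ambiguity codes, and the example sequences are illustrative assumptions, not anything taken from the programs in pub/ratetests.

```python
import math


def tajima_relative_rate(seq1, seq2, outgroup):
    """Sketch of a Tajima-style relative rate test on three aligned sequences.

    Returns (m1, m2, chi2, p_value). Sites containing anything other than
    A, C, G or T are skipped (an assumption made for this sketch).
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    m1 = 0  # sites where only sequence 1 differs from the other two
    m2 = 0  # sites where only sequence 2 differs from the other two
    for a, b, c in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        if not {a, b, c} <= valid:
            continue
        if b == c and a != c:
            m1 += 1
        elif a == c and b != c:
            m2 += 1

    if m1 + m2 == 0:
        # No informative sites: no evidence against equal rates.
        return m1, m2, 0.0, 1.0

    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return m1, m2, chi2, p_value


if __name__ == "__main__":
    # Made-up toy data, purely for illustration.
    out = "ACGTACGTAC"
    s1 = "TCGTCCGTTA"
    s2 = "ACGTACGTAC"
    print(tajima_relative_rate(s1, s2, out))
```

Sites where all three sequences differ, or where sequences 1 and 2 agree with each other but not with the outgroup, carry no information about which of the two lineages is faster, so the sketch ignores them.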

|> No one has ever
|> specified what one does with the RRT with more than three species, 
|> nor even how to do the three species case statistically.

I'm not sure what you mean here, Joe. Clearly valid statistical versions
of the relative rate test exist. Could you clarify?

|> If you want to know which branch or clade in the tree is responsible
|> for the rate heterogeneity, one could fit trees that were clocklike
|> except for having that branch (or all of that clade, alternatively)
|> evolving at R times the rate of the rest of the tree.  This is not 
|> easy to do with present-day programs, alas.  Aside from making that
|> possible, there are very interesting questions about how to test 
|> which branch (clade) it was that was going R times as fast.
|> But however that is to be done in a likelihood framework, the RRT has
|> more problems as it cannot tell you how to combine all the 
|> three-taxon tests.

Agreed. BUT, that does not imply that doing many or all pairwise
comparisons is a useless thing to do. And, in fact, many of the tests
_are_ independent (this can be argued along the lines from Felsenstein's
1985 (?) article on independent contrasts). Patterns arising from
exhaustive pairwise tests are often amazingly strong; so strong that it
would take incredibly high levels of correlation to explain them. But I
will reiterate Joe's comment that it isn't at all clear how the
collection of results should be properly interpreted. Rejection of a
global clock test should certainly be a prerequisite for looking further
at individual tests, reminiscent of so-called "F-protected" pairwise
comparisons in traditional ANOVA, where differences among particular
treatments aren't checked for unless a global F test indicates real
treatment differences. 
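
The "protected" logic described above can be sketched in a few lines of Python. This is only an illustration of the gating idea, assuming the p-values for the global clock test and for each pairwise relative rate test have already been computed elsewhere; the function name, the taxon labels, and the 0.05 threshold are hypothetical. It deliberately does nothing about the correlation among the pairwise tests, which remains the open question.

```python
def protected_pairwise(global_p, pairwise_p, alpha=0.05):
    """Report pairwise rejections only if the global clock test rejects.

    global_p   -- p-value from a global molecular clock test
    pairwise_p -- dict mapping (taxon_a, taxon_b) to a relative rate p-value
    alpha      -- significance threshold (an assumed, conventional value)
    """
    if global_p >= alpha:
        # The global test did not reject the clock, so the individual
        # pairwise tests are not examined at all.
        return []
    return sorted(pair for pair, p in pairwise_p.items() if p < alpha)


if __name__ == "__main__":
    # Made-up p-values, purely for illustration.
    tests = {("A", "B"): 0.001, ("A", "C"): 0.30, ("B", "C"): 0.02}
    print(protected_pairwise(0.01, tests))
    print(protected_pairwise(0.20, tests))
```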

On a cynical final note, I do find it a bit intriguing that so much
concern is given to multiple correlated tests in one paper, but the
problem is almost completely ignored when a result from one paper is
followed up (in a correlated way) in other papers. (This is meant to be a
general comment, not pointed at the rr test problem in particular.)


Spencer V. Muse                               | (814)863-7045
Institute of Molecular Evolutionary Genetics, | muse at kurtz.bio.psu.edu
Department of Biology                         |
Penn State University                         | FAX: (814)865-9131
University Park PA 16802-5301                 |

More information about the Mol-evol mailing list
