At the risk of adding to the confusion here, I bring up a point that has
troubled me for a while now. There seem to be two versions of "transition/
transversion ratio" floating around. Some folks use, say, the Kimura 2 parameter
model to estimate S, the expected number of transition changes, and V, the
expected number of transversion changes, and then use the ratio of these two
estimates to estimate R=S/V. On the other hand, others will use the same model,
which has in its rate matrix parameters s (for transitions) and v (for
transversions), and estimate the parameter r=s/v. It seems to me that the
latter route is more appropriate. The parameter R has several undesirable
features: it depends on base frequencies, and it also depends on divergence
times. These facts make it difficult to compare R between two or more data
sets. On the other hand, r is free of both of these. It seems to be a more
natural parameter. John Wakeley recently had an MBE article which touched on
this point. I don't know which parameter we are allowed to control in
Phylip (I hope Joe will tell us), but I'm not sure why it should be
necessary to perform multiple runs. If a 2 parameter model, such as Kimura's or
its extension by Hasegawa et al. (1985), is used, the information falls out
immediately. It also does not force the same ts/tv ratio on all branches of
the tree (whether that is a plus or minus, I'm not sure). Of course, it does
require estimation of 2 parameters per branch rather than one, but the
computing expense there is minimal, especially when considered next to the cost
of searching through tree space.
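
A minimal sketch of the distinction, under stated assumptions rather than as
anyone's actual implementation. The function names are hypothetical. Here
kappa stands for the rate ratio r = s/v, with v taken to be the rate of each
of the two transversion types (an assumption about the parameterization).
The first function uses the Hasegawa et al. style model to show that the
ratio of expected changes moves with base frequencies while kappa is held
fixed. The second shows the time dependence for the ratio of observed
transition to transversion differences under the Kimura model, which is only
one reading of how such a ratio gets computed in practice.

```python
import math


def ratio_of_expected_changes(kappa, pi_a, pi_c, pi_g, pi_t):
    """Expected transitions / expected transversions under an HKY-style model.

    Assumes rate(i -> j) = kappa * pi_j for transitions and pi_j for
    transversions, so kappa plays the role of r = s/v.
    """
    pi_r = pi_a + pi_g  # purines
    pi_y = pi_c + pi_t  # pyrimidines
    return kappa * (pi_a * pi_g + pi_c * pi_t) / (pi_r * pi_y)


def k2p_observed_ratio(s, v, t):
    """Observed transition/transversion difference ratio under Kimura's model.

    s: transition rate; v: rate of each of the two transversion types;
    t: total time along the path separating the two sequences.
    """
    p = 0.25 + 0.25 * math.exp(-4 * v * t) - 0.5 * math.exp(-2 * (s + v) * t)
    q = 0.5 - 0.5 * math.exp(-4 * v * t)
    return p / q


if __name__ == "__main__":
    kappa = 4.0
    # Same kappa, different base compositions (A, C, G, T).
    print(ratio_of_expected_changes(kappa, 0.25, 0.25, 0.25, 0.25))
    print(ratio_of_expected_changes(kappa, 0.40, 0.10, 0.40, 0.10))
    # Same rates, increasing divergence.
    for t in (0.01, 0.1, 0.5, 1.0):
        print(t, k2p_observed_ratio(4.0, 1.0, t))
```

With equal base frequencies the first ratio reduces to kappa/2, so even in
the simplest case the two quantities differ by a factor of two.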
Comments?
Spencer Muse
Institute of Molecular Evolutionary Genetics,
Department of Biology
Penn State University