estimating Ks and Ka from phylogenetic trees

Frank Wright frank at sass.sari.ac.uk
Mon Mar 15 10:14:05 EST 1993

A colleague is interested in testing whether the synonymous (Ks)
and non-synonymous (Ka) distances are statistically different for a
protein-coding DNA sequence.  The data available consist of
seven sequences known to have the following tree topology:

                                /  v
  a \                          /\  w
     >------------------------</   x
  b /                          \/  y
                                \  z

Rather than use the theoretical formula for the variance of the
distance estimate (Jukes-Cantor method), applied to the mean distance
(the average of the 10 pairwise distances between the <a,b> and
<v,w,x,y,z> clusters), I presume that a more accurate test would be
a t-test on Ka-Ks for the 10 pairwise distances from <a,b> to
<v,w,x,y,z>, tested against an H0 that Ka-Ks = 0.

I'm assuming that the distances within the 2 clusters are negligible
compared to the branch length connecting them, and that the 10
pairwise distances give a more reliable estimate of the variance of
the distance estimate than does the theoretical formula for the
variance of the Jukes-Cantor distance.
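
To make the two alternatives concrete, here is a minimal sketch in
Python.  It is only an illustration under stated assumptions: the Ka
and Ks values are invented placeholders (not real data), the function
names are my own, and the variance formula is the usual large-sample
one for the Jukes-Cantor distance, where p is the proportion of
differing sites and n_sites is the number of sites compared.  The
sketch takes the assumptions stated above as given, including
treating the 10 between-cluster distances as the sample for the
t-test.

```python
import math
import statistics


def jc_distance(p):
    """Jukes-Cantor distance from a proportion p of differing sites (p < 0.75)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_variance(p, n_sites):
    """Large-sample variance of the Jukes-Cantor distance over n_sites sites."""
    return p * (1.0 - p) / (n_sites * (1.0 - 4.0 * p / 3.0) ** 2)


def one_sample_t(diffs):
    """Return (t statistic, degrees of freedom) for H0: mean(diffs) = 0."""
    n = len(diffs)
    mean = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)  # sample SD / sqrt(n)
    return mean / std_err, n - 1


# HYPOTHETICAL values, for illustration only: one (Ka, Ks) pair for each
# of the 10 comparisons between the <a,b> and <v,w,x,y,z> clusters.
ka = [0.11, 0.12, 0.10, 0.13, 0.11, 0.12, 0.11, 0.10, 0.13, 0.12]
ks = [0.35, 0.38, 0.33, 0.36, 0.37, 0.34, 0.36, 0.35, 0.39, 0.37]

diffs = [a - s for a, s in zip(ka, ks)]
t_stat, df = one_sample_t(diffs)

# Two-sided 5% critical value of Student's t with 9 degrees of freedom.
T_CRIT_DF9 = 2.262

print(f"mean(Ka-Ks) = {statistics.mean(diffs):.4f}, t = {t_stat:.2f}, df = {df}")
print("reject H0 at 5%" if abs(t_stat) > T_CRIT_DF9 else "do not reject H0 at 5%")
```

The theoretical alternative would instead plug the observed proportion
of differences into jc_variance() and compare the difference in mean
distances with its standard error.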

Is this a reasonable approach?  

Frank Wright
SASS, University of Edinburgh,
Edinburgh, Scotland, U.K.

e-mail: frank at sass.sari.ac.uk

