
Ka/Ks ratio

A.P.Jason de Koning apjdk at albany.edu
Wed Oct 16 21:28:14 EST 2002

Hi Michael,

James said:
> We have seen this case and we just handle it by saying that positive
> selection has been very strong - just non-synonymous changes and no time
> observations of silent changes.

I do not agree with James' comments.  Having an inestimable dS may or *may
not* be indicative of positive selection.  This is a common issue when
analysing fairly closely related taxa, or trees with short internal
branches.  If the total number of substitutions is small, then by
chance (under a null model of neutral evolution) you should expect to often
see zero synonymous changes, since there are typically several times fewer
synonymous sites than non-synonymous sites.  As George suggested, however,
you can still test whether dN is
'significantly' greater than dS on a branch where dS is inestimable.  One
way is to constrain omega (dN/dS) on that lineage to equal 1.0 (see the PAML
documentation), and evaluate the likelihood of your data under this null
model.  Compare the likelihoods under both models with a likelihood ratio
test (LRT):

LR:  2 x ( lnL of original model - lnL of constrained model )

where lnL is the log-likelihood that PAML reports for each model.

You can compare this statistic to a chi-squared distribution with 1 degree
of freedom, to test if the more complex model (the original) fits your data
significantly better than the constrained model (null).  If it does (P<~0.05),
then dN is significantly elevated beyond dS (possibly positive selection),
though you can't really tell by how much.  For short branches, such as in
your case, expect the test to have low power, and to be conservative.  Have
a look at Yang's paper for details on how one can set up different tests of
positive selection using LRTs (Yang, 1998.  MBE 15:568-573).
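
As a minimal sketch of the LRT above, here it is in Python using only the
standard library.  The function name and the two lnL values are made up for
illustration; in practice you would take the lnL of each model from your own
PAML output.  With 1 degree of freedom, the chi-squared tail probability can
be written as erfc(sqrt(LR/2)), so no statistics package is needed.

```python
import math


def lrt_pvalue_1df(lnl_free, lnl_null):
    """Likelihood ratio test with 1 degree of freedom.

    lnl_free: log-likelihood of the original model (omega estimated).
    lnl_null: log-likelihood of the constrained model (omega fixed at 1).
    Returns (LR statistic, P-value).
    """
    stat = 2.0 * (lnl_free - lnl_null)
    # Small negative values can appear from optimiser noise; treat them as 0.
    stat = max(stat, 0.0)
    # Upper tail of a chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods, NOT taken from any real analysis.
stat, p_value = lrt_pvalue_1df(-1234.56, -1236.98)
print(f"LR = {stat:.2f}, P = {p_value:.4f}")
```

With these made-up values LR = 4.84 and P is roughly 0.028, so the
constrained model would be rejected at the 0.05 level.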

An alternative approach is to use simpler statistics.  Basically, to test
for positive selection you want to assess if the ratio of non-synonymous to
synonymous differences (n:s) is the same as the ratio of non-synonymous to
synonymous sites (N:S).  If it is not, then neutral evolution is rejected.
If you look at PAML's output, you can get an estimate of the number of
non-synonymous differences (n): multiply dN for your short branch by N, and
you get the approximate number of non-synonymous substitutions along that
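branch.  The number of synonymous differences (s) is obtained the same way
(dS x S; zero in your case).  You can then use PAML's estimates of N and S,
together with n and s, to perform Fisher's exact test.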

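The probability of the observed 2x2 table (changed vs. unchanged sites,
non-synonymous vs. synonymous) is:
    P = { [ (n+s)! (N-n + S-s)! N! S!] / (N + S)! } / [ n!s!(N-n)!(S-s)! ]

The P-value of the test is this probability summed over the observed table
and all more extreme tables with the same margins.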

In practice, you'll have to round the argument of each factorial to a whole
number, since the estimates of N, S, n and s are generally not integers.
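
A minimal sketch of this test in Python (standard library only) might look
like the following.  It sums the table probability over all tables with at
least n non-synonymous changes, i.e. a one-sided test, assuming the
alternative of interest is an excess of non-synonymous changes.  The function
name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_one_sided(n, s, N, S):
    """One-sided Fisher's exact test for an excess of non-synonymous changes.

    n, s: non-synonymous and synonymous differences on the branch.
    N, S: non-synonymous and synonymous sites.
    All four are rounded to whole numbers before use.
    """
    n, s, N, S = (round(x) for x in (n, s, N, S))
    k = n + s                      # total number of changes (fixed margin)
    denom = comb(N + S, k)         # ways to place k changes on N + S sites
    # Sum the probability of every table with i >= n non-synonymous changes.
    tail = sum(comb(N, i) * comb(S, k - i) for i in range(n, min(k, N) + 1))
    return tail / denom


# Hypothetical counts: 6 non-synonymous and 0 synonymous changes,
# with 450 non-synonymous and 150 synonymous sites.
p = fisher_exact_one_sided(6, 0, 450, 150)
print(f"P = {p:.3f}")
```

With these made-up counts P is roughly 0.18, which illustrates how a branch
with zero synonymous changes need not show a significant excess of
non-synonymous changes.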

Remember, testing for positive selection requires not just the estimation of
dN and dS, but ALSO a test of their equality.

Hope this helps.

- Jason

 A.P. Jason de Koning, Doctoral student        Email: apjdk at albany.edu
  Department of Biological Sciences              Lab: (518) 442-4347
  University at Albany, SUNY                     FAX: (518) 442-4767
  1400 Washington Ave., Albany NY 12222, USA   Mobil: (518) 210-4504

"Michael Steiper" <steiper at fas.harvard.edu> wrote in message
news:aoi30d$niv$1 at mercury.hgmp.mrc.ac.uk...
> Hi-
> I am trying to detect positive natural selection in a group of sequences
> using Ka/Ks comparisons.  The problem is that my number of Ks is zero in
> one of my branches.  With this zero in the denominator, I get undefined
> answers and strange results using the PAML program.
> Has anyone seen a case like this?  It seems like someone must have
> encountered this before, and I'd like to see how it was handled.  Any
> other ideas might be helpful, too.
> Thanks,
> Michael Steiper

