Molecular clocks and topologies

Andrew Rambaut andrew.rambaut at zoo.ox._remove_this_.ac.uk
Tue Sep 19 07:23:43 EST 2000


[[ This message was both posted and mailed: see
   the "To," "Cc," and "Newsgroups" headers for details. ]]

In article <8q6b79$hvt$1 at mercury.hgmp.mrc.ac.uk>, Chris Conroy
<chris.conroy at stanford.edu> wrote:

>      I would like to date certain nodes within a molecular phylogeny by
> calibrating other parts with fossil dates and using the branch lengths
> of a clock-constrained tree as units of time.  Unfortunately, a recent
> test I did found the clock constrained tree was significantly less
> likely than the unconstrained tree, but was also a different topology.
> These are ML trees based on a GTR+I+G model. Could anyone out there
> offer a succinct explanation for what a program such as PAUP is doing as
> it forces a molecular clock constraint?  Why would the topology have to

To understand this, consider how much PAUP has to stretch or
shrink the non-clock branch lengths to obtain the
clock-constrained branch lengths. If the sequences evolved in a
truly clock-like manner, these differences would be small,
stochastic and non-significant. If they did not, a change of
topology may reduce the likelihood less than forcing some branch
lengths to change would.
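
For reference, the usual clock test is a likelihood ratio test with
N-2 degrees of freedom (an unrooted tree has 2N-3 free branch lengths,
a clock tree has N-1 node heights). A minimal Python sketch, assuming
both log-likelihoods were obtained on the same topology so that the
models are nested; the function names and the example numbers are
illustrative only and are not from any analysis:

```python
import math

def chi2_sf(x, df):
    """Upper tail probability of a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from the closed form for df = 2 (even) or df = 1 (odd) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # ... and step up with Q(a+1, y) = Q(a, y) + y^a e^-y / Gamma(a+1).
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q

def clock_lrt(lnl_nonclock, lnl_clock, n_taxa):
    """Return (statistic, d.f., p-value) for the molecular clock test."""
    stat = 2.0 * (lnl_nonclock - lnl_clock)
    df = n_taxa - 2
    return stat, df, chi2_sf(stat, df)

# Made-up log-likelihoods for a 10-sequence data set.
stat, df, p = clock_lrt(-1000.0, -1010.0, 10)
print(stat, df, p)
```

If the two trees differ in topology, as in your case, the models are
not nested and this p-value should be treated only as a rough guide.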

There is another possibility: PAUP may not have found the maximum
likelihood clock tree. With these constraints, the branch-length
optimisation becomes more difficult and the heuristic search can
become stuck on a local optimum. There are also more topologies to
search: a clock tree is rooted, and there are 2N-3 times as many
rooted topologies as unrooted ones (N being the number of sequences).
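
The size of the search space can be checked directly. A clock tree is
rooted, and each unrooted bifurcating topology can be rooted on any of
its 2N-3 branches. A small Python sketch using the standard counting
formulas (the function names are my own):

```python
def double_factorial(n):
    """n!! = n * (n-2) * (n-4) * ...; taken as 1 for n <= 0."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def n_unrooted(n_taxa):
    """Number of unrooted bifurcating topologies, (2N-5)!!, for N >= 3."""
    return double_factorial(2 * n_taxa - 5)

def n_rooted(n_taxa):
    """Number of rooted bifurcating topologies, (2N-3)!!, for N >= 2."""
    return double_factorial(2 * n_taxa - 3)

for n in (4, 10, 20):
    ratio = n_rooted(n) // n_unrooted(n)  # always 2N-3
    print(n, n_unrooted(n), n_rooted(n), ratio)
```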

One check is to take your non-clock tree topology, root it at all
possible positions (perhaps using TreeEdit:
<http://evolve.zoo.ox.ac.uk/software/TreeEdit/>) and pick the rooting
with the highest clock-constrained likelihood. Compare this with the
tree you found by searching under clock constraints.
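
If you prefer to script the re-rooting, the idea can be sketched in
Python as follows. This is only an illustration: the tree is given as
a list of branches between named nodes, branch lengths are ignored,
and clock_lnl is a hypothetical function that you would have to supply
yourself (for example a wrapper that scores each rooted tree in PAUP
with the clock enforced).

```python
def rootings(edges):
    """Yield one rooted Newick string per branch of an unrooted tree."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    def subtree(node, parent):
        # Everything hanging off `node` when looking away from `parent`.
        children = [n for n in adj[node] if n != parent]
        if not children:
            return str(node)  # a tip
        return "(" + ",".join(subtree(c, node) for c in children) + ")"

    # Placing the root on branch (u, v) splits the tree into two sides.
    for u, v in edges:
        yield "(" + subtree(u, v) + "," + subtree(v, u) + ");"

def best_rooting(edges, clock_lnl):
    """Return the rooting with the highest clock-constrained lnL.

    clock_lnl is an assumed user-supplied callable: Newick -> float.
    """
    return max(rootings(edges), key=clock_lnl)

# Unrooted 4-taxon tree with internal nodes i1 and i2.
edges = [("A", "i1"), ("B", "i1"), ("i1", "i2"), ("C", "i2"), ("D", "i2")]
for newick in rootings(edges):
    print(newick)
```

For N sequences this gives 2N-3 rooted trees to score, which is
usually far cheaper than a full heuristic search under the clock.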

> change? Is there any reason to pinpoint sequences that look like they
> have long branches in the unconstrained tree to leave out?  What about
> those sequences that jump around between constrained and unconstrained
> trees?  Is it "fair" or statistically valid to sequentially remove
> outliers until the tree behaves in a clock-like manner?  I have seen
> this latter technique used, but was curious about what the general
> opinion is.

Possibilities:

1) Model the change in rate across the tree:

Huelsenbeck, J.P., Larget, B. and Swofford, D.L. (2000) A compound
Poisson process for relaxing the molecular clock. Genetics 154:
1879-1892.

Thorne, J.L., Kishino, H. and Painter, I.S. (1998) Estimating the rate
of evolution of the rate of molecular evolution. Mol. Biol. Evol. 15:
1647-1657.

See also (non-parametric approach):

Sanderson, M.J. (1997) A nonparametric approach to estimating
divergence times in the absence of rate constancy. Mol. Biol. Evol. 14:
1218-1231.

2) Allow a different rate for each part of the tree for which a fossil
calibration is available:

Rambaut, A. and Bromham, L. (1998) Estimating divergence dates from
molecular sequences. Mol. Biol. Evol. 15: 442-448.

3) Either remove outliers or allow them to have different rates. I have
a number of cases where there is one clear outlier and the clock test
goes from highly significant to non-significant when this is allowed a
different rate. This process is essentially unconstraining them:
keeping them in the analysis but allowing each its own branch
length. This means you can compare this model with the
clock-constrained model with 1 d.f. and see whether doing this makes a
significant improvement. Ideally you should pick your outlier
_a_priori_ using some other criterion (e.g. it has been identified as
fast evolving using other data, or has a strange life history).
If not, you may have to do some simulations to produce an
appropriate null distribution:

Simulate on the clock-constrained tree. For each simulation relax
the branch length for each tip in turn and pick the one that gives
the biggest improvement in likelihood. Add this likelihood ratio to
your null distribution. Repeat. 
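
In outline, the simulation loop above might look like this in Python.
It is a sketch under assumptions: simulate, fit_clock and
fit_relaxed_tip are hypothetical functions you would supply (e.g.
wrappers around a sequence simulator and a likelihood program), the
statistic is taken as 2 x the gain in log-likelihood, and the +1
correction in the p-value is one common Monte Carlo convention rather
than anything specific to this problem.

```python
def outlier_null_distribution(simulate, fit_clock, fit_relaxed_tip,
                              tips, n_reps):
    """Null distribution of the best single-tip relaxation statistic.

    simulate()                 -> a data set simulated on the clock tree
    fit_clock(data)            -> lnL under the strict clock
    fit_relaxed_tip(data, tip) -> lnL with `tip` given its own branch length
    All three are assumed, user-supplied callables.
    """
    null = []
    for _ in range(n_reps):
        data = simulate()
        lnl_clock = fit_clock(data)
        # Relax each tip in turn and keep the biggest improvement.
        best = max(fit_relaxed_tip(data, tip) for tip in tips)
        null.append(2.0 * (best - lnl_clock))
    return null

def empirical_p(observed, null):
    """Proportion of null statistics at least as large as the observed one."""
    exceed = sum(1 for s in null if s >= observed)
    return (exceed + 1) / (len(null) + 1)
```

The observed statistic must be computed in the same way on the real
data, i.e. by relaxing each tip in turn and taking the best one, so
that the selection of the outlier is built into the null distribution.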

If you want to discuss this further, email me (I have various programs
that can do these sorts of analyses).

Andrew

===================================================================
  Andrew Rambaut,             EMAIL - andrew.rambaut at zoo.ox.ac.uk
  Zoology Department,           WWW - http://evolve.zoo.ox.ac.uk/
  University of Oxford,         TEL - +44 1865 271261
  South Parks Road, Oxford, UK  FAX - +44 1865 271249
===================================================================

