ML question

Joe Felsenstein joe at evolution.genetics.washington.edu
Fri Sep 20 00:55:13 EST 1996


In article <51rn4t$99k at mserv1.dl.ac.uk>,
James O. McInerney PhD <j.mcinerney at nhm.ac.uk> wrote:
>
>Is there a way of calculating a likelihood topology for a gene where
>there is more than one model of sequence evolution.  For instance some
>parts of the gene are evolving with a higher transition/transversion
>ratio than others (say, stems in a tRNA as opposed to loops).  So you
>want to incorporate all of this information into your model.
>
>Categorising sites alone is not quite what I have in mind.  I want to
>categorise a site and evaluate the probability of observing this site
>for model X (say) given a particular topology and branch lengths. And
>for sites from the other category I want to slightly change the model
>and evaluate the probability of observing this site for model Y (say). 
>The likelihood of the tree then becomes the product of all of these
>likelihoods (even though the likelihoods were derived from different
>models).
...
>Is this possible?

Yes, using Hidden Markov Model technology.  This has been used by Ziheng Yang
in some papers, and by me and Gary Churchill last January in MBE.  We did it
for models that differed in rate of evolution, but commented that it could
also be done for models that differed in other ways, such as different base
compositions.

I _think_ you want to allow the program to infer where the different models
are.  HMMs actually don't do that, but something better -- they add up the
likelihood over all possible assignments of models to sites, weighting these
properly with respect to an assumed "hidden" model that assigns processes to
sites.  Thus one can assert that processes come in patches but not specify
where the patches are to be.
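
A minimal sketch of that summation, not taken from any of the programs
mentioned in this thread.  It assumes the per-site likelihoods under each
model have already been computed for a fixed tree and branch lengths, and
that the "hidden" model is a first-order Markov chain along the sequence.
The names hmm_log_likelihood, site_liks, init and trans are hypothetical.

```python
import math
from itertools import product


def hmm_log_likelihood(site_liks, init, trans):
    """Log-likelihood summed over all assignments of models to sites.

    site_liks[i][k]: likelihood of site i under model k (assumed precomputed
                     for one fixed tree and set of branch lengths)
    init[k]:         probability that the first site uses model k
    trans[j][k]:     probability that the next site uses model k, given that
                     the current site uses model j
    """
    n_models = len(init)
    # Forward vector: joint probability of the data so far and the current model.
    f = [init[k] * site_liks[0][k] for k in range(n_models)]
    log_lik = 0.0
    for i in range(1, len(site_liks)):
        # Rescale at each site so long sequences do not underflow.
        scale = sum(f)
        log_lik += math.log(scale)
        f = [x / scale for x in f]
        f = [
            sum(f[j] * trans[j][k] for j in range(n_models)) * site_liks[i][k]
            for k in range(n_models)
        ]
    return log_lik + math.log(sum(f))


def brute_force_log_likelihood(site_liks, init, trans):
    """Same quantity by explicit enumeration; only feasible for a few sites."""
    n_models = len(init)
    total = 0.0
    for path in product(range(n_models), repeat=len(site_liks)):
        p = init[path[0]] * site_liks[0][path[0]]
        for i in range(1, len(path)):
            p *= trans[path[i - 1]][path[i]] * site_liks[i][path[i]]
        total += p
    return math.log(total)


# Made-up numbers: two models, four sites, "patchy" transitions.
site_liks = [[0.02, 0.005], [0.01, 0.03], [0.04, 0.001], [0.002, 0.02]]
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
print(hmm_log_likelihood(site_liks, init, trans))
print(brute_force_log_likelihood(site_liks, init, trans))
```

The two functions should agree; the first does the sum in time proportional
to the number of sites, the second enumerates every assignment.  Large
diagonal entries in trans are one way to express that processes come in
patches without saying where the patches are.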

The only computer programs I know that do this currently are Yang's PAML
and my DNAML and DNAMLK from PHYLIP, but in both cases this is only for
variation in rates of evolution.

-- 
Joe Felsenstein         joe at genetics.washington.edu     (IP No. 128.95.12.41)
 Dept. of Genetics, Univ. of Washington, Box 357360, Seattle, WA 98195-7360 USA


