rate variation in ML models
Mon Oct 13 09:59:59 EST 1997
Joe Felsenstein wrote:
> If there is no autocorrelation of rates among adjacent sites, the
> model is still i.i.d. (independent and identically distributed). But
> if, as is allowed in my DNAML and Yang's PAML, there is some autocorrelation
> among sites, then the model isn't i.i.d. This affects, for example,
> the validity of bootstrapping.
I had reached somewhat different conclusions about rate variation among sites
in the usual implementations of the ML method. I understand that Felsenstein
and Churchill's (1996) hidden Markov model relaxes the independence assumption
for rates among sites, since some autocorrelation is assumed. However, my
opinion is that the i.i.d. assumption still holds under Yang's approach. The way
I see it, the present-day character states of any site are generated under
Yang's model assumptions as follows:
1- randomly draw a rate value from a (possibly discretized) Gamma distribution
2- make the site evolve according to the tree, branch lengths, other
substitution parameters (TI/TV ratio, etc.) AND the rate drawn in step 1
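
The two steps above can be sketched in a few lines of Python. This is only an
illustration under assumptions of my choosing, not PAML's or DNAML's code: the
substitution model is Jukes-Cantor (so there is no TI/TV parameter), the Gamma
is continuous with mean 1, and the tree, the taxon names and the helper names
(evolve, simulate_site) are all hypothetical.

```python
import math
import random

BASES = "ACGT"

# Hypothetical rooted tree: (name, branch_length, children).
TREE = ("root", 0.0, [
    ("anc", 0.1, [("sp1", 0.2, []), ("sp2", 0.3, [])]),
    ("sp3", 0.4, []),
])

def evolve(state, length, rate, rng):
    """Evolve one base along a branch under Jukes-Cantor, scaled by the site rate."""
    p_change = 0.75 * (1.0 - math.exp(-4.0 / 3.0 * rate * length))
    if rng.random() < p_change:
        return rng.choice([b for b in BASES if b != state])
    return state

def simulate_site(tree, alpha, rng):
    """Step 1: draw a rate. Step 2: evolve the site down the tree with that rate."""
    rate = rng.gammavariate(alpha, 1.0 / alpha)  # shape alpha, scale 1/alpha: mean 1
    tips = {}

    def descend(node, state):
        name, length, children = node
        state = evolve(state, length, rate, rng)
        if not children:
            tips[name] = state
        for child in children:
            descend(child, state)

    descend(tree, rng.choice(BASES))  # root state drawn from uniform frequencies
    return rate, tips

rng = random.Random(1997)
# Every site goes through the same procedure, with its own independent draws.
sites = [simulate_site(TREE, alpha=0.5, rng=rng) for _ in range(5)]
```

Each call to simulate_site uses the same distribution and does not look at any
other site, which is the sense in which the sites are i.i.d. in this sketch.
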
This "process" is assumed to be the same for all sites, since sites are not
assigned to a given rate category when the likelihood is computed. Therefore,
the "identically distributed" assumption seems to hold here. The likelihood of
a data set is computed by multiplying the likelihoods of the individual sites,
suggesting that the "independent" assumption also holds.
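
To make the likelihood argument concrete, here is a minimal sketch for the
simplest possible case: two sequences separated by a distance t, under
Jukes-Cantor. The category rates in RATES are illustrative values with mean 1
that I made up, not rates computed from a discretized Gamma, and the function
names are hypothetical. Each site likelihood is an average over the rate
categories, and the data likelihood is the product over sites (a sum of logs).

```python
import math

RATES = [0.1, 0.4, 1.0, 2.5]  # hypothetical category rates, equal weights, mean 1

def site_likelihood(x, y, t, rates=RATES):
    """Likelihood of one site: the same mixture over rate categories for every site."""
    total = 0.0
    for r in rates:
        e = math.exp(-4.0 / 3.0 * r * t)
        # Jukes-Cantor transition probability over distance r * t
        p = 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e
        total += 0.25 * p / len(rates)  # 0.25 = equilibrium frequency of x
    return total

def log_likelihood(seq1, seq2, t):
    """Log-likelihood of the alignment: sum of per-site log-likelihoods."""
    return sum(math.log(site_likelihood(x, y, t)) for x, y in zip(seq1, seq2))

ll = log_likelihood("ACGTAC", "ACGAAC", 0.3)
ll_reversed = log_likelihood("CATGCA", "CAAGCA", 0.3)  # same sites, reversed order
```

Because no site is tied to a particular category and the per-site terms are
simply multiplied, reordering the sites leaves the likelihood unchanged in this
sketch. That would not be true of a model with autocorrelated rates.
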
Is there something wrong here?
Nicolas Galtier
