"Logic of Cladistics"

Bruce Rannala rannala at minerva.cis.yale.edu
Fri Jun 10 19:14:30 EST 1994


In article <2tb7ek$l8r at taco.cc.ncsu.edu> thorne at stat.ncsu.edu (Jeff Thorne) writes:
>From: thorne at stat.ncsu.edu (Jeff Thorne)
>Subject: Re: "Logic of Cladistics"
>Date: 11 Jun 1994 02:27:32 GMT


>>Realistic evolutionary models tend to be 
>>multivariate stochastic processes that defy analytical solution, and closed 
>>form expressions for the character state distributions (i.e., distribution 
>>functions or probability density functions) are generally unavailable. 

>The knowledge that a statistical model differs from reality is not unique
>to phylogenetics.  You will find that is true for virtually every
>application of parametric statistics.  If we believed that a model was
>perfect, we might call it "reality."  The more pertinent issue
>is robustness.  How do violations of the (implicit or explicit)
>model affect the inferences?  The hope is to gradually improve
>the understanding of evolution by testing assumptions and, if
>justified, replacing them. 

"Reality" is much more easily modeled in some cases than in others. 
Radioactive decay, to cite a rather overused example, is described very well 
by an exponential distribution, whereas the death process for humans is not. I 
am very much in favour of ML estimates in cases where we know something about 
the underlying distribution, but the models often become too simple in the 
interests of mathematical expediency. My suggestion that evolution is 
typically a multivariate stochastic process is based on a very simple 
single-locus infinite-allele neutral model with variable population size, 
hardly "realistic," yet still highly intractable. While I agree that 
"robustness" is a reasonable criterion for a phylogenetic method, I would 
point out that your method of evaluating the effects of particular violations 
of model assumptions is inductive, so the "robustness" of your estimator 
depends on how thorough you are in your investigation of possible 
perturbations. Many of these issues are, of course, still unresolved in 
statistics generally. How do we choose estimators? Unbiased? Consistent? 
Efficient? etc. Perhaps phylogenetic estimation theory must adopt some such 
standards for comparing methods. Meanwhile, it would seem to me that a 
statistical approach to phylogeny estimation requiring a less complete 
knowledge of the underlying character-state distribution would be most 
appropriate (I have no idea what that might be).    
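
A minimal Python sketch of the decay-versus-mortality contrast above, under 
stated assumptions: the Weibull lifetimes (scale 80, shape 5) are only a 
hypothetical stand-in for a death process whose hazard rises with age, not a 
fitted mortality model, and all function names are illustrative. It fits an 
exponential model by ML to both data sets and checks how well the fitted 
model reproduces the sample median. This is one perturbation among many, 
which is the sense in which such robustness checks are inductive.

```python
import math
import random


def exp_rate_mle(times):
    """ML estimate of an exponential rate: n / sum of observed times."""
    return len(times) / sum(times)


def sample_median(values):
    """Empirical median of a list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2:
        return ordered[mid]
    return 0.5 * (ordered[mid - 1] + ordered[mid])


def model_median_error(times):
    """Relative gap between the median implied by the fitted exponential
    model, ln(2) / rate, and the median actually observed in the sample."""
    implied = math.log(2) / exp_rate_mle(times)
    observed = sample_median(times)
    return abs(implied - observed) / observed


rng = random.Random(1994)  # fixed seed so the sketch is repeatable
n = 20000

# Case 1: the model is right. Waiting times really are exponential.
decay_like = [rng.expovariate(0.5) for _ in range(n)]

# Case 2: the model is wrong. Weibull with shape 5 has a hazard that
# increases with age (assumed stand-in for human mortality).
aging_like = [rng.weibullvariate(80.0, 5.0) for _ in range(n)]

print("relative median error, exponential data:", model_median_error(decay_like))
print("relative median error, aging-like data: ", model_median_error(aging_like))
```

The ML estimate itself is computed identically in both cases; only the 
comparison against a model-free summary (here the sample median) reveals 
that the exponential assumption fails for the second data set.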


>>        It is very telling that so few professional statisticians have 
>>ventured into the phylogenetics controversy (compare this with the field of 
>>theoretical population genetics which has attracted some of the most 
>>brilliant probabilists of this century: Bartlett, Feller, Karlin, Kolmogorov 
>>and Moran to name only a few). 

>There is no doubt that there have been and continue to be many
>brilliant population geneticists.  It should be realized though
>that the  "phylogenies from sequence data" field is much younger.
>How many of the above population geneticists were appreciated five
>decades ago?  We should get back to this in fifty years.

Bartlett, Feller and Kolmogorov were probably well appreciated fifty 
years ago, although I would guess that Karlin was still in grade school. 
I agree that the field is relatively young and much may happen in the next 
fifty years.

>Jeff Thorne, Program in Statistical Genetics
>North Carolina State University

Bruce Rannala, Department of Biology, Yale University
rannala at minerva.cis.yale.edu


