In article <2tb7ek$l8r at taco.cc.ncsu.edu> thorne at stat.ncsu.edu (Jeff Thorne) writes:
>From: thorne at stat.ncsu.edu (Jeff Thorne)
>Subject: Re: "Logic of Cladistics"
>Date: 11 Jun 1994 02:27:32 GMT
>>Realistic evolutionary models tend to be
>>multivariate stochastic processes that defy analytical solution, and closed
>>form expressions for the character state distributions (i.e., distribution
>>functions or probability density functions) are generally unavailable.
>The knowledge that a statistical model differs from reality is not unique
>to phylogenetics. You will find that is true for virtually every
>application of parametric statistics. If we believed that a model was
>perfect, we might call it "reality." The more pertinent issue
>is robustness. How do violations of the (implicit or explicit)
>model affect the inferences? The hope is to gradually improve
>the understanding of evolution by testing assumptions and, if
>justified, replacing them.
"Reality" is much more easily modeled in some cases than in others.
Radioactive decay, to cite a rather overused example, is described very well
by an exponential distribution, whereas the death process for humans is not. I
am very much in favour of ML estimates in cases where we know something about
the underlying distribution, but the models often become too simple in the
interests of mathematical expediency. My suggestion that evolution is
typically a multivariate stochastic process is based on a very simple
single-locus infinite-allele neutral model with variable population size,
hardly "realistic," yet still highly intractable. While I agree that
"robustness" is a reasonable criterion for a phylogenetic method, I would
point out that your method of evaluating the effects of particular violations
of model assumptions is inductive, so the "robustness" of your estimator
depends on how thorough you are in your investigation of possible
perturbations. Many of these issues are, of course, still unresolved in
statistics generally. How do we choose estimators: by unbiasedness,
consistency, efficiency, etc.? Perhaps phylogenetic estimation theory must adopt some such
standards for comparing methods. Meanwhile, it would seem to me that a
statistical approach to phylogeny estimation requiring a less complete
knowledge of the underlying character-state distribution would be most
appropriate (I have no idea what that might be).
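
A minimal sketch of the estimator question, assuming the simplest case
mentioned above, an exponential model. The ML estimate of the rate is
n / sum(x); for n > 1 its expectation is n/(n-1) times the true rate, so it
is biased in small samples yet consistent as n grows. The function names,
seed, and parameter values below are illustrative assumptions only.

```python
import random
import statistics


def ml_rate(sample):
    """ML estimate of the rate of an exponential distribution: n / sum(x)."""
    return len(sample) / sum(sample)


def mean_ml_rate(true_rate, n, replicates, rng):
    """Average the ML estimate over many simulated samples of size n."""
    estimates = [
        ml_rate([rng.expovariate(true_rate) for _ in range(n)])
        for _ in range(replicates)
    ]
    return statistics.fmean(estimates)


# Hypothetical settings chosen only for illustration.
rng = random.Random(1994)
true_rate = 2.0

# Small samples: the average estimate should sit near n/(n-1) * true_rate.
small = mean_ml_rate(true_rate, n=5, replicates=20000, rng=rng)

# Large samples: the bias factor n/(n-1) is close to 1.
large = mean_ml_rate(true_rate, n=500, replicates=2000, rng=rng)

print(f"true rate          : {true_rate}")
print(f"mean ML est., n=5  : {small:.3f}")
print(f"mean ML est., n=500: {large:.3f}")
```

The same estimator thus looks poor by one criterion (unbiasedness) and
acceptable by another (consistency), which is the difficulty in choosing a
single standard for comparing methods.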
>> It is very telling that so few professional statisticians have
>>ventured into the phylogenetics controversy (compare this with the field of
>>theoretical population genetics which has attracted some of the most
>>brilliant probabilists of this century: Bartlett, Feller, Karlin, Kolmogorov
>>and Moran to name only a few).
>There is no doubt that there have been and continue to be many
>brilliant population geneticists. It should be realized though
>that the "phylogenies from sequence data" field is much younger.
>How many of the above population geneticists were appreciated five
>decades ago? We should get back to this in fifty years.
Bartlett, Feller and Kolmogorov were probably well appreciated fifty
years ago, although I would guess that Karlin was still in grade school.
I agree that the field is relatively young and much may happen in the next
fifty years.
>Jeff Thorne, Program in Statistical Genetics
>North Carolina State University
Bruce Rannala, Department of Biology, Yale University
rannala at minerva.cis.yale.edu