In your posting, you said (and I paraphrase)
that maximum parsimony requires that all
sites evolve under identical processes.
Chris Tuffley and Mike Steel have
recently proved a result that is
relevant to this discussion:
if the sites evolve under independent
processes, but not necessarily under
any common mechanism, then maximum
parsimony and maximum likelihood are
exactly equivalent, which is to say
that the rank ordering on the set of
leaf-labelled tree topologies induced
by MP is the same as the rank ordering
induced by ML.
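To make "the rank ordering induced by MP" concrete, here
is a minimal Python sketch that scores the three possible
topologies on four taxa by parsimony and sorts them. It
uses Fitch's algorithm, a standard way to count the
minimum number of substitutions; it is not taken from the
Tuffley and Steel paper. The function names and the toy
alignment are made up purely for illustration.

```python
# Illustrative sketch only: names and data are hypothetical.
# A tree is a leaf name (str) or a pair (left, right) of subtrees.
# The Fitch count does not depend on where the tree is rooted.

def fitch_site(tree, states):
    """Return (candidate state set, substitution count) for one site."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children agree on some state: no extra substitution.
        return common, left_cost + right_cost
    # The children disagree: one substitution is needed here.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, alignment):
    """Sum the Fitch substitution counts over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        states = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, states)[1]
    return total

# Toy alignment, invented for this example.
alignment = {"A": "AACG", "B": "AACT", "C": "GGCG", "D": "GGCT"}

topologies = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}

scores = {name: parsimony_score(t, alignment) for name, t in topologies.items()}
ranking = sorted(scores, key=scores.get)  # best (fewest changes) first
print(scores)
print(ranking)
```

On this toy data the scores are 4, 5 and 6 for AB|CD,
AC|BD and AD|BC respectively, so MP ranks the topologies
in that order. The Tuffley and Steel result says that ML
under the no-common-mechanism model ranks them the same way.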
What is meant by "no common mechanism" is that
every pair (edge, site) has its own
probability of substitution, and that given that
a substitution occurs, the probability of
change between every pair of nucleotides is
the same. I suspect this model is more
biologically realistic than the usual
Jukes-Cantor model, which requires that the
sites evolve under identical processes.
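As a rough illustration of the model just described, here
is a small Python sketch that evolves a sequence along a
single edge, with a separate substitution probability for
each site on that edge. The function name, the probability
values and the use of a seeded random generator are my own
assumptions for the example, not anything from the paper.

```python
import random

NUCLEOTIDES = "ACGT"

def evolve_edge(parent, sub_probs, rng):
    """Evolve `parent` along one edge (hypothetical helper).

    sub_probs[i] is the substitution probability for site i on this
    edge; under "no common mechanism" it may differ for every
    (edge, site) pair. Given that a substitution occurs, each of the
    three other nucleotides is equally likely.
    """
    child = []
    for state, p in zip(parent, sub_probs):
        if rng.random() < p:
            others = [n for n in NUCLEOTIDES if n != state]
            child.append(rng.choice(others))
        else:
            child.append(state)
    return "".join(child)

rng = random.Random(0)  # seeded only so the run is repeatable
parent = "ACGTACGT"
# Made-up per-site probabilities for this one edge.
sub_probs = [0.0, 0.05, 0.3, 0.01, 0.5, 0.0, 0.2, 0.1]
print(evolve_edge(parent, sub_probs, rng))
```

Setting every entry of sub_probs to the same value on a
given edge recovers the usual situation in which all sites
evolve under an identical process.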
Under this general model, it may not be
possible in all cases to obtain the true
tree, even given infinite data, no matter
what method is used (as was shown in earlier
papers by other authors), so that even
ML can be "inconsistent".
Thus, under this more biologically realistic
model of site substitution, MP is the same as
ML, and is consequently consistent on exactly
the set of trees on which ML is consistent.
The results (originally by Felsenstein and
observed by others) that MP is not consistent
under the Jukes-Cantor model of evolution do
not contradict this. If it is possible to
constrain the space of model trees to only
those that have iid site evolution, then ML
can select the correct tree (as can distance
methods using corrected distances). Without
such constraints, even ML can be
inconsistent, even if the general
properties of the model are known.
In a basic sense, this result does give
maximum parsimony a "statistical basis",
and the question may really come down to
figuring out what the properties of
real biological data are likely to be,
so that for particular data sets the
appropriate methods can be selected.
At the same time, to the extent that
the objective is more than just getting
the tree topology, ML will always offer
things that MP cannot. But for those
people who primarily seek the tree
topology, MP may be the "right" way to
go, unless additional properties of the
evolutionary process can be inferred
that narrow the search space (and hence
make MP not equivalent to ML).
I suggest that you see the Tuffley and
Steel paper:
Chris Tuffley and Michael Steel,
"Links between maximum likelihood and
maximum parsimony under a simple model
of site substitution"
Bulletin of Mathematical Biology,
59(3), 581-607, 1997.
Tandy Warnow
University of Pennsylvania
Department of Computer and Information Science
tandy at central.cis.upenn.edu