In discussing the two recent "mitochondrial Eve" reanalyses published in
SCIENCE (255:737-739), Dave Swofford gave a good description of the problems
encountered by parsimony programs in handling large data sets. However, as an
author of one of those reanalyses, I must respond to Dave's comment that our
reanalysis was "not nearly as thorough" as his (Maddison, Ruvolo, & Swofford,
in press). Although his paper has not yet been published, I have seen a copy
of the manuscript.
First, the conclusions of all three reanalyses are the same: parsimony
analysis of this data set cannot resolve the geographic origin of modern
humans. Second, the most-parsimonious tree length that we report (522) is
several steps shorter than those reported by Maddison et al. I recently
learned from David Maddison that the difference in tree length is apparently
due to the inclusion or exclusion of one or a few sites in the different
analyses. When he repeated his reanalysis with the same sites that we used, he
obtained most-parsimonious tree (mpt) lengths of 521 (this will be mentioned as a "note-added-in-proof"
in their paper). I am confident that even shorter trees can be found, if one
is actually interested in looking for them.
The approach taken by Maddison et al. was to do many searches and to
save only a few trees during each search. Our approach was to do 5 searches,
saving a very large number of trees (10,000) with each search. The advantage
of this approach is that shorter trees are encountered as the number of saved
trees builds up, with the count of 10,000 restarting each time. These two
different approaches with the same data set resulted in mpt's of identical or
nearly identical length and the same conclusion: no resolution. Thus I see no
evidence that one was more thorough than the other. Certainly Templeton's
reanalysis was not as thorough (100 mpts; each several steps longer) but he
came to the same conclusion nonetheless.
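As a rough illustration of the tree-saving scheme described above, the sketch below keeps a buffer of equally short trees and restarts the count whenever a strictly shorter tree turns up. It is only a toy under stated assumptions: the class name TreeBuffer, the method offer, and the use of plain strings as trees are all hypothetical, and this is not a description of how PAUP itself stores trees.

```python
class TreeBuffer:
    """Hypothetical buffer holding up to max_trees trees of the shortest length seen so far."""

    def __init__(self, max_trees=10000):
        self.max_trees = max_trees
        self.best_length = None   # shortest tree length seen so far
        self.trees = []           # saved trees, all of length best_length

    def offer(self, tree, length):
        """Try to save a tree; return True if it was kept."""
        if self.best_length is None or length < self.best_length:
            # A strictly shorter tree discards the longer ones,
            # so the count toward max_trees starts over.
            self.best_length = length
            self.trees = [tree]
            return True
        if (length == self.best_length
                and len(self.trees) < self.max_trees
                and tree not in self.trees):
            # An equally short, previously unseen tree is added to the buffer.
            self.trees.append(tree)
            return True
        # Longer trees, duplicates, and overflow are ignored.
        return False


buffer = TreeBuffer(max_trees=10000)
kept = [buffer.offer(name, length)
        for name, length in [("t1", 525), ("t2", 525), ("t3", 523), ("t4", 524)]]
```

In this toy run the two trees of length 525 are dropped as soon as the tree of length 523 is offered, and the later tree of length 524 is rejected.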
I should point out that the second analysis we presented (p. 738, Fig
1B) using the neighbor-joining method resulted in a single tree very quickly
(minutes), allowing the option of bootstrapping (2000 replications).
[Bootstrapping was not possible with parsimony because a single cycle of the
program could not be completed - i.e., all of the mpt's could not be found -
and our PAUP analysis took 2 weeks on a Silicon Graphics computer!]. Although
the bootstrap p-values on the nj-tree were low, that analysis did support an
African origin and did extract information from the data set that the
parsimony analysis could not - e.g., that all members of the !Kung tribe form
a single group.
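For readers unfamiliar with bootstrapping, the general idea can be sketched as follows: sites (alignment columns) are resampled with replacement, a tree is rebuilt from each pseudo-replicate, and the support for a group is the fraction of replicates in which that group appears. The code is a minimal sketch, not the analysis reported in the paper. The function names, the toy sequences, and the group_found callback are hypothetical, and the callback here uses a simple Hamming-distance comparison as a stand-in for building a neighbor-joining tree.

```python
import random


def resample_sites(alignment, rng):
    """Return a pseudo-replicate made by drawing columns with replacement."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}


def bootstrap_support(alignment, group_found, replicates=2000, seed=0):
    """Fraction of replicates in which group_found(replicate) is true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        if group_found(resample_sites(alignment, rng)):
            hits += 1
    return hits / replicates


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy data (hypothetical): A and B are identical, C differs at four sites.
toy = {"A": "ACGTACGT", "B": "ACGTACGT", "C": "TTGTACAA"}


def a_groups_with_b(aln):
    # Stand-in for tree building: A is at least as close to B as to C.
    return hamming(aln["A"], aln["B"]) <= hamming(aln["A"], aln["C"])


support = bootstrap_support(toy, a_groups_with_b, replicates=200)
```

In a real analysis the callback would build a tree from each pseudo-replicate and check whether the group of interest is present in it. That is practical only when a single tree can be obtained quickly, which is the point made above about the neighbor-joining analysis.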
I do not argue that parsimony is necessarily an inferior method of
analysis - it is quite powerful and useful in many cases (this is not one of
them) - only that systematists should be more open-minded about methods of
analysis. There are several very good methods available for analyzing DNA
sequences, and unyielding adherence to one method (maximum parsimony),
especially when it fails, is not healthy for systematics.
S. Blair Hedges
Institute of Molecular Evolutionary Genetics
Department of Biology
Penn State University, University Park, PA 16802