"Logic of Cladistics"

Erich Schwarz schwarze at starbase1.caltech.edu
Sat Jun 11 18:02:36 EST 1994

In article <Cr8nAE.HHt at zoo.toronto.edu>, mes at zoo.toronto.edu (Mark Siddall) wrote:

> I am increasingly
> concerned with a tendency to crank off sequences, fire them into some
> phylogenetic software like PHYLIP, PAUP, MEGA or Hennig86 without
> full understanding of what's going on.  The argument can be made that
> it is an awkward, if not dangerous, thing to simply pump out a
> parsimony tree, a Fitch-Margoliash distance tree, a UPGMA, and whatever
> and publish them all side by side. There appears to be in some arenas
> a fundamental ignoring of the issues, as though phylogenetic investigation
> was so-much recipe work.  It is not.  There are issues that need to be
> addressed by all practitioners in their analyses regarding multiple
> trees, assumptions, defensibility of a chosen approach, and so much
> more...

   Good point.  So, how *does* one decide which approach is most defensible
for a given database of conceptual-proteins?  Especially when what one is
comparing are not whole proteins but *motifs* that may have been diverging
for literally over a billion years?
   As far as I can make out, everybody agrees that different methods
yielding different trees is a scary problem, and there is much harrumphing
in the reviews I've read that "users must understand the assumptions of the
different methods and rationally choose between them blah blah blah." 
Alas, the reviews aren't exactly forthcoming about just *what* algorithm
one is supposed to use to decide which method is indeed most appropriate to
one's case...
   In one of my own newly-discovered sequence motif families, "Calx-alpha",
one evolutionary model (that I happen to like) seems to be weakly supported
by protein parsimony, but partially contradicted by protein distance.  It's
far from obvious what I can do about that -- except, in fact, publish the
two trees side by side, showing bootstrap values for the nodes and
explaining why I think the parsimony tree is more intellectually
pleasing.  Yes, that's a tacky thing to do; but what else am I *supposed*
to do?  Build a time machine?  Wait 10 years, until the Genome Projects
pump out 20 times as many Calx-alpha motifs for me to crunch?  Selectively
publish the parsimony tree, relegating the distance tree to "data not
shown"?  Not discuss evolution at all?
   Molecular phylogeny shouldn't be "recipe work."  But, beyond a point, it
seems to be impossible for it not to be -- at least at this early stage.

--Erich Schwarz

More information about the Mol-evol mailing list