# A philosophical question

Michael A. Charleston m.a.charleston at mail.utexas.edu
Fri May 5 19:31:41 EST 1995

```
I'm new to this, so forgive me for any clangers I may make.

Consider the following situation, which will be described very loosely
because otherwise I'll go on forever:

You have a phylogeny problem on n taxa, and the solution space consists of
the set of all (2n-5)(2n-7)...(3)(1) trees, which is an awful lot, and you
can generate a graph (= set of points connected by lines) using some
perturbation, say NNI (Nearest Neighbour Interchange), such that any two
trees are connected by an edge iff it's possible to go from one to the
other with a single such perturbation.
The objective function on this space is something like the parsimony
length or the log-likelihood.
So you can imagine "hills" and "troughs" in this graph, where the "height"
is proportional to the goodness - to be maximised - of the trees.
Hence - sorry if this seems stupid, since there are so many people using
parsimony - you're maximising this objective function.

Anyway - the situation is this: you have a great many trees in one big
hill, so that starting at any one of these trees and pacing through this
landscape by only going to trees which are better than the current one,
you get to the top of this broad hill.  Somewhere far removed from this
broad hill is a single spike, so there are many fewer starting trees (and
climbs) that would lead you to the top of this spike, but the objective
function value is higher at the top of this spike than it is at the top of
the broad hill.

My question is this: which tree - or trees - do you consider as "better"?

A while ago I asked Swofford this question and his immediate choice was
the tree(s) at the top of the spike, presumably since they optimise the
objective function and that must be the point of the whole exercise, but
if it's just a single seemingly aberrant tree, what then?  Where does one
draw the line?

My view is that there should (ideally) be some commonality between the
trees at the top of the hill and at the top of the spike, which is being
obscured by our choice of tree perturbation (and therefore tree
adjacency), and that by choosing more appropriate criteria for "closeness"
of two trees we may figure out what's going on.  This may be very naive of
me: I would appreciate the views of those reading this.

Knowing some properties of the "slopes" of this kind of graph would
clearly be useful in determining when this situation will arise, if ever,
but that's not the question.  If it *does* arise, what do you do?

Mike Charleston

```
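The scenario in the post can be tried out on a toy model. The sketch below is only an illustration under stated assumptions, not anything taken from the post: `num_unrooted_binary_trees` evaluates the (2n-5)(2n-7)...(3)(1) count quoted above, while `toy_score`, `neighbours`, `climb` and `basin_sizes` are hypothetical names for a made-up landscape in which a path of 100 points stands in for the tree graph, with a broad hill peaking at point 30 and a higher, narrow spike at point 90. The climber uses steepest ascent, which is one possible reading of "only going to trees which are better than the current one".

```python
def num_unrooted_binary_trees(n):
    """Return (2n-5)(2n-7)...(3)(1), the tree count quoted in the post."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for factor in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n-5
        count *= factor
    return count


def toy_score(i):
    """Made-up objective: a broad hill at 30 and a higher, narrow spike at 90."""
    broad = 100 - abs(i - 30)        # gentle slope, peak value 100
    spike = 150 - 40 * abs(i - 90)   # steep slope, peak value 150
    return max(broad, spike)


def neighbours(i, size):
    """Adjacent points on a path graph (a stand-in for NNI adjacency)."""
    return [j for j in (i - 1, i + 1) if 0 <= j < size]


def climb(start, size):
    """Steepest ascent: move to the best neighbour while it is strictly better."""
    current = start
    while True:
        best = max(neighbours(current, size), key=toy_score)
        if toy_score(best) <= toy_score(current):
            return current  # no better neighbour: a local optimum
        current = best


def basin_sizes(size):
    """Count how many starting points lead to each local optimum."""
    counts = {}
    for start in range(size):
        peak = climb(start, size)
        counts[peak] = counts.get(peak, 0) + 1
    return counts


if __name__ == "__main__":
    print("trees on 10 taxa:", num_unrooted_binary_trees(10))
    for peak, size in sorted(basin_sizes(100).items()):
        print(f"peak at {peak}: score {toy_score(peak)}, reached from {size} starts")
```

In this toy landscape most starting points end up on the lower, broad hill, and only the few starts near point 90 reach the higher spike, which is the situation the question is about. A different adjacency (a different `neighbours`) would change the basins without changing any score.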