likelihood (?) inference from phylo. distribution

Arlin Stoltzfus arlin at is.dal.ca
Thu Mar 7 18:05:59 EST 1996


An argument about the meaning of a restricted phylogenetic distribution
has come up a handful of times in discussions about the origin of
spliceosomal introns, C-to-U RNA pan-editing, gRNA-mediated
pan-editing, rRNA "expansion segments" in eukaryotes, etc.  Each of
these features has a restricted phylogenetic distribution, but each has
been proposed to be a primitive feature inherited from an ancient
ancestor (usually in the "RNA world").

For instance, C-to-U pan-editing is restricted to angiosperm
mitochondria. In a review on this type of editing a few years ago, some
colleagues made what seemed to me to be a fair and obvious statement to
the effect that the proposal of an ancient origin could not be
excluded, but was not supported by any evidence.  One of the referees
objected strongly, saying that this was needlessly offensive to the
fine people who proposed an ancient origin, since the same status also
applied to a recent-origin explanation-- theoretically possible but
without evidence.  Apparently, the referee felt that the phylogenetic
distribution was irrelevant and was NOT unfavorable to an ancient
origin of C-to-U pan-editing followed by loss in genomes other than
those of angiosperm mitochondria.

I have seen a similar difference of opinion in other contexts.  My
intuition suggests that the referee's view is wrong, but I have not seen
a rigorous and valid general argument against it.  In the context of the
debate on spliceosomal introns, it has been claimed that an ancient
origin (followed by loss in archaebacteria and eubacteria, and complete
loss or severe diminution in many protists) is "unparsimonious".  This
invocation of "parsimony" is surely mistaken.  From a logical-parsimony
perspective, since the ancestral character state is at issue, one
cannot claim to be able to deduce on prior grounds that the intronless
(or intron-containing) state is an apomorph (or a plesiomorph).  If one
is inferring ancestral states without determining polarity (i.e., not
the "logical parsimony" attitude but the other kind), the inference is
invalid because one doesn't know that one has the correct weightings.

So, I don't think the parsimony line of argument holds water.  For the case
of spliceosomal introns, a recent origin posits 1 event of recent
intron origin, while an ancient origin would posit on the order of 25
events of genomic "streamlining" to remove introns.  But who is to say
that 1 event of one type is better than 25 events of a completely
different type?  To me, the reason that the phylogenetic distribution
of spliceosomal introns argues against an ancient origin is not that 1
gain is better than 25 losses, but that, of all possible ways of
distributing 25 losses on a complex phylogenetic tree, there are very
few of them that would result in the presence of the ancestral
character state in only a single clade (rather than being scattered
throughout the tree).  By contrast, of all possible ways of
distributing one event of gain on a tree, all of them will result in
the derived character state being present in only a single clade.
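
The counting contrast in the paragraph above can be tried out on a toy
tree.  What follows is only a rough sketch under assumptions that are
not part of the argument itself: a balanced binary tree, at most one
event per branch, irreversible losses, and every placement of events
counted as equally weighted (which is itself an extra assumption of
exactly the kind I would like to avoid).  The function names are
hypothetical and chosen for illustration.

```python
from itertools import combinations


def balanced_tree(depth):
    """Map each node to its children in a balanced binary tree (node 0 is the root)."""
    n_internal = 2 ** depth - 1
    n_total = 2 ** (depth + 1) - 1
    return {
        node: (2 * node + 1, 2 * node + 2) if node < n_internal else ()
        for node in range(n_total)
    }


def tips_below(children, node):
    """Return the set of tips descended from `node` (a tip is its own clade)."""
    kids = children[node]
    if not kids:
        return frozenset([node])
    return frozenset().union(*(tips_below(children, k) for k in kids))


def proper_clades(children):
    """Tip sets of every subtree except the whole tree; branches are named by their child node."""
    return {tips_below(children, v) for v in children if v != 0}


def single_clade_fraction_losses(children, n_losses):
    """Ancient origin: the root has the trait and `n_losses` branches each carry a loss.

    Returns the fraction of placements that leave the trait in exactly one
    proper clade (assumes every placement is weighted equally).
    """
    branches = [v for v in children if v != 0]
    clades = proper_clades(children)
    all_tips = tips_below(children, 0)
    hits = total = 0
    for losses in combinations(branches, n_losses):
        lost = frozenset().union(*(tips_below(children, v) for v in losses))
        retained = all_tips - lost
        total += 1
        hits += retained in clades
    return hits / total


def single_clade_fraction_gain(children):
    """Recent origin: the root lacks the trait and one branch carries a gain."""
    branches = [v for v in children if v != 0]
    clades = proper_clades(children)
    hits = sum(tips_below(children, v) in clades for v in branches)
    return hits / len(branches)


if __name__ == "__main__":
    tree = balanced_tree(3)  # 8 tips, 14 branches
    print("one gain:    ", single_clade_fraction_gain(tree))
    print("three losses:", single_clade_fraction_losses(tree, 3))
```

A single gain always yields a single clade by construction, so the
first fraction is 1.  The loss fraction depends on the tree shape and
on the number of losses, and the enumeration grows combinatorially, so
something like 25 losses on a realistic tree would need sampling
rather than exhaustive counting.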

This sounds like some kind of maximum likelihood argument.  Does anyone
have any clues about how to make this a rigorous algorithmic argument
for the case of a known tree and a known presence/absence character
state distribution, without making ANY other assumptions?  

Arlin Stoltzfus (arlin at is.dal.ca)
Department of Biochemistry
Dalhousie University
Halifax, NS  B3H 4H7 CANADA
ph. 902-494-3569; fax 902-494-3569




