Outgroups and polarized character states

Brian Foley btf at t10.lanl.gov
Tue Nov 4 15:11:40 EST 1997


Andrew Mitchell wrote:
> How does the outgroup polarize the character states?  
> Quite simply, the node joining the outgroup to the ingroup 
> is the most basal node in your phylogeny.  If you insist 
> on looking at it on a character-by-character basis, then 
> that state found in the outgroup (assuming there is only 
> one outgroup taxon) is assumed to be primitive, thus 
> polarizing your states.

Which phylogenetic methods do this (treat the outgroup
as ancestral)?  Are you saying that the outgroup "should be" 
treated as primitive?  Or that it "is" treated as primitive by
a certain method?

If we have the following sequences:

          1          10         18
outgroup: gaa cat tac ggc ggg cca
species1: gaa caa atc ggc gga cct
species2: gat caa atc gac gga cct
species3: gat cag ttc gga ggg cgt

I would suspect that at position 3 the ancestor was "a"
as some of the ingroup as well as the outgroup have "a"
here.  However, at position 18, it is equally likely
that the ancestor was "t" and the outgroup mutated to
"a"; or the ancestor was "a" and the ingroup mutated
to "t" before divergence of the 3 ingroup species.
At position 12, the ancestor was most likely "c".
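
The site-by-site reasoning above can be sketched as a Fitch
parsimony pass over the four sequences.  This is only a rough
illustration, not a description of what any particular program
does.  The tree shape (outgroup,(species1,(species2,species3)))
is an assumption made for the example, since no tree is given
here, and the names SEQS, TREE, fitch_set and root_states are
made up for the sketch.

```python
# Aligned sequences from the example; codon spacing is stripped.
SEQS = {
    name: seq.replace(" ", "")
    for name, seq in {
        "outgroup": "gaa cat tac ggc ggg cca",
        "species1": "gaa caa atc ggc gga cct",
        "species2": "gat caa atc gac gga cct",
        "species3": "gat cag ttc gga ggg cgt",
    }.items()
}

# ASSUMED rooted topology (not given in the post), as nested pairs.
TREE = ("outgroup", ("species1", ("species2", "species3")))


def fitch_set(node, site):
    """Fitch down-pass: return the candidate state set for `node`
    at the 0-based alignment column `site`."""
    if isinstance(node, str):
        # Leaf: the observed base is the only candidate.
        return {SEQS[node][site]}
    left, right = (fitch_set(child, site) for child in node)
    common = left & right
    # Intersection if the children agree on something, else union.
    return common if common else left | right


def root_states(position):
    """Candidate ancestral bases at the root for a 1-based position."""
    return fitch_set(TREE, position - 1)


for pos in (3, 12, 18):
    print(pos, sorted(root_states(pos)))
```

With this assumed tree, the root set is a single base at
positions 3 ("a") and 12 ("c"), and both "a" and "t" at
position 18.  A different ingroup topology can change the
answer at position 3, which is part of why the tree matters.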

When PAUP generates an "ancestral" sequence, it does
not just give you the outgroup sequence.  I believe
(but I'm not 100% sure) that you can generate such
an ancestral sequence in PAUP using either a maximum
likelihood or a maximum parsimony method.

I do not know of any program which allows us to add
truly ancestral sequences (perhaps obtained from fossil
DNA; or in the case of a rapidly evolving species such
as HIV, from frozen blood samples) with dates, in order
to more accurately determine the ancestral states.

If anyone knows of the best method of estimating the
ancestral sequence, please let me know.  I'd like a method that
would not just put in an "n" or "?" at sites which cannot
be determined, but would choose one base or another for those
sites, perhaps siding with the outgroup some percentage of the
time, and with the ingroup the rest, or using protein/codon
information to guesstimate.
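
One way to read the "siding with the outgroup some percentage
of the time" idea is as a weighted coin flip at each ambiguous
site.  The sketch below is only that reading, not an existing
method: resolve_site and p_outgroup are hypothetical names, and
it assumes some earlier step has already produced a set of
candidate bases for the site.

```python
import random


def resolve_site(candidates, outgroup_base, p_outgroup=0.5, rng=random):
    """Pick one base from `candidates` (a set of equally good bases).

    With probability `p_outgroup` the outgroup's base is chosen,
    provided it is among the candidates; otherwise one of the
    remaining candidates is drawn at random.  `p_outgroup` is a
    hypothetical tuning knob, not a value taken from any program.
    """
    if len(candidates) == 1:
        # Unambiguous site: nothing to decide.
        return next(iter(candidates))
    # Sorted so the draw is reproducible for a seeded generator.
    others = sorted(candidates - {outgroup_base})
    if outgroup_base in candidates and rng.random() < p_outgroup:
        return outgroup_base
    return rng.choice(others)


# Position 18 of the example: outgroup has "a", ingroup has "t".
print(resolve_site({"a", "t"}, "a", p_outgroup=0.5))
```

Codon or protein information could replace the coin flip, for
instance by preferring the candidate that keeps the amino acid
seen in most sequences, but that is not attempted here.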

Another weak point in this type of analysis is deciding what to
do about insertions/deletions.  In my experience, PAUP
assumes that only deletions are possible: the "ancestor" has
the longest sequence, with both the outgroup and the ingroup
suffering deletions.  It looks to me like it takes complex
reasoning to decide whether the event was more likely to be
an insertion on one branch or a deletion on the other.

-- 
 ____________________________________________________________________
|Brian T. Foley               btf at t10.lanl.gov                       |
|HIV Database                 (505) 665-1970                         |
|Los Alamos National Lab      http://hiv-web.lanl.gov/index.html     |
|Los Alamos, NM 87544  U.S.A. http://www.t10.lanl.gov/~btf/home.html |
|____________________________________________________________________|



