# Kimura alignment

Ronald DeBry histone at acpub.duke.edu
Tue Feb 7 07:34:49 EST 1995

```
Even if you are basing your alignment on a Jukes-Cantor model, the
alignment procedure requires choosing among different gap placements.
Consider the following fragments from a larger alignment.

a  ATG--AGATG       a  ATGAG--ATG
b  ATG--AGATA       b  ATGAG--ATA
c  ATGACAGATC       c  ATGACAGATC
d  ATGACAGATC       d  ATGACAGATC
e  ATGAGAAATC       e  ATGAGAAATC
f  ATGAGAAATC       f  ATGAGAAATC

If the true tree groups (a,b) with (c,d), the first alignment makes more
sense, but if it instead groups (a,b) with (e,f), the second alignment
might be better.

The generation of insertions and deletions is an evolutionary process
that happens "on the tree" as it were.

>> = me
>  = Mark Siddall

>>of solutions have been proposed.  One can use an iterative reciprocal
>>process, where a trial alignment is used to give a trial phylogeny,

>Ah.  But will this not bias your final answer towards some tree(s)
>or islands of trees.

Quite possible.  I didn't say it was a perfect solution.  But, doesn't
picking an arbitrary alignment without reference to a tree also bias the
result?  If a different but equally valid alignment gives a different
most parsimonious tree, how do you propose to choose among them?

>>alternative that I know of is a series of papers by Jeff Thorne.  In his
>>method (essentially) all possible alignments are generated.  Each

>ALL????  I'm afraid I like to get alignment answers within the week!

Three points on this:

1) It's not quite as bad as you might be thinking. As I recall, Thorne
proposes examining only pairwise alignments, not all possible multiple
alignments.

2) It is quite possible that this is an approach whose time has not
yet arrived, due to computer limitations.  Given the rate at which even
PCs are improving, it might not be as long as you think before such
computationally demanding methods are practical.

3) How long did you spend generating the data? (including writing the
grant, collecting the specimens, generating the sequences)  Once you
have the alignment, how long until the paper is published?  Why limit
this one phase to a week, especially if the computer is doing the actual
work?

>>As someone who subscribes to the idea that phylogenetic inference is a
>>statistical problem and that maximum likelihood in some form is the best

>I don't but that too is another issue.

Which part of my statement do you disagree with?  Is phylogenetics not
statistical inference?  Or is it an inference problem but you think that
parsimony is a better estimator?

Ron

histone at acpub.duke.edu

```
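To make the gap-placement point in the message concrete, here is a minimal sketch (not from the original post) that compares sequence a against c and e under each of the two fragments shown above. The function names are hypothetical, gapped columns are simply skipped, and the standard Jukes-Cantor correction d = -(3/4) ln(1 - 4p/3) is applied to the observed proportion of differences p. With only eight comparable sites the numbers are illustrative, not real estimates.

```python
import math

# The two candidate alignments from the message, restricted to a, c and e.
# (b differs from a only in the last column; d and f are identical to c and e.)
ALIGNMENT_1 = {"a": "ATG--AGATG", "c": "ATGACAGATC", "e": "ATGAGAAATC"}
ALIGNMENT_2 = {"a": "ATGAG--ATG", "c": "ATGACAGATC", "e": "ATGAGAAATC"}


def count_differences(x, y):
    """Return (mismatches, compared_sites), skipping any column with a gap."""
    pairs = [(p, q) for p, q in zip(x, y) if p != "-" and q != "-"]
    mismatches = sum(p != q for p, q in pairs)
    return mismatches, len(pairs)


def jukes_cantor_distance(mismatches, sites):
    """Jukes-Cantor corrected distance from an observed mismatch count."""
    p = mismatches / sites
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distances_from_a(alignment):
    """Jukes-Cantor distance from sequence a to sequences c and e."""
    return {
        name: jukes_cantor_distance(
            *count_differences(alignment["a"], alignment[name])
        )
        for name in ("c", "e")
    }


if __name__ == "__main__":
    for label, aln in (("alignment 1", ALIGNMENT_1), ("alignment 2", ALIGNMENT_2)):
        for name, dist in distances_from_a(aln).items():
            print(f"{label}: d(a,{name}) = {dist:.3f}")
```

Under the first alignment, a differs from c at 1 of 8 compared sites and from e at 2 of 8; under the second alignment the counts are reversed. So the choice of gap placement alone decides whether (a,b) looks closer to (c,d) or to (e,f), which is the circularity the message describes.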