? about DNA alignment

Frank Wright mbfw at s-crim1.dl.ac.uk
Thu Jun 24 10:25:40 EST 1993


Jeremy John Ahouse (ahouse at hydra.rose.brandeis.edu) asked:

>    I have done a series of multiple alignments.  The alignments were done
> with inferred amino acid sequences.  Now that I am happy with the
> alignments I want to go back to the mRNA sequence (which I have) for some
> of the clustering and parsimony analysis.  I want to enforce the alignments
> (gaps, etc...) from the aa's on the nucleotide alignment.

Several people (Steve Thompson, Fernando Gonzales, Craig Marshall, and Bill
Pearson) have offered or suggested software to produce a nucleic acid
alignment based on the alignment of the protein sequences.
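
For readers who want to see what such a tool does, here is a minimal Python
sketch of the idea (not any of the programs offered above).  The function name
thread_codons and the toy sequences are hypothetical.  It assumes each coding
sequence is in frame, has no stop codon, and has exactly one codon per residue
of its aligned protein.

```python
def thread_codons(aligned_protein, cds, gap="-"):
    """Project the gaps of an aligned protein sequence onto its coding sequence."""
    # Split the unaligned coding sequence into codons.
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) % 3 != 0 or len(codons) != n_residues:
        raise ValueError("coding sequence does not match the protein sequence")
    codon_iter = iter(codons)
    # Each residue consumes the next codon; each protein gap becomes three
    # nucleotide gap characters.
    return "".join(gap * 3 if aa == gap else next(codon_iter)
                   for aa in aligned_protein)


# Toy data (hypothetical): two aligned proteins and their coding sequences.
aligned = {"seqA": "MK-L", "seqB": "MKTL"}
cds = {"seqA": "ATGAAACTG", "seqB": "ATGAAGACCCTG"}

nuc_alignment = {name: thread_codons(aligned[name], cds[name])
                 for name in aligned}
for name, seq in nuc_alignment.items():
    print(name, seq)
```

Because every protein gap becomes exactly three nucleotide gaps, the codon
positions stay in register across the whole nucleotide alignment.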

Steve Thompson also suggested actually constructing the phylogenetic tree
from the protein sequence alignment (using the PHYLIP package).  This seems
like a very good idea, if most of the phylogenetic information is contained
in the first two codon positions.  This would be the case if the sequences
had diverged considerably - in such a case the third codon position would be 
saturated and would contain mainly noise.  The fact that the sequences were
aligned using the protein sequences rather than the mRNA sequences suggests
that this may be the case.

  Using the mRNA sequences to construct the tree is also okay, as long as
you check that the relative rates of change at the three codon positions are
not markedly different.  Most phylogenetic tree construction methods assume
that the evolutionary rate does *not* vary along the sequence.  One could try
constructing a tree based on only codon positions 1 and 2 (e.g. by using
the WEIGHTS option in the PHYLIP package to restrict the analysis to codon
positions 1 and 2) and compare it with a tree based on the third codon
position (similarly using the WEIGHTS option in PHYLIP).  If the third position is
indeed saturated, then a tree based on the third codon position is unlikely
to have significant structure.
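
The 0/1 weight patterns for these two analyses could be generated along the
following lines.  This is only a sketch: the function name
codon_position_weights is hypothetical, and it assumes the alignment starts at
codon position 1 and stays in frame (gaps in multiples of three).  How the
weights are actually supplied to a program depends on the PHYLIP version, so
check its documentation.

```python
def codon_position_weights(n_sites, keep_positions):
    """Return one 0/1 weight per site, keeping only the given codon positions (1-3)."""
    # Site index i (0-based) falls at codon position (i % 3) + 1.
    return "".join("1" if (i % 3) + 1 in keep_positions else "0"
                   for i in range(n_sites))


n_sites = 12  # length of the nucleotide alignment (toy value)

w12 = codon_position_weights(n_sites, {1, 2})  # first and second positions only
w3 = codon_position_weights(n_sites, {3})      # third positions only

print(w12)
print(w3)
```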

 The WEIGHTS option in the PHYLIP package is very useful.  By running three
separate DNADIST analyses, one for each codon position, one can find out
how much change has occurred at each position.
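
As a quick check before running PHYLIP, the per-position comparison can be
roughed out in Python.  The sketch below computes the uncorrected proportion of
differing sites (p-distance) at each codon position for one pair of aligned
sequences.  This is not the corrected distance that DNADIST reports, and it
will underestimate change at saturated sites.  The function name and the toy
sequences are hypothetical, and the alignment is assumed to be in frame.

```python
def p_distance_by_codon_position(seq1, seq2, gap="-"):
    """Proportion of differing sites at codon positions 1, 2 and 3."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    diffs = [0, 0, 0]
    compared = [0, 0, 0]
    for i, (a, b) in enumerate(zip(seq1, seq2)):
        if a == gap or b == gap:
            continue  # skip sites where either sequence has a gap
        pos = i % 3  # 0, 1, 2 correspond to codon positions 1, 2, 3
        compared[pos] += 1
        if a != b:
            diffs[pos] += 1
    # None marks a codon position with no comparable sites.
    return [d / n if n else None for d, n in zip(diffs, compared)]


# Toy pair (hypothetical) differing only at third codon positions.
result = p_distance_by_codon_position("ATGAAACTG", "ATGAAGCTA")
print(result)
```

If the third-position value is much larger than the other two across many
pairs of sequences, that is a sign the rates differ markedly between positions.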

Frank Wright
SASS, University of Edinburgh,
J.C.M.B. 3610, Kings Buildings,
Edinburgh EH9 3JZ, Scotland, U.K.

frank at sass.sari.ac.uk



