Making alignments

James McInerney jamm at nhm.ac.uk
Fri Jan 16 12:15:24 EST 1998


Jon,


It is an absolute MUST to use the amino acid sequences for the purposes of
alignment.  If you use DNA sequences, the alignment program may well introduce
gaps of one or two nucleotides, when in all probability any indels that
survived purifying selection would be multiples of three nucleotides (all
others would result in frameshift mutations, which are very unusual customers
in a functional coding sequence).
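
A quick way to see whether a nucleotide alignment has this problem is to list
every gap run whose length is not a multiple of three.  The following is only
a minimal sketch: the function name frameshift_gaps is made up for
illustration, gaps are assumed to be written as "-", and it checks gap
lengths only, not whether a gap starts on a codon boundary.

```python
import re

def frameshift_gaps(aligned_dna):
    """Return (start, length) for each gap run whose length is not a multiple of 3."""
    return [
        (m.start(), len(m.group()))
        for m in re.finditer(r"-+", aligned_dna)
        if len(m.group()) % 3 != 0
    ]

# One single-base gap (suspicious) and one three-base gap (plausible).
print(frameshift_gaps("ATG-CCTGG---AAA"))
```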

Alignment at the amino acid level has the benefit of being able to use
substitution models that are based on observed frequencies of substitution.
We know, for instance, that the substitution of an aromatic amino acid for
another aromatic amino acid is a much more likely event than a substitution
for a cysteine (say).  For DNA sequences, we might say that
transition substitutions are more likely than transversions and then choose to
weight the alignment accordingly, but most of the time, we cannot come up with
anything more complicated than that, with the result that parts of the
alignment that have a lot of length variation can be aligned very arbitrarily.
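
To make the DNA case concrete, here is a rough sketch of about the most
elaborate weighting one usually has to work with: a score that only
distinguishes matches, transitions and transversions.  The function names and
the numeric scores are arbitrary assumptions chosen for illustration, not
values taken from any particular program.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_type(a, b):
    """Classify a pair of nucleotides as match, transition or transversion."""
    a, b = a.upper(), b.upper()
    if a == b:
        return "match"
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition.
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    return "transversion"

def pair_score(a, b, match=1.0, transition=-0.5, transversion=-1.0):
    """Score one aligned pair; the default weights are illustrative only."""
    kind = substitution_type(a, b)
    return {"match": match, "transition": transition, "transversion": transversion}[kind]

print(substitution_type("A", "G"), pair_score("A", "G"))
print(substitution_type("A", "C"), pair_score("A", "C"))
```

With only three possible scores per column there is very little information
to decide where a gap should go, which is the arbitrariness described above.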

That is not to say that amino acid alignment does not have similar problems. 
Of course it does, but I'm sure it's safe to say that they are significantly reduced.

If you want to semi-automate this task, get a copy of clustalw (ftp.ebi.ac.uk)
and two programs from the Natural History Museum ftp site
(ftp://ftp.nhm.ac.uk/pub/gcua) called translfas and putgaps.  translfas will
translate the DNA sequences (in fasta format); after aligning the translated
sequences, you can use putgaps to insert gaps in the DNA sequences according
to where they are found in the amino acid alignment.  It's a bit of a
work-around.
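
The idea behind that work-around can be sketched in a few lines of Python.
This is not the code of translfas or putgaps (their internals are not
described here); translate and thread_gaps are hypothetical names, the
standard genetic code is assumed, the DNA is assumed to start in frame, and
gaps are assumed to be written as "-".

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate an in-frame DNA sequence; unknown codons become 'X'."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna), 3))

def thread_gaps(aligned_protein, dna):
    """Insert '---' into the DNA wherever the aligned protein has a gap."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    residues = sum(1 for ch in aligned_protein if ch != "-")
    if residues != len(codons):
        raise ValueError("protein and DNA lengths do not correspond")
    remaining = iter(codons)
    return "".join("---" if ch == "-" else next(remaining) for ch in aligned_protein)

dna = "ATGGCCTGGAAA"
print(translate(dna))              # protein to hand to the aligner
print(thread_gaps("MA-WK", dna))   # DNA with the protein gap threaded back in
```

Because every gap goes back in as a whole codon, the resulting nucleotide
alignment stays in frame by construction.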


my 0.02p-worth.

James



Dr. J.P. Clewley wrote:
> 
> What are the pros and cons of using in-frame nucleotide sequence alignments
> compared with using non in-frame nucleotide alignments for making
> phylogenetic trees?
> 

 
=========
James O. McInerney               email: J.mcinerney at nhm.ac.uk
Molec. Biol. Comput. Officer,    phone: +44 171 938 9247
Department of Zoology,           Fax:   +44 171 938 9158
The Natural History Museum,
Cromwell Road,                    
London SW7 5BD.                  
=========



