distance between amino acids and sequences

Jerry Learn learn at u.washington.edu
Fri Apr 16 20:51:34 EST 1999


[[ This message was both posted and mailed: see
   the "To," "Cc," and "Newsgroups" headers for details. ]]

In article <3716644A.19B at lanl.gov>, Brian Foley <btf at lanl.gov> wrote:

> Lev Zhivotovsky wrote:
> > 
> >...
> > Is there any computer program to get the distance matrix 
> > for a set of amino acid sequences ?
> >...
> 
>         The PHYLIP programs, such as PROTDIST do this.
> 
> http://evolution.genetics.washington.edu/phylip/
> 
>         You can choose from different distance measurements
> such as PAM250 or Chemical Categories.
> 
>         The HIV Database has tools for calculating synonymous
> vs nonsynonymous mutation rates at:
> 
> http://hiv-web.lanl.gov/SNAP/WEBSNAP/SNAP.html
> 
> The syn:nonsyn ratio can give an idea of the selective
> pressure on a coding region of DNA.  If all the observed
> mutations are silent, it indicates that the amino acid
> sequence must be conserved in order to maintain function.
> If there is a high rate of nonsyn (amino acid changing)
> mutations, it indicates the protein is being selected for
> change, such as the HIV envelope protein mutating to evade 
> the host immune system.
> 
>         Oftentimes, DNA distances and protein distances
> are not the same.  So it is wise to look at both, if you are
> making phylogenetic inferences.

Brian,

I'm afraid that I must disagree here. Unless you are seeing substantial
saturation at the nucleotide level, nucleotide sequence data are most
likely better for inferring phylogenetic relationships among sequences
(perhaps among species, but that is a different issue, isn't it?). I
can't say precisely how much saturation it takes before the results
become misleading. Maybe someone out there can add this, but I suspect
that it depends on how rich a sample of taxa one is dealing with.
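
To make "saturation" concrete, here is a minimal sketch in Python. It
assumes the simplest substitution model (Jukes-Cantor), which is my choice
for illustration and not something PROTDIST or any other program mentioned
in this thread is claimed to use. The function names are hypothetical.
Under that model the observed proportion of differing sites p cannot be
expected to exceed 0.75, and the corrected distance grows without bound as
p approaches that limit, so small sampling errors in p turn into large
errors in the estimated distance.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def jukes_cantor(p):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3).

    Returns infinity when p >= 0.75, the point at which the sequences
    are fully saturated under this model.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

if __name__ == "__main__":
    # The correction is small for similar sequences and explodes near 0.75.
    for p in (0.05, 0.25, 0.50, 0.70, 0.74):
        print(f"observed p = {p:.2f}  corrected d = {jukes_cantor(p):.3f}")
```

Where the correction becomes unreliable in practice depends on sequence
length and taxon sampling, which this sketch does not model.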

If one gets different answers from protein and nucleotide data, that
can tell you that something interesting is going on. My problem with
protein distance is that I have doubts about how well it calibrates
with time. Regardless of how much faith one wants to put in a molecular
clock, at least the models dealing with nucleotide evolution scale
better with time.
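
As a rough illustration of how the two kinds of data can disagree, the
sketch below translates two made-up coding sequences with the standard
genetic code and compares the uncorrected (p) distance at the nucleotide
and protein levels. The sequences and function names are hypothetical,
and a p-distance is only a stand-in for the model-based distances one
would actually use. The point is simply that synonymous changes add to
the DNA distance while leaving the protein distance untouched.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate an in-frame, ungapped DNA string; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def p_distance(seq_a, seq_b):
    """Proportion of differing positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

if __name__ == "__main__":
    # Made-up example: every difference between the two is synonymous.
    seq1 = "ATGGCTCTGAAA"
    seq2 = "ATGGCCTTAAAG"
    print("DNA p-distance:    ", p_distance(seq1, seq2))
    print("protein p-distance:", p_distance(translate(seq1), translate(seq2)))
```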

Jerry Learn

Research Associate

Health Sci. Ctr., Rm. K443-C      |
Dept. of Microbiology             | Learn at u.washington.edu
University of Washington          | Phone: (206) 616-4286
Box 357740                        |   FAX: (206) 616-1575     
Seattle, WA  98195-7740  USA      |



