Protein distance/similarity measure

mathog at seqaxp.bio.caltech.edu
Wed May 20 10:41:37 EST 1998


>	I am in need of a program to calculate pairwise 
>similarity scores between amino acid sequences.  I need the
>score to be in the form of % similarity.

Do you have GCG?  You might try, for instance, OLDDISTANCES, or even the first
(pairwise) phase of PILEUP.

>
>	PIMA is the closest I can find to what I want.  But
>it gives a score that is dependent on the sequence length and
>composition.

You can fix the composition dependence in pretty much all GCG programs by
using an appropriate comparison matrix.  As for length, well, that's
fundamental: any cumulative similarity/identity measurement will go up
with length.  OLDDISTANCES lets you correct for this to some extent by
dividing the score by one of the following:

                     1=length of the shorter sequence
                     2=length of the shorter sequence without gaps
                     3=Average length
                     4=Average length without gaps
                     5=Nothing
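
To make the five choices concrete, here is a rough Python sketch of that kind of normalization. It is not GCG code and not how OLDDISTANCES is implemented; the function names, the gap characters ("-" and "."), and the reading of "length" as the number of characters in each aligned sequence are all assumptions made for illustration.

```python
def seq_length(seq, count_gaps, gap_chars="-."):
    """Length of one aligned sequence, with or without gap characters.

    Assumption: gaps are written as '-' or '.'; adjust gap_chars as needed.
    """
    if count_gaps:
        return len(seq)
    return sum(1 for ch in seq if ch not in gap_chars)


def normalize_score(raw_score, seq_a, seq_b, mode):
    """Divide a cumulative pairwise score by the divisor selected by mode.

    Modes mirror the five menu choices listed above (hypothetical mapping):
      1 = length of the shorter sequence
      2 = length of the shorter sequence without gaps
      3 = average length
      4 = average length without gaps
      5 = nothing (score returned unchanged)
    """
    if mode == 5:
        return float(raw_score)
    if mode not in (1, 2, 3, 4):
        raise ValueError("mode must be an integer from 1 to 5")

    # Modes 1 and 3 count gap characters; modes 2 and 4 ignore them.
    count_gaps = mode in (1, 3)
    len_a = seq_length(seq_a, count_gaps)
    len_b = seq_length(seq_b, count_gaps)

    # Modes 1 and 2 use the shorter sequence; modes 3 and 4 use the average.
    if mode in (1, 2):
        divisor = min(len_a, len_b)
    else:
        divisor = (len_a + len_b) / 2

    if divisor == 0:
        raise ValueError("cannot normalize by a zero length")
    return raw_score / divisor


if __name__ == "__main__":
    # Toy sequences and a made-up raw score, purely for illustration.
    a, b = "ACD-EFG", "ACDKE"
    for m in range(1, 6):
        print(m, normalize_score(33, a, b, m))
```

Dividing by the shorter sequence keeps a short sequence that matches well inside a long one from being penalized for the unmatched overhang, while dividing by the average length treats both sequences symmetrically.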

Regards,

David Mathog
mathog at seqaxp.bio.caltech.edu
Manager, sequence analysis facility, biology division, Caltech 



