significance of alignments

Bill Pearson wrp at cyclops.micr.Virginia.EDU
Wed Apr 21 18:57:46 EST 1993

In article <9304210131.AA08023 at spider.ento.csiro.au> lizvanp at ento.csiro.au writes:
>I have an alignment between two proteins which covers a region of about 100
>residues. The proteins I am comparing are 200 and 240 residues in size, and
>the region that aligns has 31% identity, but only 20% similarity (according
>to MaxHom, EMBL at Heidelberg). Also the region that aligns contains a
>residue which is involved in the active site, but one of these proteins has
>no alignment in this particular area. Can anyone give me some idea whether
>this level of similarity has any real meaning?  Cheers Lis

	31% identity over 100 residues seems likely to be significant.
What is the local similarity (PAM250) score?  It is unclear why you
would have lower similarity than identity, since identical residues
normally also count as similar; the program that calculates
similarity must be requiring a "global" alignment - one that extends
from end to end.  This may not be appropriate.
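
The point about identity and similarity can be illustrated with a
minimal sketch.  It assumes both figures are counted over the same
gap-free columns of one alignment, and it uses made-up
conservative-substitution groups (GROUPS is hypothetical; MaxHom and
PAM250-based programs define similarity differently).  Under those
assumptions similarity can never fall below identity:

```python
# Hypothetical conservative-substitution groups, for illustration only.
# Real programs derive similarity from a scoring matrix such as PAM250.
GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def same_group(a, b):
    """True if residues a and b fall in the same illustrative group."""
    return any(a in g and b in g for g in GROUPS)


def identity_and_similarity(row1, row2):
    """Return (percent identity, percent similarity) over gap-free columns.

    row1 and row2 are the two rows of an alignment, with '-' for gaps.
    """
    cols = [(a, b) for a, b in zip(row1, row2) if a != "-" and b != "-"]
    if not cols:
        raise ValueError("no gap-free aligned columns")
    identical = sum(a == b for a, b in cols)
    # An identical pair always counts as similar as well.
    similar = sum(a == b or same_group(a, b) for a, b in cols)
    return 100.0 * identical / len(cols), 100.0 * similar / len(cols)


if __name__ == "__main__":
    print(identity_and_similarity("ACDEK-L", "ACEDRWL"))
```

If a program reports similarity below identity, the two numbers are
probably being computed over different regions or lengths.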

	To test the significance of the similarity score, you could
use the "rss" program, which compares the two sequences using the
Smith-Waterman algorithm, and then shuffles one of the sequences
repeatedly and generates similarity scores for the randomized
sequences.  "rss" is a derivative of "rdf2", which was described in
Pearson and Lipman (1988) PNAS 85:2444.  It is available with the
FASTA package, which you can obtain (for Unix and VMS machines) from
uvaarpa.virginia.EDU:pub/fasta.
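
The shuffling test can be sketched roughly as follows.  This is not
the "rss" code: it assumes a simple match/mismatch score in place of
PAM250, a linear gap penalty, and a z-score as the summary statistic.
The names sw_score and shuffle_test and all parameter values are
hypothetical.

```python
import random
import statistics


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local (Smith-Waterman) alignment score with a linear gap penalty.

    The match/mismatch values are illustrative stand-ins for a real
    scoring matrix such as PAM250.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Flooring at zero is what makes the alignment local.
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def shuffle_test(a, b, n_shuffles=200, seed=0):
    """Compare the real score with scores against shuffled copies of b.

    Returns (observed, mean, sd, z); z is None if the shuffled scores
    show no variation.
    """
    rng = random.Random(seed)
    observed = sw_score(a, b)
    letters = list(b)
    scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # keeps length and composition, destroys order
        scores.append(sw_score(a, "".join(letters)))
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    z = (observed - mean) / sd if sd > 0 else None
    return observed, mean, sd, z


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"
    print(shuffle_test(seq, seq, n_shuffles=50))
```

Shuffling preserves the length and composition of the sequence, so
the randomized scores show what an unrelated sequence of the same
composition would be expected to give.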

Bill Pearson

More information about the Mol-evol mailing list