# % IDENTITY QUESTION

Al Wang alwang at NOdoubtSPAM.com
Sun Jul 26 08:34:37 EST 1998

```
On 23 Jul 1998 21:44:41 GMT, tpwalsh at acer.gen.tcd.ie (Tom Walsh)
wrote:

>In article <35B7B542.2A3D5E5B at biochem.purdue.edu>,
>Chris Sinclair  <sinclair at biochem.purdue.edu> wrote:
>>O.K. this is a really simple one. I need to calculate the % identity of
>>some peptides (10 to 19 residues) to a much longer sequence (499 aa). If
>>the peptide is 10 aa long and has 3 exact matches do I divide by 10 (the
>>length of potential identity for peptide) or divide by the total length
>>of the protein. I personally vote for the first calculation but I've met
>>resistance from my fellow lab mates. Anyone care to clear up our debate?
>
> You divide by the length that the sequences are aligned over:
>
>
>    ARCDFRSGQSRL
>DEFGAKSDEHTPQTKVFTS    Alignment is 10 residues long Seqn ID=3/10=30%
>    *  *    *
>
>    ARCD--KVSGQSRL     Alignment is 12 residues long Seqn ID=3/12=25%
>DEFGAKSDGQRMTRQWKVMK
>    *  *      *
>
>    ARCDKVSGQSRL       Alignment is 10 residues long Seqn ID=3/10=30%
>DFGAAKSD---AQTKSHTRE
>    *  *    *
>
The length that sequences are aligned over isn't always obvious in a
global alignment, is it?  Sometimes you may want to consider stretches
of residues at the ends, sometimes you don't.  I think the choice of
which length to divide by depends on your protein of interest: if
you're comparing a set of proteins against protein A to rank how
similar each one is to it, divide by protein A's length.

Al Wang
http://www.seas.upenn.edu/~alwang