>>>O.K., this is a really simple one. I need to calculate the % identity of
>>>some peptides (10 to 19 residues) to a much longer sequence (499 aa). If
>>>the peptide is 10 aa long and has 3 exact matches, do I divide by 10 (the
>>>length of potential identity for the peptide) or by the total length
>>>of the protein? I personally vote for the first calculation, but I've met
>>>resistance from my fellow lab mates. Anyone care to clear up our debate?
>>
>> You divide by the length that the sequences are aligned over:
>>
>>
>>    ARCDFRSGQSRL
>>DEFGAKSDEHTPQTKVFTS    Alignment is 12 residues long Seqn ID=3/12=25%
>>    *  *    *
>>
>>    ARCD--KVSGQSRL     Alignment is 14 positions long (12 residues + 2 gap positions) Seqn ID=3/14=21%
>>DEFGAKSDGQRMTRQWKVMK
>>    *  *      *
>>
>>    ARCDKVSGQSRL       Alignment is 12 positions long Seqn ID=3/12=25%
>>DFGAAKSD---AQTKSHTRE
>>    *  *    *
>>
>The length that sequences are aligned over isn't always obvious in a
>global alignment, is it?  Sometimes you may want to consider stretches
>of residues at the ends, sometimes you don't.  I think the choice of
>which length to divide by depends on your protein of interest:
>if you want to compare how similar each of a set of proteins is to
>protein A, divide by protein A's length, so that every comparison
>shares the same denominator.
>
>
>Al Wang
>http://www.seas.upenn.edu/~alwang
>
>
>
In a gapped alignment, also count the gap positions as part of the
alignment length.
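
To make the choice of denominator concrete, here is a minimal sketch in
Python. It assumes the two rows of the alignment have already been trimmed
to the region the peptide is aligned over and that '-' marks a gap. The
function name percent_identity and the count_gaps switch are made up for
this illustration and are not taken from any alignment package.

```python
def percent_identity(row_a, row_b, count_gaps=True):
    """Percent identity between two rows of a pairwise alignment.

    row_a, row_b: equal-length strings, with '-' marking a gap.
    count_gaps:   if True, gap positions count toward the alignment
                  length (the denominator); if False, only positions
                  where both rows have a residue are counted.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")

    matches = 0
    length = 0
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both rows carries no information
        is_gap = a == "-" or b == "-"
        if is_gap and not count_gaps:
            continue
        length += 1
        if not is_gap and a == b:
            matches += 1

    if length == 0:
        raise ValueError("no positions to compare")
    return 100.0 * matches / length


if __name__ == "__main__":
    # The gapped peptide from the thread and the stretch of the longer
    # sequence it is aligned against.
    peptide = "ARCD--KVSGQSRL"
    protein = "AKSDGQRMTRQWKV"
    print(round(percent_identity(peptide, protein), 1))
    print(round(percent_identity(peptide, protein, count_gaps=False), 1))
```

Whichever convention you pick, use the same one for every peptide and say
which one it was when you report the numbers.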

