seg and xnu: scoring ambiguity-code matches in BLASTP

Ken Wolfe khwolfe at tcd.ie
Tue Sep 12 06:29:42 EST 1995


The BLAST low-complexity filters seg and xnu change amino acids in the
Query sequence into X characters.  When these sequences are searched
(BLASTP) against a database, the Query sequence no longer hits itself with
100% identity because matches involving X are counted as mismatches.  Is
there a way of overcoming this so that filtered sequences still have 100%
identity to themselves?

For example, yeast TUP1 after seg filtering hits itself with only 81% identity:

SWISS|TUP1_YEAST|P16649 
  Length = 713

 Score = 2930 (1318.2 bits), Expect = 0.0, P = 0.0
 Identities = 581/713 (81%), Positives = 581/713 (81%)

Query:     1 MTASVSNTQNKLNELLDAIRQEFLQVSQEANTYRLQNQKDYDFKMNQQLAEMQQIRNTVY 60
             MTASVSNTQNKLNELLDAIRQEFLQVSQEANTYRLQNQKDYDFKMNQQLAEMQQIRNTVY
Sbjct:     1 MTASVSNTQNKLNELLDAIRQEFLQVSQEANTYRLQNQKDYDFKMNQQLAEMQQIRNTVY 60

Query:    61 ELELTHRKMKDAYEEEIKHLKLGLEQRDHQIXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 120
             ELELTHRKMKDAYEEEIKHLKLGLEQRDHQI                             
Sbjct:    61 ELELTHRKMKDAYEEEIKHLKLGLEQRDHQIASLTVQQQRQQQQQQQVQQHLQQQQQQLA 120

Query:   121 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXFPVQASRPNLVGSQLPTTTLPVVSSNA 180
                                              FPVQASRPNLVGSQLPTTTLPVVSSNA
Sbjct:   121 AASASVPVAQQPPATTSATATPAANTTTGSPSAFPVQASRPNLVGSQLPTTTLPVVSSNA 180

  etc. etc.
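
To make the arithmetic behind the 81% figure concrete, here is a minimal
sketch of recomputing identity over unmasked columns only. It is a
post-processing illustration, not a BLAST option: the function name
masked_identity is hypothetical, and it assumes the aligned Query and Sbjct
rows are ungapped and have already been extracted from the report as strings.

```python
def masked_identity(query: str, subject: str, mask_char: str = "X"):
    """Count identities for two ungapped, aligned rows.

    Returns (identities, total_columns, unmasked_columns). Columns where
    the query holds mask_char are skipped instead of counted as mismatches.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    identities = 0
    unmasked = 0
    for q, s in zip(query, subject):
        if q == mask_char:
            continue  # masked by seg/xnu: leave out of the denominator
        unmasked += 1
        if q == s:
            identities += 1
    return identities, len(query), unmasked


# Second row of the alignment above (residues 61-120).
query = "ELELTHRKMKDAYEEEIKHLKLGLEQRDHQI" + "X" * 29
subject = "ELELTHRKMKDAYEEEIKHLKLGLEQRDHQIASLTVQQQRQQQQQQQVQQHLQQQQQQLA"

ident, total, unmasked = masked_identity(query, subject)
print(f"{ident}/{total} over all columns, {ident}/{unmasked} over unmasked columns")
```

For this row the count is 31/60 when every column is included and 31/31 when
the masked columns are left out, which is the behaviour being asked for.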

Any ideas?

-- 
Ken Wolfe
Department of Genetics
University of Dublin                    e-mail: khwolfe at tcd.ie
Trinity College                         phone:  +353-1-608-1253
Dublin 2, Ireland                       FAX:    +353-1-679-8558



