seg and xnu: scoring ambiguity-code matches in BLASTP

Peter Stockwell peter at sanger.otago.ac.nz
Tue Sep 12 15:29:28 EST 1995


Ken Wolfe (khwolfe at tcd.ie) wrote:

> The BLAST low-complexity filters seg and xnu change amino acids in the
> Query sequence into X characters.  When these sequences are searched
> (BLASTP) against a database, the Query sequence no longer hits itself with
> 100% identity because matches involving X are counted as mismatches.  Is
> there a way of overcoming this so that filtered sequences still have 100%
> identity to themselves?

I'm not sure I see why this would be desirable.  The purpose of
filtering the query sequence in the first place is to suppress motifs
and repetitive sequences that produce so many database hits that
analysis of the remaining sequence becomes impossible.  Re-enabling
counting of the suppressed regions would simply bring that problem
back.

Peter A. Stockwell

> For example: yeast TUP1 after seg-ing hits itself with only 81% identity:

> SWISS|TUP1_YEAST|P16649 
>   Length = 713

>  Score = 2930 (1318.2 bits), Expect = 0.0, P = 0.0
>  Identities = 581/713 (81%), Positives = 581/713 (81%)
[...]
> Any ideas?

> -- 
> Ken Wolfe
> Department of Genetics
> University of Dublin                    e-mail: khwolfe at tcd.ie
> Trinity College                         phone:  +353-1-608-1253
> Dublin 2, Ireland                       FAX:    +353-1-679-8558
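
For readers following the numbers in the quoted TUP1 example, the
arithmetic can be sketched as below.  This is only an illustration of
how the percentage falls out when X positions are counted as
mismatches; it is not how BLASTP computes its report.  The function
name identity_counts and the toy sequences are hypothetical, and the
reading that all 713 - 581 = 132 non-identical positions are X-masked
residues is an assumption based on this being a self-hit.

```python
def identity_counts(masked_query: str, subject: str) -> dict:
    """Count identities over an ungapped, equal-length alignment.

    Hypothetical helper for illustration only: X in the query never
    counts as an identity, mirroring the behaviour described above.
    """
    if len(masked_query) != len(subject):
        raise ValueError("sequences must have equal length")

    length = len(masked_query)
    masked = sum(1 for q in masked_query if q == "X")
    identical = sum(
        1 for q, s in zip(masked_query, subject) if q == s and q != "X"
    )
    unmasked = length - masked

    return {
        "identical": identical,
        "length": length,
        "masked": masked,
        # X treated as a mismatch (what the report above shows)
        "pct_strict": 100.0 * identical / length if length else 0.0,
        # X positions left out of the denominator (what was asked for)
        "pct_unmasked": 100.0 * identical / unmasked if unmasked else 0.0,
    }


# Toy self-comparison: two of ten residues masked.
original = "ACDEFGHIKL"
masked = "ACDXXGHIKL"
result = identity_counts(masked, original)
print(result["pct_strict"], result["pct_unmasked"])

# Figures from the quoted report: 581 identities over 713 positions.
print(int(100 * 581 / 713))
```

Leaving the X positions out of the denominator restores 100% for a
self-hit, but it only changes how the identity is reported, not which
regions take part in the search.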



