unBLASTable sequence?

Andrew Dalke dalke at bioreason.com
Wed Aug 4 12:37:55 EST 1999


Francois Jeanmougin <pingouin at crystal.u-strasbg.fr>
>      It is filtered due to its low complexity. Removing the
> filters as suggested by Andrew will "flood" your output with lot of
> poly-prolines sequences, with no biological means nor informations.

In many cases, yes, but there have been a few times where I
didn't get any hits with filtering turned on, and got some hits,
about 5 or so, with filtering turned off.

As for the *relevance* of the results, that's a whole 'nother
issue :)  For example (and only an example, since I don't remember
the details of the original circumstances), suppose you are looking
for a somewhat homologous sequence in the PDB so you can get
a couple of templates to mutate when building a structure
prediction.  All you want are some rough ideas to use as the basis
of the prediction.

Filtering out low-complexity sequences could leave you with
no hits, whereas without a filter you might get some hits and
be able to use the corresponding structure as a template.  Of
course, odds are that the structure will be random coil, but I'll
bet that some of the low-complexity sequences have a pretty well
defined structure; and I'll bet that SEG doesn't take structural
aspects into account.
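
To make the filtering point a bit more concrete, here is a rough
sketch of what a composition-based low-complexity mask might look
like.  This is NOT the SEG algorithm; it is only a sliding-window
Shannon entropy check, and the function names, the window of 12 and
the 2.2-bit threshold are all assumptions picked for illustration.
Notice that it looks only at residue composition, so it has no way
of knowing whether a masked region is structured or not.

```python
import math
from collections import Counter


def window_entropy(window):
    """Shannon entropy, in bits, of the residue composition of a string."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq, window=12, threshold=2.2):
    """Replace every window whose entropy is below threshold with 'X'.

    Illustrative only: window and threshold are assumed values, and
    real filters such as SEG use a more elaborate procedure.
    """
    masked = list(seq)
    for i in range(len(seq) - window + 1):
        if window_entropy(seq[i:i + window]) < threshold:
            # Mask the whole window; overlapping windows may re-mask residues.
            masked[i:i + window] = "X" * window
    return "".join(masked)


if __name__ == "__main__":
    # Hypothetical query: a proline-rich stretch flanked by mixed residues.
    query = "MKTAYIAKQRDE" + "P" * 15 + "ACDEFGHIKLMN"
    print(mask_low_complexity(query))
```

A poly-proline run is masked completely, so a search with the
masked query can no longer seed a hit there, which is how you can
end up with nothing when filtering is on.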

But I could be wrong.  Hmm, there's a project for someone :)


						Andrew Dalke
						dalke at bioreason.com



