Regular homology searches
walker at ncbi.nlm.nih.gov
Fri Jul 10 14:57:32 EST 1998
Juergen Pleiss wrote:
> We want to regularly search sequence databases for proteins which are homologous to a given target.
This is easy to do with a shell script using the SEALS package.
Many variations are possible on the following script. Starting
from a FASTA library of your favorite sequences, it uses PSI-BLAST
to find new homologs, which are mailed to you and added to the
library. The latest additions are kept in a file whose name is the
input file name suffixed with '_new'.
What is even more interesting is that by using the 'tax_filt' command,
you can limit or sort your new additions by any taxonomic node.
If you also want software to keep your databases updated, this
functionality is about to be added to SEALS (two releases from now).
Email me with any questions.
#!/bin/sh
# Notify me when new sequences that match my list of favorites
# are added to the database.
# Suitable for use as a cron job.
# Invoke like this:
#   newhits favorites.fa

# Stop early if the library file is missing or empty.
if test ! -s "$1"; then
    echo ' '
    echo " $0: input file $1 is empty or does not exist"
    echo ' '
    exit 1
fi

# Run PSI-BLAST against nr and save the new homologs in $1_new.
splishpgp nr "$1" -proc= smart -psi= 3 | blast2gi -pcut= .001 | \
    gi2fasta | fanot "$1" > "${1}_new"

# If anything new turned up, add it to the library and mail it.
if test -s "${1}_new"; then
    cat "${1}_new" >> "$1"
    mailme New members for $1 can be found in ${1}_new < "${1}_new"
# else mailme No new members for $1 today # uncomment if you like
fi
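
The script above leaves it to SEALS to work out which hits are new.
For readers without SEALS at hand, that bookkeeping step can be
approximated in plain sh and awk. The sketch below rests on an
assumption about what the step needs to do (drop any FASTA record whose
identifier is already in the library); it is not a description of how
fanot works, and new_records is a hypothetical helper name. The
sequences in the demonstration are made up.

```shell
#!/bin/sh
# new_records CANDIDATES LIBRARY
# Hypothetical helper (not part of SEALS): print the FASTA records in
# CANDIDATES whose identifier, the first word of the '>' header line,
# does not already occur in LIBRARY.
new_records() {
    awk '
        # Library file (first argument to awk): remember each identifier.
        FILENAME == ARGV[1] { if (/^>/) seen[$1] = 1; next }
        # Candidate file: at each header decide whether the record is new,
        # and remember it so repeats among the candidates are dropped too.
        /^>/ { keep = !($1 in seen); seen[$1] = 1 }
        # Print header and sequence lines of records marked as new.
        keep
    ' "$2" "$1"
}

# Small demonstration with made-up sequences.
demo=$(mktemp -d)
printf '>seq1 lipase A\nMKKL\n>seq2 lipase B\nMRRT\n' > "$demo/favorites.fa"
printf '>seq2 lipase B\nMRRT\n>seq3 esterase\nMAAG\n' > "$demo/hits.fa"
new_records "$demo/hits.fa" "$demo/favorites.fa" > "$demo/favorites.fa_new"
if test -s "$demo/favorites.fa_new"; then
    cat "$demo/favorites.fa_new" >> "$demo/favorites.fa"
fi
cat "$demo/favorites.fa_new"    # prints the seq3 record only
rm -rf "$demo"
```

Matching on the first header word is a deliberate simplification: two
records with the same sequence but different identifiers are both kept.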
> Dr.Juergen Pleiss
> Institute of Technical Biochemistry
> University of Stuttgart Email:jpleiss at tebio1.biologie.uni-stuttgart.de
> Allmandring 31 Phone:(+49)-711-685-3191
> D-70569 Stuttgart, Germany Fax: (+49)-711-685-3196
> W3 home page: http://www.itb.uni-stuttgart.de:8080/