ProSite Database
Lee F Kolakowski
lfk at athena.mit.edu
Thu Jul 12 18:04:46 EST 1990
I have written some software, which runs under Unix(tm) or MSDOS, that
will search a protein sequence for all of the patterns described in
Amos Bairoch's Prosite database.
It requires the following:
    AWK (new awk or GNU GAWK; GAWK is free, ftp'able software)
    The Prosite database file prosite.doc
Basically, I have translated all the patterns to Unix style regular
expressions, and created some simple awk scripts to search protein
sequences.
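As a rough illustration of the translation step (this is not the awk
code itself, and it is in Python rather than awk), the sketch below
assumes patterns written in the notation used by current Prosite
releases, for example N-{P}-[ST]-{P} for PS00001 (ASN_GLYCOSYLATION).
The names prosite_to_regex and find_sites and the toy sequence are
hypothetical, and only the common pattern elements are handled.

```python
import re

# One pattern element: optional "<" anchor, then x, a single residue,
# [ABC] or {ABC}, an optional repeat (n) or (n,m), and an optional ">".
ELEMENT = re.compile(
    r"^(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)$"
)

def prosite_to_regex(pattern):
    """Translate a Prosite-style pattern into a regular expression."""
    regex = ""
    for element in pattern.rstrip(".").split("-"):
        m = ELEMENT.match(element)
        if m is None:
            raise ValueError("unsupported pattern element: " + element)
        start, core, lo, hi, end = m.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # any residue except these
        # Single letters and [ABC] classes are already valid regex syntax.
        if lo is not None:
            core += "{" + lo + ("," + hi if hi else "") + "}"
        regex += ("^" if start else "") + core + ("$" if end else "")
    return regex

def find_sites(regex, sequence):
    """Return (first, last) 1-based positions of all matches, overlaps included."""
    lookahead = re.compile("(?=(" + regex + "))")
    return [(m.start(1) + 1, m.end(1)) for m in lookahead.finditer(sequence)]

if __name__ == "__main__":
    rx = prosite_to_regex("N-{P}-[ST]-{P}.")
    print(rx)                                 # N[^P][ST][^P]
    print(find_sites(rx, "MNGTEGPNFSA"))      # made-up toy sequence
```

The lookahead is used so that overlapping sites are all reported. An
awk version would need its own handling of repeat counts if the awk in
use does not support interval expressions.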
NOTE: the MKS (Mortice Kern Systems) version of AWK for MSDOS machines
is required for MSDOS use. The MSDOS version of GAWK can't handle 300+
regular expressions. I have no links with MKS except that it is a good
product.
I would like to post a shell archive of this to bionet.software, if
there is interest.
Here is a brief example of the output (short form):
Prosite Database -- Release 5.0 of April 1990 Copyright: Amos Bairoch
ProSearch Software -- Release 0.1beta -- Copyright: Lee Kolakowski
The following patterns are in < bov.ops >:
Access#  From->To  Name
_______  ________  ____
PS00001  2->6      ASN_GLYCOSYLATION
PS00001  16->20    ASN_GLYCOSYLATION
PS00001  201->205  ASN_GLYCOSYLATION
PS00005  14->17    PKC_PHOSPHO_SITE
PS00005  230->233  PKC_PHOSPHO_SITE
PS00005  244->247  PKC_PHOSPHO_SITE
PS00006  22->26    CK2_PHOSPHO_SITE
PS00006  194->198  CK2_PHOSPHO_SITE
PS00006  199->203  CK2_PHOSPHO_SITE
PS00006  230->234  CK2_PHOSPHO_SITE
PS00006  339->343  CK2_PHOSPHO_SITE
PS00007  21->30    TYR_PHOSPHO_SITE
PS00008  89->95    MYRISTYL
PS00008  121->127  MYRISTYL
PS00008  157->163  MYRISTYL
PS00008  183->189  MYRISTYL
PS00013  157->168  PROKAR_LIPOPROTEIN
PS00237  68->85    G_PROTEIN_RECEPTOR
PS00238  296->314  OPSIN
Please e-mail me if you are interested.
--
Frank Kolakowski
======================================================================
|lfk at athena.mit.edu || Lee F. Kolakowski |
|lfk at eastman2.mit.edu || M.I.T. |
|kolakowski at wccf.mit.edu || Dept of Chemistry |
|lfk at mbio.med.upenn.edu || Room 18-506 |
|lfk at hx.lcs.mit.edu || 77 Massachusetts Ave.|
|AT&T: 1-617-253-1866 || Cambridge, MA 02139 |
|--------------------------------------------------------------------|
| #include <woes.h> |
| One-Liner Here! |
======================================================================