PROSITE Newsletter Issue Nb. 1.

Amos Bairoch BAIROCH at cgecmu51.bitnet
Mon Nov 20 14:12:00 EST 1989


************************************
* PROSITE ELECTRONIC NEWS BULLETIN *
************************************
Number 1 / November 1989

(C) Amos Bairoch
    Dept. Medical Biochemistry / University of Geneva
    Switzerland
------------------------------------------------------------------------

Scope of this bulletin
----------------------

PROSITE is a dictionary of  protein sites and patterns. It is a compilation of
sites and patterns found in  protein sequences.    Some of these patterns have
been  published  in  the  literature, but  the  majority  have  been developed
specifically to be included in this data collection.

For those of you who are not familiar with the content of PROSITE, two sample
pattern entries have been included at the end of this bulletin.

This bulletin will be used to announce new releases, enhancements, application
notes, etc.


Practical details
-----------------

- All mail concerning PROSITE should  be sent to the following EARN/BITNET
  address: PROSITE at CGECMU51

- I apologize to those of you who receive this issue of the PROSITE
  bulletin in duplicate (or triplicate): the first issue is sent through
  the BIONEWS bulletin board on BIOSCI as well as to specific electronic
  mail addresses. Subsequent issues will be sent only to those of you who
  subscribe to this bulletin (you can do that just by replying to this
  message). In the future, if enough people are interested in this
  bulletin, it will probably make sense to create a new BIOSCI bboard
  specifically for PROSITE.


Release 4 of PROSITE
--------------------

Release 4 of October 1989 contains 202 sites and patterns. 42 patterns were
added since release 3 (May 1989) and 59 entries were updated with new data.
The list of new entries is given below.

- Type II fibronectin collagen-binding domain
- 'Trefoil' domain signature
- Chitin recognition or binding domain signature
- Myc-type, 'helix-loop-helix' putative DNA-binding domain signature
- Crp bacterial activator proteins family signature
- Ribosomal protein S11 signature
- Ribosomal protein S18 signature
- Zinc-containing alcohol dehydrogenases signature
- Iron-containing alcohol dehydrogenases signature
- Insect-type alcohol dehydrogenase / ribitol dehydrogenase signature
- Malate dehydrogenase active site signature
- Glucose-6-phosphate dehydrogenase active site
- Ribonucleotide reductase large subunit signature
- Thymidylate synthase active site
- N-6 adenine-specific DNA methylases signature
- Serine hydroxymethyltransferase pyridoxal-phosphate attachment site
- Aspartate and ornithine carbamoyltransferases signature
- Purine/pyrimidine phosphoribosyl transferases signature
- Creatine kinase active site
- Adenylate kinase
- Eukaryotic RNA polymerase II heptapeptide repeat
- Lipases lipid-binding residue signature
- Colipase signature
- Serine/threonine specific protein phosphatases signature
- Cutinase, serine active site
- Xylose isomerase signature
- Phosphoglucose isomerase signature
- Eukaryotic DNA topoisomerase I active site
- Aminoacyl-transfer RNA synthetases 'HIGH' signature
- Thiamine pyrophosphate enzymes signature
- 2-oxo acid dehydrogenases acyltransferase component lipoyl binding site
- Mitochondrial energy transfer proteins signature
- Sugar transport proteins signature
- Insulin-like growth factor binding proteins signature
- Integrins beta chain cysteine-rich domain signature
- Mammalian defensins signature
- Membrane attack complex components / perforin signature
- Heat shock hsp70 proteins family signature
- Heat shock hsp90 proteins family signature
- Vertebrate galactoside-binding lectin signature
- Bacterial ice-nucleation proteins octamer repeat


How to get hold of release 4 of PROSITE
---------------------------------------

Release 4 of PROSITE is available, in a printed form, as an EMBL Biocomputing
Technical Document. If you want to receive a copy of it, please send an email
message to the following BITNET/EARN address: RAULFS at EMBL

Note: except at a few beta test sites, release 4 of PROSITE is NOT available
on any computer media.

Release 5 of PROSITE
--------------------

Release 5 will be  the first  release to be  distributed along with the SWISS-
PROT protein sequence data bank. There will be cross-references between SWISS-
PROT and PROSITE. Release 5 will be available in January 1990.

Format of release 5
-------------------

The next news bulletin (to be sent in early December) will describe the exact
format used to store PROSITE.

Format of the cross-references to PROSITE in SWISS-PROT
-------------------------------------------------------

Starting with  release 13 of  SWISS-PROT (January 1990) there will be cross-
references in sequence entries  to  PROSITE patterns.  These cross-references
will be implemented  using the SWISS-PROT DR (Data bank Reference) line-type.
A sample SWISS-PROT entry with such a cross-reference is shown below.

ID   SODM$YEAST     STANDARD;      PRT;   233 AA.
AC   P00447;
DT   21-JUL-1986  (REL. 01, CREATED)
DT   23-OCT-1986  (REL. 02, LAST SEQUENCE UPDATE)
DT   01-JAN-1990  (REL. 13, LAST ANNOTATION UPDATE)
DE   SUPEROXIDE DISMUTASE PRECURSOR (MN) (EC 1.15.1.1).
OS   BAKER'S YEAST (SACCHAROMYCES CEREVISIAE).
OC   EUKARYOTA; FUNGI; ASCOMYCETES.
RN   [1] (SEQUENCE FROM N.A.)
RA   MARRES C.A.M., VAN LOON A.P.G.M., OUDSHOORN P., VAN STEEG H.,
RA   GRIVELL L.A., SLATER E.C.;
RL   EUR. J. BIOCHEM. 147:153-161(1985).
RN   [2] (SEQUENCE OF 27-233)
RA   DITLOW C., JOHANSEN J.T., MARTIN B.M., SVENDSEN I.;
RL   CARLSBERG RES. COMMUN. 47:81-91(1982).
CC   -!- FUNCTION: DESTROY RADICALS WHICH ARE NORMALLY PRODUCED WITHIN
CC       THE CELLS AND ARE TOXIC TO BIOLOGICAL SYSTEMS.
CC   -!- CATALYTIC ACTIVITY: 2 PEROXIDE RADICAL + 2 H(+) = O(2) + H(2)O(2).
CC   -!- SUBUNIT: TETRAMER OF IDENTICAL CHAINS.
CC   -!- SUBCELLULAR LOCATION: MITOCHONDRIAL MATRIX.
CC   -!- EUKARYOTIC CELLS CONTAIN A MITOCHONDRIAL MN-CONTAINING ENZYME
CC       AND A CYTOPLASMIC CU-ZN-CONTAINING ENZYME.
CC   -!- SIMILARITY: STRONG TO IRON SUPEROXIDE DISMUTASES.
DR   PIR; A00521; DSBYN.
DR   EMBL; X02156; SCSODMNG.
**
DR   PROSITE; PS00123; SOD-MN.
**
KW   OXIDOREDUCTASE; MANGANESE; MITOCHONDRION; TRANSIT PEPTIDE.
FT   TRANSIT       1     26       MITOCHONDRION.
FT   CHAIN        27    233       SUPEROXIDE DISMUTASE, MANGANESE.
FT   METAL        56     56       MANGANESE LIGAND (BY HOMOLOGY).
FT   METAL       107    107       MANGANESE LIGAND (BY HOMOLOGY).
FT   METAL       194    194       MANGANESE LIGAND (BY HOMOLOGY).
FT   METAL       198    198       MANGANESE LIGAND (BY HOMOLOGY).
SQ   SEQUENCE   233 AA;  25774 MW;  273501 CN;
     MFAKTAAANL TKKGGLSLLS TTARRTKVTL PDLKWDFGAL EPYISGQINE LHYTKHHQTY
     VNGFNTAVDQ FQELSDLLAK EPSPANARKM IAIQQNIKFH GGGFTNHCLF WENLAPESQG
     GGEPPTGALA KAIDEQFGSL DELIKLTNTK LAGVQGSGWA FIVKNLSNGG KLDVVQTYNQ
     DTVTGPLVPL VAIDAWEHAY YLQYQNKKAD YFKAIWNVVN WKEASRRFDA GKI
//
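
The DR lines in the sample entry above can be read mechanically. What follows
is a minimal sketch in Python, not part of SWISS-PROT or PROSITE: it assumes
that each DR line holds semicolon-separated fields closed by a period, as the
three DR lines in the sample suggest. The names ENTRY_EXCERPT,
cross_references and prosite_refs are hypothetical and chosen only for this
illustration. The excerpt leaves out the "**" marker lines, which only
highlight the new line in the sample.

```python
from typing import List, Tuple

# Excerpt of the sample entry shown above (ID, AC, DR, KW lines only).
ENTRY_EXCERPT = """\
ID   SODM$YEAST     STANDARD;      PRT;   233 AA.
AC   P00447;
DR   PIR; A00521; DSBYN.
DR   EMBL; X02156; SCSODMNG.
DR   PROSITE; PS00123; SOD-MN.
KW   OXIDOREDUCTASE; MANGANESE; MITOCHONDRION; TRANSIT PEPTIDE.
//
"""


def cross_references(entry: str) -> List[Tuple[str, ...]]:
    """Return one tuple of fields per DR line found in the entry text.

    Assumption: fields are separated by ';' and the line ends with '.',
    as in the sample entry. Other layouts are not handled.
    """
    refs = []
    for line in entry.splitlines():
        if line.startswith("DR   "):
            body = line[5:].rstrip().rstrip(".")
            refs.append(tuple(field.strip() for field in body.split(";")))
    return refs


def prosite_refs(entry: str) -> List[Tuple[str, ...]]:
    """Keep only the cross-references whose first field is PROSITE."""
    return [ref for ref in cross_references(entry) if ref[0] == "PROSITE"]


if __name__ == "__main__":
    for ref in prosite_refs(ENTRY_EXCERPT):
        print(ref)
```

Filtering on the first field keeps the sketch independent of how many other
data banks are cross-referenced in a given entry.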

Two sample PROSITE entries
--------------------------

********************************************
* Endoplasmic reticulum targeting sequence *
********************************************

Proteins  that permanently reside in the  lumen  of  the endoplasmic reticulum
(ER) seem to be distinguished from newly synthesized secretory proteins by the
presence of the C-terminal sequence Lys-Asp-Glu-Leu (KDEL) [1].  It seems that
proteins bearing  the  KDEL signal  are  not  simply  held  in the ER, but are
selectively retrieved from a post-ER compartment  and returned to their normal
location.  In yeast, the ER sorting system recognizes the sequence HDEL instead
of KDEL [2]. The currently known ER luminal proteins are listed below.

 - Protein disulphide-isomerase (PDI) (which is also known as the beta-subunit
   of prolyl 4-hydroxylase, as a  component  of oligosaccharyl transferase, as
   glutathione-insulin  transhydrogenase  and  as  a  thyroid  hormone binding
   protein).
 - The hsp70 related  protein  GRP78  (also known as  the immunoglobulin heavy
   chain binding protein (BiP), and as KAR2, in yeast).
 - The hsp90 related  protein 'endoplasmin'  (also  known as  GRP94, Erp99, or
   Hsp108).
 - Calreticulin [3], a calcium-binding protein (also known as calregulin, CRP55
   or HACBP).
 - The receptor for the plant hormone auxin [4].

-Consensus pattern: (K,H)-D-E-L>
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT:  cholera toxin  A  chain precursor,
 and a hypothetical phage T4 protein.
-Last update: December 1989 / Text revised.

[ 1] Munro S., Pelham H.R.B.
     Cell 48:899-907(1987).
[ 2] Pelham H.R.B., Hardwick K.G., Lewis M.J.
     EMBO J. 7:1757-1762(1988).
[ 3] Smith M.J., Koch G.L.E.
     EMBO J. 8:3581-3586(1989).
[ 4] Hess T., Feldwisch J., Baalshusemann D., Bauw G., Puype M.,
     Vandekerckhove J., Lobler M., Klambt D., Schell J., Palme K.
     EMBO J. 8:2453-2461(1989).


*****************************************************
* Serine proteases, subtilisin family, active sites *
*****************************************************

Subtilisins [1] (EC 3.4.21.14) are bacterial  alkaline  serine proteases whose
catalytic activity is provided by a charge relay system similar to that of the
trypsin family of serine proteases but which evolved by independent convergent
evolution. The sequences around the catalytic serine and histidine residues are
completely different from those of the analogous residues in the trypsin serine
proteases and can be used as signatures specific to that category of proteases.
The subtilisin  family currently  includes,  in  addition  to  subtilisin, the
following proteases:

       - Fungal proteinase K [2].
       - Kluyveromyces lactis Kex-1 protease [3] and yeast Kex-2 protease [4].
       - Serratia serine protease [5].
       - Thermoactinomyces vulgaris thermitase [6].
       - Thermus aquaticus aqualysins [7].
       - Yarrowia lipolytica alkaline extracellular protease (AEP) [8].
       - Yeast protease B [9].
       - Bacillus subtilis major intracellular serine protease ISP-1 [10].
       - Aspergillus oryzae alkaline proteinase B [11].

-Consensus pattern: G-T-S-x-A-x-P-x2-(A,S,T,V)-G
                    [S is the active site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.

-Consensus pattern: H-G-T-x2-(A,S,T)-G-x-(I,L,V,A)
                    [H is the active site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.

-Last update: December 1989 / Text revised.

[ 1] Concise Encyclopedia Biochemistry, Second Edition, Walter de Gruyter,
     Berlin New-York (1988).
[ 2] Jany K.-D., Mayer B.
     Biol. Chem. Hoppe-Seyler 366:485-492(1985).
[ 3] Tanguy-Rougeau C., Wesolowski-Louvel M., Fukuhara H.
     FEBS Lett. 234:464-470(1988).
[ 4] Mizuno K., Nakamura T., Ohshima T., Tanaka S., Matsuo H.
     Biochem. Biophys. Res. Commun. 156:246-254(1988).
[ 5] Yanagida N., Uozumi T., Beppu T.
     J. Bacteriol. 166:937-944(1986).
[ 6] Meloun B., Baudys M., Kostka V., Hausdorf G., Frommel C., Hohne W.E.
     FEBS Lett. 183:195-200(1985).
[ 7] Kwon S.-T., Terada I., Matsuzawa H., Ohta T.
     Eur. J. Biochem. 173:491-497(1988).
[ 8] Davidow L.S., O'Donnell M.M., Kaczmarek F.S., Pereira D.A., Dezeew J.R.,
     Franke A.E.
     J. Bacteriol. 169:4621-4629(1987).
[ 9] Moehle C.M., Tizard R., Lemmon S.K., Smart J., Jones E.W.
     Mol. Cell. Biol. 7:4390-4399(1987).
[10] Koide Y., Nakamura A., Uozumi T., Beppu T.
     J. Bacteriol. 167:110-116(1986).
[11] Tatsumi H., Ohsawa M., Tsuji R.F., Murakami S., Nakano E., Motai H.,
     Masaki A., Ishida Y., Murakami K., Kawabe H., Arimura H.
     Agric. Biol. Chem. 52:1887-1888(1988).
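
The consensus patterns in the two sample entries above can be tried out by
translating them into regular expressions. The Python sketch below is only an
illustration and covers only the notation that appears in this bulletin: a
single residue letter, "x" for any residue, "xN" for N arbitrary residues,
"(A,B,...)" for alternatives, and a trailing ">" read here as "end of the
sequence". That reading of the notation is an assumption drawn from the
samples, not the exact PROSITE format. The function name pattern_to_regex is
hypothetical.

```python
import re


def pattern_to_regex(pattern: str) -> str:
    """Translate a consensus pattern, as written in the sample entries,
    into a Python regular expression string.

    Assumed notation (inferred from the samples only):
      A          one specific residue
      x          any residue
      x2         two arbitrary residues
      (A,S,T)    any one of the listed residues
      trailing > pattern must end at the C-terminus
    """
    anchored_end = pattern.endswith(">")
    if anchored_end:
        pattern = pattern[:-1]

    parts = []
    for element in pattern.split("-"):
        if re.fullmatch(r"[A-Z]", element):
            parts.append(element)
        elif element == "x":
            parts.append(".")
        elif re.fullmatch(r"x\d+", element):
            parts.append(".{%s}" % element[1:])
        elif re.fullmatch(r"\([A-Z](,[A-Z])*\)", element):
            parts.append("[" + element[1:-1].replace(",", "") + "]")
        else:
            raise ValueError("unsupported pattern element: %r" % element)

    return "".join(parts) + ("$" if anchored_end else "")


if __name__ == "__main__":
    er_signal = pattern_to_regex("(K,H)-D-E-L>")
    print(er_signal)  # [KH]DEL$

    # C-terminal residues of the yeast superoxide dismutase sample entry:
    # the sequence ends in GKI, so the ER pattern is not expected to match.
    print(re.search(er_signal, "WKEASRRFDAGKI") is None)  # True
```

Unknown elements raise an error instead of being skipped, so a pattern
written in a notation the sketch does not cover cannot be silently misread.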

----End-of bulletin----------------------------------------------------------




