SWISS-PROT release 40 available

Elisabeth Gasteiger Elisabeth.Gasteiger at isb-sib.ch
Mon Oct 29 20:41:43 EST 2001


(I) DATABASE AVAILABILITY ANNOUNCEMENT


Name        : SWISS-PROT 
Description : Protein sequence database.
Release     : 40.0 of October 2001
Statistics  : 101'602 fully annotated sequences, 37'315'215 amino acids,
              91'880 references.
Citation    : Bairoch A., Apweiler R.
              The SWISS-PROT protein sequence database and its supplement
              TrEMBL in 2000.
              Nucleic Acids Res. 28:45-48(2000).
Availability: FTP: ftp://ftp.expasy.org/databases/swiss-prot
                   ftp://ftp.ebi.ac.uk/pub/databases/swissprot
              WWW: http://www.expasy.org/sprot/
                   http://www.ebi.ac.uk/swissprot/


Name        : ENZYME
Description : Enzyme nomenclature database.
Release     : 27.0 of October 2001
Statistics  : 3'870 enzymes described.
Citation    : Bairoch A.
              The ENZYME database in 2000.
              Nucleic Acids Res. 28:304-305(2000).
Availability: FTP: ftp://ftp.expasy.org/databases/enzyme
                   ftp://ftp.ebi.ac.uk/pub/databases/enzyme
              WWW: http://www.expasy.org/enzyme/

---------------------------------------------------------------------

(II) SUMMARY OF CURRENT CHANGES AND FUTURE DEVELOPMENTS IN SWISS-PROT,
     PROSITE AND ENZYME

Note: a much more complete description of the changes and future
developments that are listed below is available from the release notes.
The release notes can be accessed from the WWW at the address:

            http://www.expasy.org/sprot/relnotes/

or downloaded by FTP from:

     ftp://ftp.expasy.org/databases/swiss-prot/release/relnotes.txt
     ftp://ftp.ebi.ac.uk/pub/databases/swissprot/release/relnotes.txt


A) Summary of the changes in SWISS-PROT release 40 and ENZYME release 27.

In SWISS-PROT:

- The name of the database changed from 'SWISS-PROT protein sequence
  database' to 'SWISS-PROT knowledgebase' to emphasize that SWISS-PROT
  collects far more than just information on protein sequences and that
  it is a central linking and linked database which connects the various
  findings in the diverse fields of proteomics research.
- Release 40.0 of SWISS-PROT contains 101'602 sequence entries, comprising
  37'315'215 amino acids abstracted from 91'880 references. 15'184
  sequences have been added since release 39, the sequence data of 2'908
  existing entries has been updated and the annotations of 44'684 entries
  have been revised. With this release SWISS-PROT has passed the symbolic
  mark of 100 thousand entries.
- In order to handle the large amount of "raw" data coming from
  microbial genome sequencing, the High quality Automated Microbial
  Annotation of Proteomes (HAMAP) project was initiated. It aims to
  automatically annotate a significant percentage of the proteins which
  originate from microbial genome sequencing projects. See:
        http://www.expasy.org/sprot/hamap/
- The Human Proteomics Initiative (HPI) project is progressing. There are
  currently 7'471 annotated human sequences in SWISS-PROT. These entries
  are associated with 19'922 literature references, 18'974 experimental or
  predicted PTMs, 1'697 splice variants and 12'061 polymorphisms. See:
        http://www.expasy.org/sprot/hpi/
- There can now be more than one AC (ACcession) line per SWISS-PROT entry.
- The OX (Organism taXonomy cross-reference) line has been introduced to
  indicate the identifier of a specific organism in a taxonomic database.
- The RX line format changed, and it now provides identifiers not only to
  Medline but also to PubMed.
- We have introduced two new 'topics' for the comments (CC) line type:
   o The topic 'BIOTECHNOLOGY' has been introduced to describe the use
     of a specific protein in the biotechnological industry.
   o The topic 'PHARMACEUTICAL' has been introduced to describe the use
     of a specific protein as a pharmaceutical drug.
- We are continuing a major overhaul of various comment line topics. A
  special effort has been made to standardize the ALTERNATIVE PRODUCTS
  and SIMILARITY topics.
- We have added cross-references from SWISS-PROT to the following
  databases: ANU-2DPAGE, COMPLUYEAST-2DPAGE, GlycoSuiteDB, Leproma,
  MEROPS, MypuList, PHCI-2DPAGE, PMMA-2DPAGE, ProDom, Siena-2DPAGE and
  SMART.
- A new FT key, SE_CYS, was introduced to describe selenocysteine residues.
- We have introduced feature identifiers (FTId) to the feature keys
  CARBOHYD and VARIANT. These stable feature identifiers allow links to
  be constructed directly from position-specific annotation in the
  feature table to specialized protein-related databases.
- We are gradually converting SWISS-PROT entries from all 'UPPER CASE' to
  'MiXeD CaSe'. The line types that have been converted between releases
  38 and 40 are: DE (DEscription), most RC (Reference Comment) topics
  (SPECIES, TISSUE, PLASMID and TRANSPOSON) and DR (Database cross-
  Reference). The new OX line and the new CC topics PHARMACEUTICAL and
  BIOTECHNOLOGY have been introduced in mixed case. The CC topic MASS
  SPECTROMETRY has been converted to mixed case.
- The SQ line syntax was changed to replace the 32-bit CRC (Cyclic
  Redundancy Check) value by a 64-bit CRC.
- Many new documents were introduced such as DBXREF.TXT (list of
  databases cross-referenced in SWISS-PROT), HUMCHR01.TXT to HUMCHR15.TXT,
  INTEIN.TXT (Index of intein-containing entries referenced in
  SWISS-PROT), or PLASMID.TXT (List of plasmids).
- The ExPASy WWW server was the target of many improvements that are all
  described at the address: http://www.expasy.org/history.html
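
For users who maintain their own parsers, the AC, OX and RX changes listed
above could be handled along the following lines. This is only a minimal
sketch and not an official parser: the sample lines, the placeholder
accession numbers and identifiers, the key names (NCBI_TaxID, MEDLINE,
PubMed) and the assumption that the first accession number is the primary
one are illustrative and should be checked against the release notes and
the user manual.

```python
from collections import defaultdict

# Hypothetical entry fragment. The accession numbers, identifiers and the
# exact field layout are illustrative assumptions, not data from release 40.
SAMPLE_ENTRY = """\
AC   P99991; Q99992; Q99993;
AC   Q99994;
OX   NCBI_TaxID=9606;
RX   MEDLINE=12345678; PubMed=1234567;
"""


def group_by_line_code(text):
    """Map each two-letter line code to the list of its line contents."""
    lines = defaultdict(list)
    for raw in text.splitlines():
        if len(raw) < 2:
            continue  # skip blank or truncated lines
        # Assumed layout: two-letter code, three blanks, then the content.
        lines[raw[:2]].append(raw[5:].strip())
    return lines


def split_items(contents):
    """Split ';'-separated items gathered from one or more lines."""
    items = []
    for content in contents:
        items.extend(p.strip() for p in content.split(";") if p.strip())
    return items


def parse_key_values(contents):
    """Turn items such as 'PubMed=1234567' into a dictionary."""
    result = {}
    for item in split_items(contents):
        key, _, value = item.partition("=")
        result[key] = value
    return result


entry = group_by_line_code(SAMPLE_ENTRY)

# Accession numbers are collected across all AC lines of the entry.
accessions = split_items(entry["AC"])
primary, secondary = accessions[0], accessions[1:]  # assumption: first is primary

taxonomy = parse_key_values(entry["OX"])
citations = parse_key_values(entry["RX"])

print(primary, secondary)
print(taxonomy)
print(citations)
```

The point of the sketch is that a parser which reads only the first AC line
of an entry, or which expects a single identifier on the RX line, will
silently lose information with release 40.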

In ENZYME:
 
- Many new enzymes were added to the database. The description of many
  more was updated.

In PROSITE:

- A new release of PROSITE will be announced in a few weeks.


B) Future developments

Here is what was announced as planned changes for release 41:

- We are planning to lengthen the mnemonic code for the protein name
  in the ID line from up to 4 characters to up to 5 characters.
- Starting with release 41, there can be more than one RP (Reference
  Position) line per reference in a SWISS-PROT entry.
- We are in the process of cleaning up the CC comments topics PATHWAY
  and COFACTOR.
- Conversion to mixed case will continue and will affect the GN (Gene
  Name) line, the RC (Reference Comment) line topic STRAIN, and the
  CC (Comment) line topics CATALYTIC ACTIVITY and PATHWAY.

Of course the above list is far from being definitive; we await your
suggestions!
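
Software that validates entry names may need adjusting for the planned
change to the ID line mnemonic code. A minimal sketch of such a check
follows. It assumes that an entry name has the form PROTEIN_SPECIES, made
of upper-case letters and digits, with a species code of up to 5
characters; these assumptions and the function names are illustrative and
are not taken from this announcement.

```python
import re


def entry_name_pattern(max_protein_len):
    """Build a pattern for entry names of the assumed form PROTEIN_SPECIES.

    max_protein_len is the maximum length of the protein mnemonic code
    (4 up to release 40, 5 as planned for release 41). The character set
    and the species code length of up to 5 are assumptions.
    """
    return re.compile(rf"^[A-Z0-9]{{1,{max_protein_len}}}_[A-Z0-9]{{1,5}}$")


def is_valid_entry_name(name, max_protein_len=5):
    """Return True if name fits the assumed entry name layout."""
    return entry_name_pattern(max_protein_len).match(name) is not None


# Hypothetical entry names, used only to show the effect of the limit.
for name in ("ABCD_HUMAN", "ABCDE_HUMAN"):
    print(name, is_valid_entry_name(name, 4), is_valid_entry_name(name, 5))
```

Keeping the limit as a parameter rather than a hard-coded 4 lets the same
check serve both before and after the change.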

----------------------------------------------------------------------------
SWISS-PROT is copyright. It is produced through a collaboration between the
Swiss Institute of Bioinformatics and the EMBL Outstation - the European
Bioinformatics Institute. There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and for
commercial entities requires a license agreement. For information about the
licensing scheme see: http://www.isb-sib.ch/announce/ or send an email to
license at isb-sib.ch.
----------------------------------------------------------------------------
-- 
-----------------------------------------------------------------
Elisabeth Gasteiger
Swiss Institute of Bioinformatics
CMU - 1, rue Michel Servet                Tel. (+41 22) 702 58 75
CH - 1211 Geneva 4 Switzerland            Fax  (+41 22) 702 58 58
Elisabeth.Gasteiger at isb-sib.ch            http://www.expasy.org/ 
-----------------------------------------------------------------
