Announcement of the availability of the ASN.1 version of SWISS-PROT.

Amos Bairoch BAIROCH at cmu.unige.ch
Thu Feb 18 01:55:25 EST 1993


Thanks to the staff of the NCBI and more specifically to Mark Cavanaugh, a
version of SWISS-PROT in the ASN.1 notation is now publicly available from
the following anonymous FTP servers:

    Organiz. : National Center for Biotechnology Information (NCBI)
    Address  : ncbi.nlm.nih.gov (or 130.14.20.1)
    Directory: /repository/swiss-prot/asn

    Organiz. : European Molecular Biology Laboratory (EMBL)
    Address  : ftp.embl-heidelberg.de (or 192.54.41.33)
    Directory: /pub/databases/swissprot/asn

    Organiz. : Basel Biozentrum Biocomputing server (EMBnet SWISS node)
    Address  : bioftp.unibas.ch (or 131.152.8.1)
    Directory: /archive_data/brand_new/swissprot.24/asn
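
For readers who prefer to script the retrieval, the following is a minimal
Python sketch of an anonymous FTP download using the standard `ftplib`
module. The host names and directories are copied from the list above; the
`MIRRORS` table, the `fetch` function and its `ftp_factory` parameter are
illustrative names, not part of any NCBI or EMBL software, and the servers
may not be reachable at these addresses any more.

```python
from ftplib import FTP

# Host names and directories as given in the announcement (1993).
# The short keys are made up for this sketch.
MIRRORS = {
    "ncbi": ("ncbi.nlm.nih.gov", "/repository/swiss-prot/asn"),
    "embl": ("ftp.embl-heidelberg.de", "/pub/databases/swissprot/asn"),
    "basel": ("bioftp.unibas.ch", "/archive_data/brand_new/swissprot.24/asn"),
}


def fetch(mirror, filename, dest_path, ftp_factory=FTP):
    """Download `filename` from the named mirror into `dest_path`.

    `ftp_factory` is a hypothetical hook: it defaults to ftplib.FTP but
    can be replaced by a stand-in class to try the function offline.
    """
    host, directory = MIRRORS[mirror]
    ftp = ftp_factory(host)          # connects to the server
    try:
        ftp.login()                  # no arguments means anonymous login
        ftp.cwd(directory)
        with open(dest_path, "wb") as out:
            # Binary mode matters: sp24.prt.Z is compressed data.
            ftp.retrbinary("RETR " + filename, out.write)
    finally:
        ftp.quit()
    return dest_path
```

A call such as `fetch("ncbi", "asn.all", "asn.all")` would then copy the
specification file into the current directory.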


The following files are available:

1) sp24.prt.Z

A compressed ASN.1 value in "print-form" (i.e., ASCII), generated by parsing
release 24 of SWISS-PROT. Uncompressed, this file is approximately 150 Mb in
size.

We have chosen to supply the ASN.1 version of SWISS-PROT in "print-form"
primarily for human-readability. Converted to a binary ASN.1 value, these
data require roughly 60 Mb. Public domain tools developed at the NCBI are
available for such conversions. Software libraries are also available for
processing ASN.1 values (binary or "print-form") at a fairly high level. To
obtain information about the NCBI toolbox, ftp to the address
`ncbi.nlm.nih.gov', log in as `anonymous', change to the `toolbox' directory,
and copy the README file you will find there. Alternatively, send a request
for more information via email to `info at ncbi.nlm.nih.gov'.

2) asn.all

This file contains the ASN.1 specifications for various biomolecular
databases (including GenBank, EMBL, PDB, etc.). A subset of the data elements
defined in the specification file was used to map SWISS-PROT to ASN.1. As
of Feb. 3, 1993, there are some known limitations in this mapping:

a) Taxonomy data are available only indirectly, via swissprot-taxon pointers
   in Org-ref elements, referring to node identifiers of the SWISS-PROT
   taxonomy (see the SWISS-PROT document file `speclist.txt').

b) Reference position (RP) data are not yet captured.

c) Generation of Prot-ref elements is very simplistic; an improved algorithm
   would use the contents of the feature table to generate Prot-refs that
   point to specific portions of the sequence.
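
Before `sp24.prt.Z' can be read it has to be decompressed. As a rough
sketch, assuming a `gzip' program is installed (gzip can also read files
produced by the Unix `compress' utility), this could be driven from Python
as shown below. The function names are illustrative only, and about 150 Mb
of free disk space is needed for the output.

```python
import subprocess


def decompress_command(path):
    """Return the argument list that writes the decompressed data to stdout."""
    # -d: decompress, -c: write to standard output and keep the input file
    return ["gzip", "-dc", path]


def decompress(path, out_path):
    """Decompress `path` (a .Z file) into `out_path` by running gzip."""
    with open(out_path, "wb") as out:
        # check=True raises CalledProcessError if gzip reports a failure
        subprocess.run(decompress_command(path), stdout=out, check=True)
    return out_path
```

For example, `decompress("sp24.prt.Z", "sp24.prt")' would leave the
"print-form" ASCII file next to the compressed original.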


Important notes

- As improvements are made to the parser, new versions of `sp24.prt.Z' may
  be posted on the servers listed above.
- The Department of Medical Biochemistry of the University of Geneva and the
  EMBL Data Library are not responsible for any problems arising from the
  use of the ASN.1 version of SWISS-PROT. What this cryptic legal sentence
  means is that the "official" version of SWISS-PROT is the one distributed
  jointly by the University of Geneva and the EMBL Data Library; the ASN.1
  version is not yet officially supported.


Demonstration programs

Two demo programs, compiled on a SPARC2 (gcc, SunOS 4.1.1), and a supporting
index file are available from the NCBI server for use with the ASN.1
version of SWISS-PROT. They are being supplied strictly "as-is", for
demonstration purposes only. See the README file in the
`/repository/swiss-prot/asn/demo' directory for more information.


