EXPASY - A NEW MOLECULAR BIOLOGY SERVER

Ron D. Appel appel at cih.hcuge.ch
Tue Sep 21 08:42:43 EST 1993


EXPASY - A NEW MOLECULAR BIOLOGY SERVER
SWISS-2DPAGE - A 2D PAGE IMAGE DATABASE
=======================================

A new molecular biology server has been set up at Geneva University. 
Through a user-friendly hypertext model it allows easy browsing 
through the data from various databases, such as the SWISS-PROT 
protein sequence database [1], the EMBL nucleotide sequence database
[2], PROSITE (protein sites and patterns) [3], the OMIM (Online
Mendelian Inheritance in Man) database [4] and the newly created
SWISS-2DPAGE database, which gathers data on proteins identified 
on various 2D PAGE (Two-Dimensional Polyacrylamide Gel Electro-
phoresis) maps [5]. 

1. The ExPASy molecular biology server
   -----------------------------------
The ExPASy molecular biology server can be accessed remotely from
any computer connected to the Internet. Using hypertext documents and
links, it allows easy browsing through various molecular biology
databases, such as SWISS-PROT, EMBL, OMIM, PROSITE, REBASE and
SWISS-2DPAGE, with FLYBASE and other databases to follow soon.
ExPASy has been set up as a World-Wide Web server, a powerful infor-
mation retrieval system (see next section). The user may request, for 
example, an entry from the SWISS-PROT protein sequence database, 
either by giving the accession number (AC) of the desired entry or 
through a keyword search on the protein description or the name of a
referenced author. For example, typing apolipoprotein (or simply
apolipo) returns the list of all the apolipoproteins currently
contained in SWISS-PROT, from which the user then chooses one. In the
displayed SWISS-PROT entry, the database cross-reference lines
(DR) to the currently available databases are active hypertext links. 
Selecting one of these links will fetch the cross-referenced entry from 
the corresponding database. For example, selecting the cross-reference 
line to EMBL will fetch and display the nucleotide sequence (and
related data) that encodes the given protein. Choosing the
SWISS-2DPAGE cross-reference will display the corresponding entry from
SWISS-2DPAGE.
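
The two ways of requesting an entry described above (by accession
number, or by keyword in the description) can be illustrated with a
minimal Python sketch. It is not the server's implementation: the
small in-memory index, the function names and the choice of a
case-insensitive substring match are all assumptions made for
illustration.

```python
# Hand-written sample index for illustration only:
# (accession number, entry name, description).
SAMPLE_INDEX = [
    ("P02647", "APOA1_HUMAN", "APOLIPOPROTEIN A-I"),
    ("P02649", "APOE_HUMAN", "APOLIPOPROTEIN E"),
    ("P02768", "ALBU_HUMAN", "SERUM ALBUMIN"),
]


def find_by_accession(index, accession):
    """Return the record with this accession number, or None."""
    wanted = accession.strip().upper()
    for record in index:
        if record[0] == wanted:
            return record
    return None


def search_descriptions(index, keyword):
    """Return all records whose description contains the keyword.

    The match is a case-insensitive substring test (an assumption),
    so a partial word such as "apolipo" also matches.
    """
    needle = keyword.strip().lower()
    return [record for record in index if needle in record[2].lower()]


if __name__ == "__main__":
    for accession, name, description in search_descriptions(SAMPLE_INDEX, "apolipo"):
        print(accession, name, description)
```

With the sample index, searching for "apolipo" lists both
apolipoprotein records, from which one entry would then be chosen.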

2. The World-Wide Web model
   ------------------------
The ExPASy molecular biology server has been set up as a World-Wide 
Web (WWW) server. WWW, which originated at CERN, is a powerful 
global information system merging networked information retrieval 
and hypertext [6]. It gives access, using hypertext links, to the docu-
ments and information contained in all the existing WWW servers 
around the world, as well as to the data obtainable through other
information retrieval systems such as WAIS, Gopher and X.500. To access
a WWW server, one runs a client program (a WWW browser) on a local
computer, and the browser displays hypertext documents. The user can
then either request a keyword search or jump to another document by 
following a hypertext link. See section 4 to find out how to obtain a 
browser.

3. SWISS-2DPAGE
   ------------
SWISS-2DPAGE is a new database which groups data from various 
reference 2D PAGE maps. Information on mapping procedures,
physiological and pathological data and bibliographical references are
also provided, as well as links to the SWISS-PROT protein sequence
database and other related databases. Protein maps may be displayed,
showing the protein locations. The server can also display the
theoretical location of proteins whose positions are not yet known.

3.1  Database structure
     ------------------
Each entry in the SWISS-2DPAGE database corresponds to one protein 
and contains textual as well as image data. The textual part of the data-
base follows the conventions used in the SWISS-PROT protein 
sequence database. An entry is composed of lines of different types. 
The ID (IDentification) line contains the entry name, the entry class 
and a three letter code indicating the type of entry. The entry class may 
be STANDARD (for data which are complete) or PRELIMINARY (for 
entries in which some information is missing or has not yet been
checked). In SWISS-2DPAGE the three letter code is 2DG (for 2
Dimensional Gel). The AC (ACcession number) line gives an accession
number which uniquely identifies the entry. The entry name in the ID
line and the accession number are the same as those in the
corresponding entry in SWISS-PROT. The DT (DaTe) lines indicate the
entry's date of creation and last update. The DE (DEscription) line con-
tains general descriptive information about the protein. The reference 
lines (RN, RP, RC, RM, RA and RL) follow the syntax defined in 
SWISS-PROT. The CC lines contain free text comments on the entry, 
for example the subunit composition (single chain, dimer, etc.). Finally, 
each entry holds a DR (Database cross-Reference) line pointing to the 
corresponding protein sequence entry in SWISS-PROT.

Three line types are specific to SWISS-2DPAGE. The MT (MasTer) 
line tells on what types of maps the protein has been identified.
Examples are LIVER and PLASMA. The IM (IMages) lines list the 2D
PAGE images which are associated with the entry. These may be, for
example, TUMOR LIVER or NORMAL LIVER. The third line type is
2D (2 Dimensional gel), which gives specific information such as the
mapping procedure (matching with another gel, microsequencing, etc.)
and the number, molecular weight and pI of spots, or describes normal
and pathological variants.
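
As an illustration of the line-type structure described above, here is
a minimal Python sketch that groups the lines of one entry by their
two-letter code. The sample entry is hand-written and hypothetical. The
exact layout (code in the first two columns, data starting at column 6,
"//" as terminator, semicolon-separated ID fields) is assumed from the
SWISS-PROT conventions and is not taken from a real SWISS-2DPAGE
record. The function names are likewise invented for this sketch.

```python
from collections import defaultdict

# Hypothetical entry written for illustration; field contents and exact
# punctuation are assumptions, not a real SWISS-2DPAGE record.
SAMPLE_ENTRY = """\
ID   APOA1_HUMAN; STANDARD; 2DG.
AC   P02647;
DE   APOLIPOPROTEIN A-I.
MT   PLASMA.
IM   PLASMA.
CC   -!- SUBUNIT: SINGLE CHAIN.
DR   SWISS-PROT; P02647; APOA1_HUMAN.
//
"""


def parse_entry(text):
    """Group the lines of one entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in text.splitlines():
        # Skip blank lines and the assumed "//" end-of-entry marker.
        if not line.strip() or line.startswith("//"):
            continue
        code, content = line[:2], line[5:].strip()
        fields[code].append(content)
    return dict(fields)


def parse_id_line(id_line):
    """Split an ID line into (entry name, entry class, entry type)."""
    parts = [part.strip() for part in id_line.rstrip(".").split(";")]
    name, entry_class, entry_type = parts
    return name, entry_class, entry_type


if __name__ == "__main__":
    entry = parse_entry(SAMPLE_ENTRY)
    print(parse_id_line(entry["ID"][0]))
    print(entry["DR"])
```

Keeping each line type as a list allows for codes that occur several
times in one entry, such as the DT and CC lines.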

The image part of a SWISS-2DPAGE entry on ExPASy shows the 
available 2D PAGE maps in icon form (around 100 x 100 pixels). 
These are arranged in two groups: the maps on which the protein has 
been identified, and some maps on which it has not yet been found. 
Each of these iconified maps is a hypertext link. Selecting a map of
the first group sends a request to the server, which returns the
full-sized map with the spots corresponding to the entry's protein
highlighted in yellow, together with the region around the computed
theoretical pI and molecular weight. When a map of the second group is
selected, on which the protein location is unknown, the theoretical pI
and molecular weight are computed from the protein's sequence and a
CompuServe GIF [7] image is built, showing the selected map with the
region in which the protein is expected to be found. If a protein
consists of more than one polypeptide chain, then several regions are
highlighted on the gel. The computation of the region also takes into
account possible phosphorylation, acetylation and glycosylation sites
in the sequence. With a WWW browser which handles images, the maps
are usually displayed locally by running an image viewing program. For
example, with NCSA Mosaic a protein map is displayed using the xv
shareware program. The image may then be processed, saved or printed
like any other image displayed by xv.
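
The method used by the server to compute the theoretical values is not
detailed here. As a rough sketch of the molecular weight part only, the
Python example below sums standard average residue masses and adds one
water molecule per chain. It ignores pI and the modification sites
mentioned above, and the function names and the tolerance used to
derive an expected range are hypothetical choices for illustration.

```python
# Standard average masses (in daltons) of amino acid residues,
# i.e. the amino acid minus one water molecule.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_weight(sequence):
    """Return the average molecular weight of one unmodified chain."""
    residues = sequence.strip().upper()
    if not residues:
        raise ValueError("empty sequence")
    # A KeyError here signals a residue code outside the 20 standard ones.
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in residues) + WATER_MASS


def expected_mw_range(sequence, tolerance=0.2):
    """Return (low, high) bounds around the theoretical weight.

    The relative tolerance is an arbitrary illustrative value, not the
    one used by the server.
    """
    weight = molecular_weight(sequence)
    return weight * (1.0 - tolerance), weight * (1.0 + tolerance)


if __name__ == "__main__":
    low, high = expected_mw_range("GG")
    print(round(molecular_weight("GG"), 2), round(low, 2), round(high, 2))
```

For a protein made of several chains, the same computation would be
applied to each chain separately, giving one region per chain.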

4. Accessing the ExPASy molecular biology server
   ---------------------------------------------
4.1  General access
     --------------
A WWW server can be accessed on the Internet through its Uniform
Resource Locator (URL), the addressing system defined by the WWW 
model. The URL for the ExPASy molecular biology WWW server is:

    http://expasy.hcuge.ch/

(or 

    http://129.195.254.61/

on the current machine).

To access a WWW server, one needs to run a browser (or client)
program on the local computer. Browsers exist for a variety of
machines and may be obtained by anonymous ftp. Here is a selected
list (taken from the CERN WWW server) of currently available browsers
and the ftp addresses from which they can be retrieved:

4.1.1  Terminal based browsers
       -----------------------
o   www: a basic line mode browser giving access to WWW from any 
    dumb terminal. This browser requires the xv program to display 
    images (see section 4.2). Ftp site: info.cern.ch (in /pub/www). 

o   lynx: a full screen browser for vt100 terminals, using arrow keys,
    highlighting, etc. It requires the xv program to display images (see
    section 4.2). Ftp site: ftp2.cc.ukans.edu (in /pub/lynx).

4.1.2  Graphical User Interfaces
       -------------------------
o   NCSA Mosaic for the X Window System: a browser using X11/Motif.
    This is one of the most flexible and most robust browsers currently
    available for X. It requires the xv program to display images
    (see section 4.2). Ftp site: ftp.ncsa.uiuc.edu (in /Web/xmosaic).
    See section 4.3.

o   NCSA Mosaic versions for Microsoft Windows and for Macintosh 
    have been announced.

o   Cello: a PC/Windows browser in beta release. Ftp site:
    fatty.law.cornell.edu (in /pub/LII/cello).

o   Samba for Macintosh. Ftp site: info.cern.ch (in /pub/www/bin/mac).

4.2  Getting the xv image viewing program for X
     ------------------------------------------
To access all the data available from SWISS-2DPAGE, the user's local 
computer needs to run an image viewing program. For most browsers 
on Unix workstations the default program is xv, a shareware application 
developed by John Bradley at the University of Pennsylvania. The
program can be found by ftp at export.lcs.mit.edu (in /contrib).

4.3  Accessing SWISS-2DPAGE and ExPASy using NCSA Mosaic
     ---------------------------------------------------
To fully use the features available on SWISS-2DPAGE and the 
ExPASy server from a Unix workstation, we recommend using NCSA 
Mosaic. You need to download the xmosaic program by anonymous ftp 
from ftp.ncsa.uiuc.edu (the binaries for various Unix workstations are 
in directory /Web/xmosaic), as well as xv from export.lcs.mit.edu
(in /contrib).

To access the ExPASy server, type:

    xmosaic http://expasy.hcuge.ch/

or use the Open button and type in the ExPASy URL 
(http://expasy.hcuge.ch/). This should display ExPASy's top page, 
from which one may select the links to SWISS-PROT, SWISS-2DPAGE, or
any other document.

5. References
   ----------
1   Bairoch, A. and Boeckmann, B., Nucleic Acids Res. 1993, 21,
    3093-3096.

2   Rice, C.M., Fuchs, R., Higgins, D.G., Stoehr, P.J. and Cameron, 
    G.N., Nucleic Acids Res. 1993, 21, 2967-2971.

3   Bairoch, A., Nucleic Acids Res. 1993, 21, 3097-3103. 

4   McKusick, V.A., Mendelian Inheritance in Man. Catalogs of auto-
    somal dominant, autosomal recessive, and X-linked phenotypes; 
    Tenth edition; Johns Hopkins University Press, Baltimore 1991.

5   Appel, R.D., Sanchez, J.C., Bairoch, A., Golaz, O., Miu, M., Vargas,
    J.R. and Hochstrasser, D.F., SWISS-2DPAGE: a database of
    two-dimensional gel electrophoresis images, Electrophoresis 1993, in
    press.

6   Berners-Lee, T.J., Cailliau, R., Groff, J.F. and Pollermann, B., 
    Electronic Networking: Research, Applications and Policy, 1992, 
    2, 52-58.

7   Rimmer, S., Bit-Mapped Graphics. Windcrest Books/McGraw-Hill,
    Blue Ridge Summit, PA 1990, pp. 129-193.

-----------

In case of problems or if you have comments, please contact:



