EXPASY - A NEW MOLECULAR BIOLOGY SERVER
Ron D. Appel
appel at cih.hcuge.ch
Tue Sep 21 08:42:43 EST 1993
SWISS-2DPAGE - A 2D PAGE IMAGE DATABASE
A new molecular biology server has been set up at Geneva University.
Through a user-friendly hypertext model, it allows easy browsing
of data from various databases, such as the SWISS-PROT protein
sequence database, the EMBL nucleotide sequence database, PROSITE
(protein sites and patterns), the OMIM (Online Mendelian Inheritance
in Man) database and the newly created SWISS-2DPAGE database, which
gathers data on proteins identified on various 2D PAGE
(Two-Dimensional Polyacrylamide Gel Electrophoresis) maps.
1. The ExPASy molecular biology server
The ExPASy molecular biology server can be accessed remotely from
any computer connected to the Internet. Using hypertext documents and
links, it allows easy browsing through various molecular biology
databases, such as SWISS-PROT, EMBL, OMIM, PROSITE, REBASE and
SWISS-2DPAGE, with FLYBASE and other databases to follow soon.
ExPASy has been set up as a World-Wide Web server, a powerful
information retrieval system (see next section). The user may request,
for example, an entry from the SWISS-PROT protein sequence database,
either by giving the accession number (AC) of the desired entry or
through a keyword search on the protein description or on the name of
a referenced author. For example, typing apolipoprotein (or simply
apolipo) returns the list of all the apolipoproteins currently
contained in SWISS-PROT, from which the user chooses one. In the
SWISS-PROT entry, the database cross-reference (DR) lines to the
currently available databases are active hypertext links. Selecting
one of these links fetches the cross-referenced entry from the
corresponding database. For example, selecting the cross-reference
line to EMBL fetches and displays the nucleotide sequence (and
related data) that encodes the given protein. Choosing the
SWISS-2DPAGE cross-reference will display the corresponding entry from
2. The World-Wide Web model
The ExPASy molecular biology server has been set up as a World-Wide
Web (WWW) server. WWW, which originated at CERN, is a powerful
global information system merging networked information retrieval
and hypertext. Using hypertext links, it gives access to the
documents and information contained in all the existing WWW servers
around the world, as well as to the data obtainable through other
information retrieval systems such as WAIS, Gopher and X.500. To access
a WWW server, one runs a client program (a WWW browser) on a local
computer; the browser displays hypertext documents. The user can
then either request a keyword search or jump to another document by
following a hypertext link. See section 4 to find out how to obtain a
SWISS-2DPAGE is a new database which groups data from various
reference 2D PAGE maps. Information on mapping procedures,
physiological and pathological data and bibliographical references are
also provided, as well as links to the SWISS-PROT protein sequence and
other related databases. Protein maps may be displayed, showing the
protein locations. The server can also display the theoretical
location of proteins whose positions are not yet known.
3.1 Database structure
Each entry in the SWISS-2DPAGE database corresponds to one protein
and contains textual as well as image data. The textual part of the
database follows the conventions used in the SWISS-PROT protein
sequence database. An entry is composed of lines of different types.
The ID (IDentification) line contains the entry name, the entry class
and a three-letter code indicating the type of entry. The entry class
may be STANDARD (for data which are complete) or PRELIMINARY (for
entries in which certain information is missing or has not yet been
checked). In SWISS-2DPAGE the three-letter code is 2DG (for
2-Dimensional Gel). The AC (ACcession number) line gives an accession
number which uniquely identifies the entry. The entry name in the ID
line and the accession number are the same as those in the
corresponding entry in SWISS-PROT. The DT (DaTe) lines indicate the
entry's date of creation and last update. The DE (DEscription) line
contains general descriptive information about the protein. The
reference lines (RN, RP, RC, RM, RA and RL) follow the syntax defined
in SWISS-PROT. The CC lines contain free-text comments on the entry,
for example the subunit composition (single chain, dimer, etc.).
Finally, each entry holds a DR (Database cross-Reference) line pointing
to the corresponding protein sequence entry in SWISS-PROT.
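The line-oriented layout described above lends itself to a very small parser. The following is a minimal sketch only: the sample entry is hypothetical (it is not real SWISS-2DPAGE data), and the exact punctuation of each line (semicolon separators, trailing period, date format) is an assumption modelled on SWISS-PROT-style flat files rather than something specified in this announcement.

```python
from collections import defaultdict

# Hypothetical entry, written for illustration only; NOT real database content.
# The punctuation and date format are assumptions based on SWISS-PROT-style files.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN; STANDARD; 2DG.
AC   P00000;
DT   01-SEP-1993 (CREATED)
DE   HYPOTHETICAL EXAMPLE PROTEIN.
CC   -!- SUBUNIT: SINGLE CHAIN.
DR   SWISS-PROT; P00000; EXAMPLE_HUMAN.
"""


def parse_entry(text):
    """Group the lines of one entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # The line code is everything before the first space.
        code, _, content = line.partition(" ")
        fields[code].append(content.strip())
    return dict(fields)


def parse_id_line(id_content):
    """Split an ID line into entry name, entry class and type code."""
    name, entry_class, type_code = (
        part.strip() for part in id_content.rstrip(".").split(";")
    )
    return {"name": name, "class": entry_class, "type": type_code}


entry = parse_entry(SAMPLE_ENTRY)
identification = parse_id_line(entry["ID"][0])
print(identification)
print(entry["DR"])
```

Keeping every line type as a list of strings means that line types which may occur several times (DT, CC, DR) need no special handling.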
Three line types are specific to SWISS-2DPAGE. The MT (MasTer)
line indicates the types of maps on which the protein has been
identified. Examples are LIVER and PLASMA. The IM (IMages) lines list
the 2D PAGE images which are associated with the entry. These may be,
for example, TUMOR LIVER or NORMAL LIVER. The third line type is
2D (2-Dimensional gel), which gives specific information such as the
mapping procedure (matching with another gel, microsequencing, etc.),
the number, molecular weight and pI of spots, or describes normal and
The image part of a SWISS-2DPAGE entry on ExPASy shows the
available 2D PAGE maps in icon form (around 100 x 100 pixels).
These are arranged in two groups: the maps on which the protein has
been identified, and some maps on which it has not yet been found.
Each of these iconified maps is a hypertext link. Selecting a map of
the first group sends a request to the server for the full-sized map,
in which the spots corresponding to the entry's protein are
highlighted in yellow, together with the region around the computed
theoretical pI and molecular weight. When a map of the second group
is selected, on which the protein location is unknown, the theoretical
pI and molecular weight are computed from the protein's sequence and a
CompuServe GIF image is built, showing the selected map with the
region in which the protein is expected to be found. If a protein
consists of more than one polypeptide chain, several regions are
highlighted on the gel. The computation of the region also takes into
account possible phosphorylation, acetylation and glycosylation sites
in the sequence. With a WWW browser which handles images, these are
usually displayed locally by running an image viewing program. For
example, with NCSA Mosaic a protein map is displayed using the xv
shareware program. The image may then be processed, saved or printed
like any other image displayed by xv.
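To give an idea of how a theoretical pI and molecular weight can be derived from a sequence, here is a minimal sketch. It is not the server's actual algorithm: the pKa values are illustrative textbook-style assumptions, the example sequence is hypothetical, and modifications such as phosphorylation, acetylation and glycosylation are ignored. The molecular weight is the sum of average residue masses plus one water; the pI is found by bisection on the net charge given by the Henderson-Hasselbalch equation.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528

# ASSUMPTION: illustrative pKa values, not those used by the ExPASy server.
PKA_POSITIVE = {"N_TERM": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"C_TERM": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def molecular_weight(sequence):
    """Average molecular weight of an unmodified single polypeptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def net_charge(sequence, ph):
    """Net charge of the chain at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE["N_TERM"]))
    charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE["C_TERM"] - ph))
    for aa in sequence:
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """pH at which the net charge is zero, located by bisection on [0, 14]."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: the pI lies at a higher pH
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical sequence containing each of the 20 residues once.
EXAMPLE_SEQUENCE = "ACDEFGHIKLMNPQRSTVWY"
print(round(molecular_weight(EXAMPLE_SEQUENCE), 1), "Da")
print(round(isoelectric_point(EXAMPLE_SEQUENCE), 2), "pI")
```

Bisection works here because the net charge decreases monotonically as the pH rises. For a protein made of several chains, the same computation would be repeated per chain, giving one expected region per chain.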
4. Accessing the ExPASy molecular biology server
4.1 General access
A WWW server can be accessed on the Internet through its Uniform
Resource Locator (URL), the addressing system defined by the WWW
model. The URL for the ExPASy molecular biology WWW server is:
on the current machine).
To access a WWW server, one needs to run a browser (or client)
program on a local computer. Browsers exist for a variety of
machines and may be obtained by anonymous ftp. Here is a selected
list (taken from the CERN WWW server) of currently available
browsers and the ftp addresses from which they can be retrieved:
4.1.1 Terminal based browsers
o www: a basic line-mode browser giving access to WWW from any
dumb terminal. This browser requires the xv program to display
images (see section 4.2). Ftp site: info.cern.ch (in /pub/www).
o lynx: a full-screen browser for VT100 terminals, using arrow keys,
highlighting, etc. It requires the xv program to display images (see
section 4.2). Ftp site: ftp2.cc.ukans.edu (in /pub/lynx).
4.1.2 Graphic User Interfaces
o NCSA Mosaic for X Windows: a browser using X11/Motif. This is
one of the most flexible and robust browsers currently available
for X Windows. It requires the xv program to display images
(see section 4.2). Ftp site: ftp.ncsa.uiuc.edu (in /Web/xmosaic). See
o NCSA Mosaic versions for Microsoft Windows and for Macintosh
have been announced.
o Cello: a PC/Windows browser in beta release. Ftp site:
fatty.law.cornell.edu (in /pub/LII/cello).
o Samba for Macintosh. Ftp site: info.cern.ch (in /pub/www/bin/mac).
4.2 Getting the xv image viewing program for X
To access all the data available from SWISS-2DPAGE, the user's local
computer needs to run an image viewing program. For most browsers
on Unix workstations the default program is xv, a shareware application
developed by John Bradley at the University of Pennsylvania. The
program can be found by ftp at export.lcs.mit.edu (in /contrib).
4.3 Accessing SWISS-2DPAGE and ExPASy using NCSA Mosaic
To fully use the features available on SWISS-2DPAGE and the
ExPASy server from a Unix workstation, we recommend using NCSA
Mosaic. You need to download the xmosaic program by anonymous ftp
from ftp.ncsa.uiuc.edu (the binaries for various Unix workstations are
in directory /Web/xmosaic), as well as xv from export.lcs.mit.edu (in /
To access the ExPASy server, type:
or use the Open button and type in the ExPASy URL
(http://expasy.hcuge.ch/). This should display ExPASy's top page,
from which one may select the links to SWISS-PROT, SWISS-2DPAGE, or
any other document.
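As a small illustration of how a URL such as the one above breaks down into the parts a browser needs (access scheme, host and document path), here is a sketch using Python's standard urllib.parse module. It only parses the address given in this announcement and deliberately makes no network request.

```python
from urllib.parse import urlsplit

# The ExPASy URL as given in this announcement.
EXPASY_URL = "http://expasy.hcuge.ch/"

parts = urlsplit(EXPASY_URL)
print("scheme:", parts.scheme)    # access protocol the browser will use
print("host:  ", parts.hostname)  # machine running the WWW server
print("path:  ", parts.path)      # document requested on that server
```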
1 Bairoch, A. and Boeckmann, B., Nucleic Acids Res. 1993, 21, 3093-
2 Rice, C.M., Fuchs, R., Higgins, D.G., Stoehr, P.J. and Cameron,
G.N., Nucleic Acids Res. 1993, 21, 2967-71.
3 Bairoch, A., Nucleic Acids Res. 1993, 21, 3097-3103.
4 McKusick, V.A., Mendelian Inheritance in Man. Catalogs of
autosomal dominant, autosomal recessive, and X-linked phenotypes;
Tenth edition; Johns Hopkins University Press, Baltimore 1991.
5 Appel, R.D., Sanchez, J.C., Bairoch, A., Golaz, O., Miu, M., Vargas,
J.R., Hochstrasser, D.F., SWISS-2DPAGE: a database of
two-dimensional gel electrophoresis images, Electrophoresis 1993, in
6 Berners-Lee, T.J., Cailliau, R., Groff, J.F. and Pollermann, B.,
Electronic Networking: Research, Applications and Policy, 1992,
7 Rimmer, S., Bit-Mapped Graphics. Windcrest Books/McGraw-Hill,
Blue Ridge Summit, PA 1990, pp. 129-193.
In case of problems or if you have comments, please contact: