Release 17 of TrEMBL, a protein sequence database supplementing SWISS-PROT

Maria Jesus Martin martin at
Mon Jun 18 05:31:38 EST 2001


TrEMBL is a protein sequence database supplementing the
SWISS-PROT Protein Sequence Data Bank. TrEMBL contains the
translations of all coding sequences (CDS) present in the EMBL
Nucleotide Sequence Database that are not yet integrated into
SWISS-PROT. TrEMBL can be considered a preliminary section of
SWISS-PROT. SWISS-PROT accession numbers have been assigned to
all TrEMBL entries that should eventually be upgraded to standard
SWISS-PROT quality.


This TrEMBL release was created from EMBL Nucleotide Sequence
Database release 66 and updates up to 01.05.01. It contains 540'195
sequence entries, comprising 155'771'315 amino acids. To minimize
redundancy, the translations of all coding sequences (CDS) in the
EMBL Nucleotide Sequence Database already included in SWISS-PROT
release 39.21 have been removed from this TrEMBL release.

TrEMBL is split into two main sections, SP-TrEMBL and REM-TrEMBL.
SP-TrEMBL (SWISS-PROT TrEMBL) contains the entries (473'505) which
should eventually be incorporated into SWISS-PROT. SWISS-PROT
accession numbers have been assigned for all SP-TrEMBL entries.

SP-TrEMBL is organized in subsections:

arc.dat (Archaea):             14653 entries
fun.dat (Fungi):               12773 entries
hum.dat (Human):               24037 entries
inv.dat (Invertebrates):       56712 entries
mam.dat (Other Mammals):        8380 entries
mhc.dat (MHC proteins):         6821 entries
org.dat (Organelles):          41900 entries
phg.dat (Bacteriophages):       3895 entries
pln.dat (Plants):              50780 entries
pro.dat (Prokaryotes):        113140 entries
rod.dat (Rodents):             12312 entries
unc.dat (Unclassified):          135 entries
vrl.dat (Viruses):            108495 entries
vrt.dat (Other Vertebrates):    9812 entries

54'960 new entries have been integrated in SP-TrEMBL. The sequences
of 609 SP-TrEMBL entries have been updated and the annotation
has been updated in 299'529 entries.

In the document deleteac.txt, you will find a list of all accession
numbers which were previously present in TrEMBL, but which have now
been deleted from the database.

REM-TrEMBL (REMaining TrEMBL) contains the entries (66'690) that we
do not want to include in SWISS-PROT.
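
The section sizes quoted above can be cross-checked against the
release total with a few lines of Python. This is only a sketch:
the helper name parse_count is hypothetical, and the figures are
copied by hand from this announcement rather than read from any
TrEMBL file.

```python
def parse_count(text: str) -> int:
    """Convert an apostrophe-grouped count such as "540'195" to an int."""
    return int(text.replace("'", ""))


# Figures quoted in this announcement (typed in by hand, not parsed
# from the release files).
total_entries = parse_count("540'195")
sp_trembl = parse_count("473'505")
rem_trembl = parse_count("66'690")

# SP-TrEMBL and REM-TrEMBL together should account for every entry.
assert sp_trembl + rem_trembl == total_entries

# Fraction of the release that is earmarked for SWISS-PROT.
sp_share = sp_trembl / total_entries
print(f"SP-TrEMBL share of release 17: {sp_share:.1%}")
```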


FTP server:
SRS server:

TrEMBL is also available on the SWISS-PROT CD-ROM.
SWISS-PROT + TrEMBL is searchable on the following servers at
the EBI:

Bic_sw  (
Scanps  (
MPSrch  (


Rolf Apweiler, Kirsty Bates, Margaret Biswas, Sergio Contrino,
Kirill Degtyarenko, Wolfgang Fleischmann, Gill Fraser,
Henning Hermjakob, Vivien Junker, Alexander Kanapin, Youla
Karavidopoulou, Paul Kersey, Minna Lehvaslaiho, Michele Magrane,
Maria Jesus Martin, Nicoletta Mitaritonna, Virginie Mittard,
Steffen Moeller, Nicola Mulder, Claire O'Donovan,
John F. O'Rourke, Isabelle Phan, Sandrine Pilbout,
Eleanor Whitfield and Allyson Williams
at the EMBL Outstation - European Bioinformatics Institute (EBI)
in Hinxton, UK;
Amos Bairoch and Alain Gateau at the Swiss Institute of
Bioinformatics in Geneva, Switzerland.

Maria Jesus Martin                     email:martin at
EMBL Outstation EBI
(European Bioinformatics Institute)    URL:
Wellcome Trust Genome Campus           Tel: +44 (1223) 494408
Hinxton                                fax: +44 (1223) 494468

