Release 16 of TREMBL, a protein sequence database supplementing SWISS-PROT

Maria Jesus Martin martin at
Fri Mar 9 22:41:06 EST 2001


TrEMBL is a protein sequence database supplementing the SWISS-PROT
Protein Sequence Data Bank. TrEMBL contains the translations of all
coding sequences (CDS) present in the EMBL Nucleotide Sequence
Database that are not yet integrated into SWISS-PROT. TrEMBL can be
considered a preliminary section of SWISS-PROT. SWISS-PROT accession
numbers have been assigned to all TrEMBL entries which should
eventually be upgraded to the standard SWISS-PROT quality.


This TrEMBL release was created from the EMBL Nucleotide Sequence
Database release 65 and updates up to 22.01.2001. It contains 489'620
sequence entries, comprising 141'347'364 amino acids. To minimize
redundancy, the translations of all coding sequences (CDS) in the
EMBL Nucleotide Sequence Database already included in SWISS-PROT
release 40 and updates up to 21.02.2001 have been removed from TrEMBL
release 16.

TrEMBL is split into two main sections, SP-TrEMBL and REM-TrEMBL.
SP-TrEMBL (SWISS-PROT TrEMBL) contains the entries (425'026) which
should eventually be incorporated into SWISS-PROT. SWISS-PROT accession
numbers have been assigned to all SP-TrEMBL entries.

SP-TrEMBL is organized in subsections:

arc.dat (Archaea):             15191 entries
fun.dat (Fungi):               11819 entries
hum.dat (Human):               21314 entries
inv.dat (Invertebrates):       54506 entries
mam.dat (Other Mammals):        7281 entries
mhc.dat (MHC proteins):         6568 entries
org.dat (Organelles):          38007 entries
phg.dat (Bacteriophages):       3301 entries
pln.dat (Plants):              46050 entries
pro.dat (Prokaryotes):         98330 entries
rod.dat (Rodents):             12312 entries
unc.dat (Unclassified):           54 entries
vrl.dat (Viruses):            101091 entries
vrt.dat (Other Vertebrates):    9202 entries

59'565 new entries have been integrated into SP-TrEMBL. The sequences of
1'263 SP-TrEMBL entries have been updated and the annotation has been
updated in 198'646 entries.

In the document deleteac.txt, you will find a list of all accession
numbers which were previously present in TrEMBL, but which have now been
deleted from the database.

REM-TrEMBL (REMaining TrEMBL) contains the entries (64'594) that we do
not want to include in SWISS-PROT.
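
The entry counts quoted above can be cross-checked against each other.
The following is a minimal sketch in Python, not part of the release
itself. The numbers are copied from this announcement; the names
SP_TREMBL_SUBSECTIONS and check_release_counts are hypothetical and
chosen only for illustration.

```python
# Entry counts copied from the TrEMBL release 16 announcement.
# All identifiers below are illustrative assumptions, not part of
# any TrEMBL or SWISS-PROT distribution.
SP_TREMBL_SUBSECTIONS = {
    "arc.dat": 15191,   # Archaea
    "fun.dat": 11819,   # Fungi
    "hum.dat": 21314,   # Human
    "inv.dat": 54506,   # Invertebrates
    "mam.dat": 7281,    # Other Mammals
    "mhc.dat": 6568,    # MHC proteins
    "org.dat": 38007,   # Organelles
    "phg.dat": 3301,    # Bacteriophages
    "pln.dat": 46050,   # Plants
    "pro.dat": 98330,   # Prokaryotes
    "rod.dat": 12312,   # Rodents
    "unc.dat": 54,      # Unclassified
    "vrl.dat": 101091,  # Viruses
    "vrt.dat": 9202,    # Other Vertebrates
}

SP_TREMBL_TOTAL = 425026   # stated size of SP-TrEMBL
REM_TREMBL_TOTAL = 64594   # stated size of REM-TrEMBL
RELEASE_TOTAL = 489620     # stated size of the whole release


def check_release_counts():
    """Compare the summed subsection counts with the stated totals."""
    sp_sum = sum(SP_TREMBL_SUBSECTIONS.values())
    return {
        "sp_sum": sp_sum,
        "sp_matches": sp_sum == SP_TREMBL_TOTAL,
        "release_matches": SP_TREMBL_TOTAL + REM_TREMBL_TOTAL == RELEASE_TOTAL,
    }


if __name__ == "__main__":
    print(check_release_counts())
```

With the figures given here, the subsection counts add up to the stated
SP-TrEMBL size, and the two sections together add up to the stated
number of entries in the release.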


FTP server:
SRS server:

TrEMBL is also available on the SWISS-PROT CD-ROM.
SWISS-PROT + TrEMBL is searchable on the following servers at the EBI:

Bic_sw  (
Scanps  (
MPSrch  (


Rolf Apweiler, Kirsty Bates, Margaret Biswas, Sergio Contrino,
Kirill Degtyarenko, Wolfgang Fleischmann, Gill Fraser,
Cathy Gedman, Henning Hermjakob, Vivien Junker, Alexander Kanapin,
Youla Karavidopoulou, Paul Kersey, Minna Lehvaslaiho,
Michele Magrane, Maria Jesus Martin, Nicoletta Mitaritonna,
Virginie Mittard, Steffen Moeller, Nicola Mulder, Claire O'Donovan,
John F. O'Rourke, Isabelle Phan, Sandrine Pilbout, Lucia
Eleanor Whitfield and Allyson Williams
at the EMBL Outstation - European Bioinformatics Institute (EBI)
in Hinxton, UK;
Amos Bairoch and Alain Gateau at the Swiss Institute of Bioinformatics
in Geneva, Switzerland.

Maria Jesus Martin                     email:martin at
EMBL Outstation EBI
(European Bioinformatics Institute)    URL:
Wellcome Trust Genome Campus           Tel: +44 (1223) 494408
Hinxton                                fax: +44 (1223) 494468
