Release 9 of TrEMBL, a protein sequence database supplementing SWISS-PROT

Maria Jesus Martin martin at ebi.ac.uk
Mon Feb 1 05:53:34 EST 1999


INTRODUCTION
============

TrEMBL is a protein sequence database supplementing the SWISS-PROT
Protein Sequence Data Bank. TrEMBL contains the translations of all
coding sequences (CDS) present in the EMBL Nucleotide Sequence
Database not yet integrated in SWISS-PROT. TrEMBL can be considered
as a preliminary section of SWISS-PROT. SWISS-PROT accession numbers
have been assigned to all TrEMBL entries that should eventually be
upgraded to standard SWISS-PROT quality.


RELEASE 9.0 OF TrEMBL
=====================

The goal of this TrEMBL release is to achieve synchronization with
SWISS-PROT release 37.0. Therefore, all sequence entries present in
SWISS-PROT release 37.0 have been removed from TrEMBL release 9,
existing TrEMBL entries have been further upgraded, and only very few
new entries have been incorporated. As a result of the
synchronization with SWISS-PROT release 37.0, TrEMBL release 9
contains fewer entries than the previous release.

TrEMBL release 9 contains 221'422 sequence entries, comprising
59'461'791 amino acids.

TrEMBL is split into two main sections, SP-TrEMBL and REM-TrEMBL:

SP-TrEMBL (SWISS-PROT TrEMBL) contains the entries (179'066) which
should eventually be incorporated into SWISS-PROT. SWISS-PROT
accession numbers have been assigned to all SP-TrEMBL entries.

SP-TrEMBL is organized in subsections:

arc.dat (Archaea):              7315 entries
fun.dat (Fungi):                5862 entries
hum.dat (Human):                7594 entries
inv.dat (Invertebrates):       22665 entries
mam.dat (Other Mammals):        2792 entries
mhc.dat (MHC proteins):         3981 entries
org.dat (Organelles):          13996 entries
phg.dat (Bacteriophages):       1736 entries
pln.dat (Plants):              14626 entries
pro.dat (Prokaryotes):         39243 entries
rod.dat (Rodents):              6863 entries
unc.dat (Unclassified):           44 entries
vrl.dat (Viruses):             48436 entries
vrt.dat (Other Vertebrates):    3913 entries


REM-TrEMBL (REMaining TrEMBL) contains the entries (42'356) that we do
not want to include in SWISS-PROT.
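
As a quick consistency check on the figures above, the subsection
counts can be summed and compared with the stated totals. This is only
an illustrative sketch: the variable names are made up for the example,
and the numbers are copied from this announcement.

```python
# Entry counts per SP-TrEMBL subsection, copied from the list above.
sp_trembl_sections = {
    "arc.dat": 7315,
    "fun.dat": 5862,
    "hum.dat": 7594,
    "inv.dat": 22665,
    "mam.dat": 2792,
    "mhc.dat": 3981,
    "org.dat": 13996,
    "phg.dat": 1736,
    "pln.dat": 14626,
    "pro.dat": 39243,
    "rod.dat": 6863,
    "unc.dat": 44,
    "vrl.dat": 48436,
    "vrt.dat": 3913,
}

# REM-TrEMBL entry count and total amino acids, as stated in the text.
rem_trembl_total = 42356
total_amino_acids = 59461791

sp_trembl_total = sum(sp_trembl_sections.values())
release_total = sp_trembl_total + rem_trembl_total
mean_length = total_amino_acids / release_total

print(sp_trembl_total)        # 179066, the stated SP-TrEMBL size
print(release_total)          # 221422, the stated release size
print(round(mean_length, 1))  # 268.5 amino acids per entry on average
```

The subsection counts add up to the stated SP-TrEMBL total, and
SP-TrEMBL plus REM-TrEMBL gives the stated size of the release.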


WEEKLY UPDATES OF TrEMBL AND NON-REDUNDANT DATA SETS
====================================================
Weekly cumulative updates of TrEMBL are available by anonymous FTP and
from the EBI SRS server.
Every week we also produce a complete non-redundant protein sequence
collection, provided as three compressed files in the directory
/pub/databases/sp_tr_nrdb on the EBI FTP server:
sprot.dat.Z, trembl.dat.Z and trembl_new.dat.Z.
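
For readers who want to check how many entries a downloaded file
contains, a minimal sketch follows. It assumes the file has already
been uncompressed and uses the SWISS-PROT flat-file layout, in which
each entry ends with a line consisting of "//". The function name
count_entries and the sample lines are hypothetical and serve only as
an illustration.

```python
def count_entries(lines):
    """Count flat-file entries by counting '//' terminator lines."""
    return sum(1 for line in lines if line.rstrip() == "//")


# Hypothetical, heavily abbreviated sample with two entries.
sample = [
    "ID   EXAMPLE1     PRELIMINARY;      PRT;     3 AA.\n",
    "SQ   SEQUENCE   3 AA;\n",
    "     MKV\n",
    "//\n",
    "ID   EXAMPLE2     PRELIMINARY;      PRT;     2 AA.\n",
    "SQ   SEQUENCE   2 AA;\n",
    "     MA\n",
    "//\n",
]

print(count_entries(sample))  # 2
```

With a real file, the same function can be applied to an open file
handle, for example count_entries(open("trembl.dat")), since iterating
over a text file yields one line at a time.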


ACCESS/DATA DISTRIBUTION
========================

FTP server:     ftp.ebi.ac.uk/pub/databases/trembl
SRS server:     http://srs.ebi.ac.uk/

TrEMBL is also available on the SWISS-PROT CD-ROM.
SWISS-PROT + TrEMBL is searchable on the FASTA3, BLAST2 and Bic_sw
servers of the EBI.



TrEMBL HAS BEEN PREPARED BY:
============================

Rolf Apweiler, Kirsty Bates, Sergio Contrino, Wolfgang Fleischmann,
Gill Fraser, Henning Hermjakob, Vivien Junker, Youla Karavidopoulou,
Fiona Lang, Michele Magrane, Maria Jesus Martin, Steffen Moeller,
Nicoletta Mitaritonna, Claire O'Donovan and Eleanor Whitfield at the
EMBL Outstation - European Bioinformatics Institute (EBI) in Hinxton,
UK;
Amos Bairoch and Alain Gateau at the Swiss Institute of
Bioinformatics in Geneva, Switzerland.



-----------------------------------------------------------------
Maria Jesus Martin                     email:martin at ebi.ac.uk
EMBL Outstation EBI
(European Bioinformatics Institute)    URL: http://www.ebi.ac.uk
Wellcome Trust Genome Campus           Tel: +44 (1223) 494408
Hinxton                                fax: +44 (1223) 494468
Cambridge
CB10 1SD UK
-----------------------------------------------------------------





