Release 18 of TrEMBL, a protein sequence database supplementing SWISS-PROT

Maria Jesus Martin martin at ebi.ac.uk
Mon Oct 29 20:24:41 EST 2001


INTRODUCTION
============

TrEMBL is a computer-annotated protein sequence database
supplementing the SWISS-PROT Protein Knowledgebase. TrEMBL
contains the translations of all coding sequences (CDS)
present in the EMBL Nucleotide Sequence Database that are not
yet integrated in SWISS-PROT. TrEMBL can be considered a
preliminary section of SWISS-PROT. SWISS-PROT accession numbers
have been assigned to all TrEMBL entries that should eventually
be upgraded to standard SWISS-PROT quality.


RELEASE 18.0 OF TrEMBL
======================

The goal of this TrEMBL release is to achieve synchronization
with the SWISS-PROT Protein Knowledgebase release 40.0.
Therefore, all sequence entries present in SWISS-PROT release
40.0 have been removed from TrEMBL release 18. In addition,
existing TrEMBL entries were further upgraded and some new
entries were incorporated. TrEMBL release 18 contains 558'150
entries and 160'420'778 amino acids.

TrEMBL is split into two main sections, SP-TrEMBL and REM-TrEMBL.
SP-TrEMBL (SWISS-PROT TrEMBL) contains the entries (484'551)
that should eventually be incorporated into SWISS-PROT.
SWISS-PROT accession numbers have been assigned to all
SP-TrEMBL entries.

SP-TrEMBL is organized in subsections:

arc.dat (Archaea):             18384 entries
fun.dat (Fungi):               12481 entries
hum.dat (Human):               22925 entries
inv.dat (Invertebrates):       54665 entries
mam.dat (Other Mammals):        8280 entries
mhc.dat (MHC proteins):         6813 entries
org.dat (Organelles):          41585 entries
phg.dat (Bacteriophages):       3892 entries
pln.dat (Plants):              50806 entries
pro.dat (Prokaryotes):        125274 entries
rod.dat (Rodents):             21335 entries
unc.dat (Unclassified):          135 entries
vrl.dat (Viruses):            108309 entries
vrt.dat (Other Vertebrates):    9667 entries

17'914 new entries have been integrated in SP-TrEMBL. The
sequences of 1'634 SP-TrEMBL entries have been updated and the
annotation has been updated in 120'529 entries.

In the document deleteac.txt, you will find a list of all
accession numbers which were previously present in TrEMBL, but
which have now been deleted from the database.

REM-TrEMBL (REMaining TrEMBL) contains the entries (73'599)
that we do not want to include in SWISS-PROT.
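
The entry counts quoted above can be cross-checked with a few lines of
Python. This is only a minimal sketch: the figures are copied from this
announcement, while the variable and function names are illustrative
assumptions and are not part of any TrEMBL distribution file or tool.

```python
# Entry counts per SP-TrEMBL subsection, copied from this announcement.
# The dictionary and constant names are hypothetical, chosen for illustration.
SP_TREMBL_SUBSECTIONS = {
    "arc.dat": 18384,   # Archaea
    "fun.dat": 12481,   # Fungi
    "hum.dat": 22925,   # Human
    "inv.dat": 54665,   # Invertebrates
    "mam.dat": 8280,    # Other Mammals
    "mhc.dat": 6813,    # MHC proteins
    "org.dat": 41585,   # Organelles
    "phg.dat": 3892,    # Bacteriophages
    "pln.dat": 50806,   # Plants
    "pro.dat": 125274,  # Prokaryotes
    "rod.dat": 21335,   # Rodents
    "unc.dat": 135,     # Unclassified
    "vrl.dat": 108309,  # Viruses
    "vrt.dat": 9667,    # Other Vertebrates
}

SP_TREMBL_TOTAL = 484551   # SP-TrEMBL entries stated in the text
REM_TREMBL_TOTAL = 73599   # REM-TrEMBL entries stated in the text
RELEASE_TOTAL = 558150     # all TrEMBL release 18 entries stated in the text


def counts_are_consistent():
    """Return True if the subsection counts add up to the stated totals."""
    sp_sum = sum(SP_TREMBL_SUBSECTIONS.values())
    return (sp_sum == SP_TREMBL_TOTAL
            and sp_sum + REM_TREMBL_TOTAL == RELEASE_TOTAL)


if __name__ == "__main__":
    print("Counts consistent:", counts_are_consistent())
```

The same check could be repeated against the downloaded subsection files
by counting entries in each one, but that step is not shown here.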


ACCESS/DATA DISTRIBUTION
========================

FTP server:     ftp.ebi.ac.uk/pub/databases/trembl
SRS server:     http://srs.ebi.ac.uk/

TrEMBL is also available on the SWISS-PROT CD-ROM.
SWISS-PROT + TrEMBL is searchable on the following
servers at the EBI:

FASTA3  (http://www.ebi.ac.uk/fasta33/)
BLAST2  (http://www.ebi.ac.uk/blast2/)
Bic_sw  (http://www.ebi.ac.uk/bic_sw/)
Scanps  (http://www.ebi.ac.uk/scanps/)
MPSrch  (http://www.ebi.ac.uk/MPsrch/)

TrEMBL HAS BEEN PREPARED BY:
============================

Rolf Apweiler, Kirsty Bates, Margaret Biswas,
Sergio Contrino, Kirill Degtyarenko, Wolfgang Fleischmann,
Gill Fraser, Henning Hermjakob, Alexander Kanapin,
Youla Karavidopoulou, Paul Kersey, Minna Lehvaslaiho,
Michele Magrane, Maria Jesus Martin, Virginie Mittard,
Nicola Mulder, Claire O'Donovan, John F. O'Rourke,
Eleanor Whitfield and Allyson Williams
at the EMBL Outstation - European Bioinformatics Institute (EBI)
in Hinxton, UK;
Amos Bairoch, Isabelle Phan, Sandrine Pilbout and
Alain Gateau at the Swiss Institute of Bioinformatics in
Geneva, Switzerland.



------------------------------------------------
Maria Jesus Martin                     email:martin at ebi.ac.uk
EMBL Outstation EBI
(European Bioinformatics Institute)    URL: http://www.ebi.ac.uk
Wellcome Trust Genome Campus           Tel: +44 (1223) 494408
Hinxton                                fax: +44 (1223) 494468
Cambridge
CB10 1SD UK
------------------------------------------------



