Release 7 of TrEMBL, the protein sequence database supplementing SWISS-PROT

Rolf Apweiler apweiler at ebi.ac.uk
Fri Aug 7 11:27:36 EST 1998


INTRODUCTION
============

TrEMBL is the protein sequence database supplementing the SWISS-PROT
Protein Sequence Data Bank. TrEMBL contains the translations of all
coding sequences (CDS) present in the EMBL Nucleotide Sequence
Database that are not yet integrated into SWISS-PROT. TrEMBL can be
considered a preliminary section of SWISS-PROT. SWISS-PROT accession
numbers have been assigned to all TrEMBL entries that should
eventually be upgraded to standard SWISS-PROT quality.


RELEASE 7.0 OF TREMBL
=====================

This TrEMBL release is created from the EMBL Nucleotide Sequence
Database release 55 and contains 193'860 sequence entries,
comprising 53'601'062 amino acids.

TrEMBL is split into two main sections, SP-TrEMBL and REM-TrEMBL:

SP-TrEMBL (SWISS-PROT TrEMBL) contains the entries (165'420) that
should eventually be incorporated into SWISS-PROT. SWISS-PROT
accession numbers have been assigned to all SP-TrEMBL entries.

SP-TrEMBL is organized in subsections:

arc.dat (Archaea):              7434 entries
fun.dat (Fungi):                5261 entries
hum.dat (Human):                6976 entries
inv.dat (Invertebrates):       21991 entries
mam.dat (Other Mammals):        2684 entries
mhc.dat (MHC proteins):         3601 entries
org.dat (Organelles):          12699 entries
phg.dat (Bacteriophages):       1604 entries
pln.dat (Plants):              12668 entries
pro.dat (Prokaryotes):         35857 entries
rod.dat (Rodents):              6346 entries
unc.dat (Unclassified):           88 entries
vrl.dat (Viruses):             44561 entries
vrt.dat (Other Vertebrates):    3650 entries

REM-TrEMBL (REMaining TrEMBL) contains the entries (28'440) that we do
not want to include in SWISS-PROT.
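
The entry counts quoted above can be cross-checked with a few lines of
Python. This is only a sketch: the numbers are copied from this
announcement, while the names (SP_TREMBL_SUBSECTIONS,
check_release_counts) are made up for illustration and are not part of
any TrEMBL tool.

```python
# Entry counts copied from the SP-TrEMBL subsection listing in this
# announcement. The variable and function names are hypothetical.
SP_TREMBL_SUBSECTIONS = {
    "arc.dat": 7434,
    "fun.dat": 5261,
    "hum.dat": 6976,
    "inv.dat": 21991,
    "mam.dat": 2684,
    "mhc.dat": 3601,
    "org.dat": 12699,
    "phg.dat": 1604,
    "pln.dat": 12668,
    "pro.dat": 35857,
    "rod.dat": 6346,
    "unc.dat": 88,
    "vrl.dat": 44561,
    "vrt.dat": 3650,
}

SP_TREMBL_TOTAL = 165_420         # SP-TrEMBL entries, as stated above
REM_TREMBL_TOTAL = 28_440         # REM-TrEMBL entries, as stated above
RELEASE_TOTAL = 193_860           # all entries in release 7
RELEASE_AMINO_ACIDS = 53_601_062  # all amino acids in release 7


def check_release_counts():
    """Return (subsections_ok, sections_ok, mean_length).

    subsections_ok: the subsection counts add up to the SP-TrEMBL total.
    sections_ok:    SP-TrEMBL plus REM-TrEMBL equals the release total.
    mean_length:    average number of amino acids per entry.
    """
    subsections_ok = sum(SP_TREMBL_SUBSECTIONS.values()) == SP_TREMBL_TOTAL
    sections_ok = SP_TREMBL_TOTAL + REM_TREMBL_TOTAL == RELEASE_TOTAL
    mean_length = RELEASE_AMINO_ACIDS / RELEASE_TOTAL
    return subsections_ok, sections_ok, mean_length


if __name__ == "__main__":
    subsections_ok, sections_ok, mean_length = check_release_counts()
    print("subsections sum to SP-TrEMBL total:", subsections_ok)
    print("SP-TrEMBL + REM-TrEMBL = release total:", sections_ok)
    print("mean entry length (amino acids): %.1f" % mean_length)
```

Both checks hold for the figures given in this release, and the mean
entry length works out to roughly 276 amino acids.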


WEEKLY UPDATES OF TREMBL AND NON-REDUNDANT DATA SETS
====================================================

Weekly cumulative updates of TrEMBL are available by anonymous FTP and
from the EBI SRS server.
Every week we also produce a complete non-redundant protein sequence
collection, provided as three compressed files in the directory
/pub/databases/sp_tr_nrdb on the EBI FTP server:
sprot.dat.Z, trembl.dat.Z and trembl_new.dat.Z.
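
For those who want to script the weekly download, the following is a
minimal sketch using Python's standard ftplib module. The host,
directory and file names are the ones given in this announcement; the
function names (nrdb_urls, fetch_nrdb) are made up for illustration,
and the sketch assumes the server still accepts anonymous logins and
still uses this layout.

```python
from ftplib import FTP
from pathlib import Path

# Location and file names as given in this announcement.
NRDB_HOST = "ftp.ebi.ac.uk"
NRDB_DIR = "/pub/databases/sp_tr_nrdb"
NRDB_FILES = ("sprot.dat.Z", "trembl.dat.Z", "trembl_new.dat.Z")


def nrdb_urls():
    """Return the FTP URLs of the three non-redundant data set files."""
    return ["ftp://%s%s/%s" % (NRDB_HOST, NRDB_DIR, name)
            for name in NRDB_FILES]


def fetch_nrdb(dest_dir="."):
    """Download the three files by anonymous FTP into dest_dir.

    Hypothetical helper: it assumes the directory layout described
    above is still in place on the server.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with FTP(NRDB_HOST) as ftp:
        ftp.login()        # no arguments means an anonymous login
        ftp.cwd(NRDB_DIR)
        for name in NRDB_FILES:
            with open(dest / name, "wb") as out:
                # Binary transfer, since the files are compressed.
                ftp.retrbinary("RETR " + name, out.write)


if __name__ == "__main__":
    for url in nrdb_urls():
        print(url)
```

Calling fetch_nrdb("nrdb") would place the three files in a local
directory named nrdb. The .Z files are in Unix compress format and can
be unpacked with uncompress or gzip -d.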


ACCESS/DATA DISTRIBUTION
========================

FTP server:     ftp.ebi.ac.uk/pub/databases/trembl
SRS server:     http://srs.ebi.ac.uk/

TrEMBL is also available on the SWISS-PROT CD-ROM.
SWISS-PROT + TrEMBL is searchable on the following servers at the EBI:
FASTA3  (http://www2.ebi.ac.uk/fasta3/)
BLAST2  (http://www2.ebi.ac.uk/blast2/)
Bic_sw  (http://www2.ebi.ac.uk/bic_sw/)
Scanps  (http://www2.ebi.ac.uk/scanps/)
SSearch (http://www2.ebi.ac.uk/ssearch3/)



TREMBL HAS BEEN PREPARED BY:
============================

Rolf Apweiler, Sergio Contrino, Wolfgang Fleischmann, Henning Hermjakob,
Vivien Junker, Stephanie Kappus, Fiona Lang, Michele Magrane, Maria
Jesus Martin, Steffen Moeller, Nicoletta Mitaritonna and Claire
O'Donovan at the EMBL Outstation - European Bioinformatics Institute
(EBI) in Hinxton, UK;
Amos Bairoch and Alain Gateau at the Swiss Institute of Bioinformatics
in Geneva, Switzerland.


=======================================================================
Rolf Apweiler                           | SWISS-PROT Coordinator
EMBL Outstation                         | Email:apweiler at ebi.ac.uk
European Bioinformatics Institute (EBI) | URL:  http://www.ebi.ac.uk
Wellcome Trust Genome Campus, Hinxton   | Tel:  +44 (1223) 494435
Cambridge CB10 1SD, UK                  | Fax:  +44 (1223) 494968
========================================================================






