Second Beta release of TREMBL, a protein sequence database supplementing
the SWISS-PROT Protein Sequence Data Bank
INTRODUCTION
============
TREMBL is a protein sequence database supplementing the SWISS-PROT Protein
Sequence Data Bank. TREMBL contains the translations of all coding
sequences (CDS) present in the EMBL Nucleotide Sequence Database that are
not yet integrated into SWISS-PROT. At the moment we treat TREMBL as an
independent dataset in SWISS-PROT format, but TREMBL will shortly become a
part of SWISS-PROT.
SECOND BETA RELEASE OF TREMBL
=============================
This TREMBL release is created from the EMBL Nucleotide Sequence Database
release 47 and contains 107'065 sequence entries, comprising 28'667'743 amino
acids.
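Since TREMBL is distributed in SWISS-PROT format, figures of this kind can
in principle be recomputed from the flat files. The following is only a
minimal sketch, assuming that every entry ends with a "//" line and that the
sequence lines follow the "SQ" line. The function name and the two sample
entries are invented for illustration and are heavily simplified (real
entries carry many more line types and a fuller SQ line).

```python
from typing import Iterable, Tuple


def count_entries_and_residues(lines: Iterable[str]) -> Tuple[int, int]:
    """Count entries and amino acids in SWISS-PROT-format text.

    Assumptions: each entry is terminated by a line starting with "//",
    and all lines between the "SQ" line and that terminator hold sequence.
    """
    entries = 0
    residues = 0
    in_sequence = False
    for line in lines:
        if line.startswith("//"):
            # End of one entry.
            entries += 1
            in_sequence = False
        elif line.startswith("SQ"):
            # Sequence data starts on the next line.
            in_sequence = True
        elif in_sequence:
            # Sequence lines are grouped in blocks separated by spaces;
            # count only the letters.
            residues += sum(1 for ch in line if ch.isalpha())
    return entries, residues


# Two made-up, simplified entries (not real TREMBL data).
sample = """\
ID   EXAMPLE1    PRELIMINARY;      PRT;    12 AA.
SQ   SEQUENCE   12 AA;
     MKTAYIAKQR QI
//
ID   EXAMPLE2    PRELIMINARY;      PRT;     5 AA.
SQ   SEQUENCE   5 AA;
     MSTNP
//
""".splitlines()

print(count_entries_and_residues(sample))  # (2, 17)
```

To apply this to a downloaded file, an open file handle can be passed in
place of the sample lines.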
TREMBL is split into two main sections, SP-TREMBL and REM-TREMBL:
SP-TREMBL (SWISS-PROT TREMBL) contains the entries (89'883) which should be
incorporated into SWISS-PROT. SP-TREMBL is partially redundant with
SWISS-PROT, since approximately 40'000 SP-TREMBL entries are merely
additional sequence reports of proteins already in SWISS-PROT. We will try
to merge these sequence reports with the existing SWISS-PROT entries for
these proteins as quickly as possible, so as to make SWISS-PROT and TREMBL
completely nonredundant.
SP-TREMBL is organized in subsections:
fun.dat (Fungi): 4939 entries
hum.dat (Human): 5985 entries
inv.dat (Invertebrates): 11476 entries
mam.dat (Other Mammals): 2096 entries
mhc.dat (MHC proteins): 2221 entries
org.dat (Organelles): 6466 entries
phg.dat (Bacteriophages): 928 entries
pln.dat (Plants): 5790 entries
pro.dat (Prokaryotes): 17393 entries
rod.dat (Rodents): 5473 entries
unc.dat (Unclassified): 243 entries
vrl.dat (Viruses): 24595 entries
vrt.dat (Other Vertebrates): 2278 entries
REM-TREMBL (REMaining TREMBL) contains the entries (17'182) that we do
not want to include in SWISS-PROT.
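The figures quoted in this announcement are mutually consistent: the
subsection counts add up to the SP-TREMBL total, and SP-TREMBL plus
REM-TREMBL gives the release total. A short sketch of that arithmetic
follows; the variable names are illustrative only, and the numbers are
copied from the text above.

```python
# Entry counts per SP-TREMBL subsection, as listed in this announcement.
sp_trembl_sections = {
    "Fungi": 4939,
    "Human": 5985,
    "Invertebrates": 11476,
    "Other Mammals": 2096,
    "MHC proteins": 2221,
    "Organelles": 6466,
    "Bacteriophages": 928,
    "Plants": 5790,
    "Prokaryotes": 17393,
    "Rodents": 5473,
    "Unclassified": 243,
    "Viruses": 24595,
    "Other Vertebrates": 2278,
}

# REM-TREMBL entry count, as stated in this announcement.
rem_trembl_entries = 17182

sp_trembl_total = sum(sp_trembl_sections.values())
release_total = sp_trembl_total + rem_trembl_entries

print(sp_trembl_total)  # 89883
print(release_total)    # 107065
```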
ACCESS/DATA DISTRIBUTION
========================
FTP server: ftp.ebi.ac.uk/pub/databases/trembl
TREMBL is also available on the SWISS-PROT CD-ROM.
TREMBL HAS BEEN PREPARED BY:
============================
Rolf Apweiler, Alain Gateau, Vivien Junker, Fiona Lang, Claire O'Donovan,
and Nicoletta Mitaritonna at the EMBL Outstation - European Bioinformatics
Institute (EBI) in Hinxton, UK;
Amos Bairoch at the Medical Biochemistry Department of the University
of Geneva.
========================================================================
Rolf Apweiler                           |
EMBL Outstation                         | Email: apweiler at ebi.ac.uk
European Bioinformatics Institute (EBI) | URL: http://www.ebi.ac.uk
Wellcome Trust Genome Campus, Hinxton   | Tel: +44 (1223) 494435
Cambridge CB10 1SD, UK                  | Fax: +44 (1223) 494968
========================================================================