total number of bases?
embnet at comp.bioz.unibas.ch
Wed May 4 16:16:07 EST 1994
: > I can't seem to find a reliable current estimate of the total
: > number of bases in all the different sequences stored in
: > readily accessible databases (genbank, embl...). Also,
The number of sequences which you have on AEOLUS (your computer running GCG)
should be the following:
GENBANK exclusion set (GENBANK 82 - EMBL 38 with GCG)
and the weekly updates from EMBnet Switzerland
   901  gb_new.seq  - all new GENBANK entries not in the EMBL updates
  7717  xembl.seq   - all really new EMBL entries
  7250  xxembl.seq  - all entries updated by EMBL wrt last release
I wouldn't use the basepair numbers (as mentioned below) for statistics,
though, as the data are based on ACCESSION numbers and therefore
contain a lot of redundancy.
: The size of current releases of GenBank and EMBL is:
: GenBank Release 82 (15 April 1994): 180,589,455 bases; 169,896 sequences;
: EMBL Library Release 38 (March 1994): 179,346,566 bases; 171,787 sequences;
: The NCBI maintains a non-redundant database, updated daily:
: nr Non-redundant PDB+GBUpdate+GenBank+EmblUpdate+EMBL:
: 5:07 AM EDT May 3 1994: 184,980,203 bases; 173,749 sequences;
The HASSLE server of EMBnet Switzerland recalculates a 'nr' for both
proteins and DNA on a weekly basis. Last Saturday (April 30) we had, based
on EMBL with GenBank added, and EMBL updates with GenBank updates added,
slightly less than the figures reported above.
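
To illustrate why totals counted per ACCESSION number come out higher than
'nr' totals, here is a minimal Python sketch. It is not how HASSLE or the
NCBI build their 'nr' sets; it simply assumes that two entries are redundant
when their sequences are identical (ignoring case). The function names and
the toy records are made up for the example.

```python
import hashlib


def raw_totals(records):
    """Count every entry once per accession number, redundant or not."""
    n_seqs = 0
    n_bases = 0
    for _accession, sequence in records:
        n_seqs += 1
        n_bases += len(sequence)
    return n_seqs, n_bases


def non_redundant_totals(records):
    """Count each distinct sequence once (assumption: exact match = redundant)."""
    seen = set()
    n_seqs = 0
    n_bases = 0
    for _accession, sequence in records:
        # Hash the upper-cased sequence so the set stays small for long entries.
        key = hashlib.md5(sequence.upper().encode("ascii")).hexdigest()
        if key in seen:
            continue  # same sequence already counted under another accession
        seen.add(key)
        n_seqs += 1
        n_bases += len(sequence)
    return n_seqs, n_bases


# Toy records (hypothetical accessions and sequences, not real entries).
records = [
    ("X00001", "ACGTACGT"),
    ("M00001", "acgtacgt"),  # same sequence under a second accession
    ("X00002", "GGCC"),
]

raw = raw_totals(records)
nr = non_redundant_totals(records)
print("raw: %d sequences, %d bases" % raw)
print("nr:  %d sequences, %d bases" % nr)
```

A real 'nr' build would also have to deal with overlapping or partially
identical entries, which this exact-match sketch ignores.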
(specifically to Stephane)
Unfortunately, the host you use runs a TCP/IP product (Wollongong) which
doesn't support HASSLE at the moment, but the time may come when you run
UCX, Multinet, or TCPware. If you need an account on the EMBnet Switzerland
UNIX cluster, let me know.
HASSLE is available for most flavours of UNIX and the VMS implementations
of IP mentioned above. Contact us for details - both client and server
modes are supported in full source. Services within EMBnet running via
HASSLE are BLAST, (T)FASTA, PROFILE and S&W search (via the Biocellerator
at EMBnet Israel at Weizmann/Rehovot) and MOWSE (from EMBnet UK at Daresbury).
MEDFETCH is a first SRS-type gateway to Entrez-based Swissprot, and
FETCH gets database entries in GCG format.
| EMBnet SWITZERLAND | RFC embnet at comp.bioz.unibas.ch |
| Biocomputing | (small) FTP and GOPHER server |