------------------------------------------------------------------------------
| EMBL FILE SERVER News Number 2, December 21st 1990 |
| |
| European Molecular Biology Laboratory, Data Library & Computer Group, |
| Postfach 10.2209, 6900 Heidelberg, Germany. |
| E-mail: NetHelp at EMBL.bitnet Tel: +49 6221 387258 Fax: +49 6221 387306 |
------------------------------------------------------------------------------
Contents:
<1> Introduction
<2> New and updated SWISS-PROT entries
<3> Changes to TFD
<4> Common questions
<5> Updates to existing data collections
<6> Updates to software collection
<7> Other updates
<8> Summary of contents of file server
<9> Getting started ?
<1> Introduction
-------------
The EMBL File Server Newsletter summarises changes to the EMBL File Server.
This newsletter and older issues are available from the server (e.g. GET
DOC:EMBL_Server_News.1).
<2> New and updated SWISS-PROT entries
----------------------------------
New SWISS-PROT entries and updates to existing entries are now available in
between regular releases. They are not provided on a daily basis like new
nucleotide entries, but we intend to make at least one or two sets of
new/updated entries available each month. Indices are provided for the
latest full release, and also separate indices for the data new since
then (see HELP PROT for further details).
<3> Changes to TFD
--------------
The files in the TFD directory (D. Ghosh's relational Transcription Factor
Database) contained records with far more than 80 characters per line,
which caused severe problems during mail transfer. We have therefore
encoded the TFD files in "uuencode" format. The C source code for
the decoding program UUD (UUD.C) is available for VMS, UNIX and DOS
machines in the directories VAX_SOFTWARE, UNIX_SOFTWARE and DOS_SOFTWARE.
See HELP SOFTWARE for more details.
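For readers without a C compiler, the decoding step can also be sketched
in Python. This is not the UUD program itself, only a minimal illustration
of reading uuencoded text with the standard binascii module. It assumes the
mail headers may still be present before the "begin" line and that the
encoded data ends with an "end" line; the function name uudecode is
hypothetical.

```python
import binascii


def uudecode(text):
    """Decode uuencoded text; return (file name from 'begin' line, bytes)."""
    lines = iter(text.splitlines())

    # Skip everything (e.g. mail headers) up to the "begin <mode> <name>" line.
    for line in lines:
        if line.startswith("begin "):
            fields = line.split(None, 2)
            name = fields[2] if len(fields) == 3 else ""
            break
    else:
        raise ValueError("no 'begin' line found")

    data = bytearray()
    for line in lines:
        if line.strip() == "end":
            break
        if not line:
            # Skip blank lines rather than passing them to the decoder.
            continue
        data += binascii.a2b_uu(line)
    return name, bytes(data)
```

The returned bytes can then be written to a local file under the returned
name.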
<4> Common questions
----------------
Q: How can I search for keywords, species, authors etc. ?
A: You cannot really do these kinds of searches using the EMBL File
server, and we are somewhat reluctant to build in this capability.
Interactive access to the database is much more likely to provide
satisfactory query ability. The sequence databases are
distributed at very low cost on both CD-ROM and magnetic tape, and
data is also distributed via computer networks to several national
hosts.
The EMBL CD-ROM contains flexible query/retrieval software for MS-DOS
systems, designed to query the sequence databases (EMBL/SWISS-PROT) by
accession numbers, entry names, free text, authors, citations,
database cross-references, feature keys etc. The databases are also
in a format suitable for sequence similarity searches with software
such as FASTA.
EMBL is also involved in a project to form a European Molecular
Biology Network (EMBnet). Several centres have been set up
to run a national biocomputing service: this includes enabling
access to the latest sequence data distributed daily to them by
EMBL. ('GET DOC:EMBNET.DOC' for further details).
Q: Why have I only received the last part of a file ?
A: Some mailers refuse to transfer mail files which exceed a certain size
limit. The BITNET recommendation for the maximum file size is 256 Kbytes
but many mailers have much smaller limits. To accommodate most users,
the file server automatically splits large files into parts of 95 Kbytes.
Any standard editor can be used to remove the mail headers and join the
parts. The individual parts may arrive in random order, but the Subject
line gives you the necessary information to join them in the correct order
(part n of m). Large files travel more slowly through the networks than
small ones, so the last part of a package (usually the smallest) will
arrive first in most cases. Depending on the network traffic, it may take
a few hours or even days until the other parts arrive. If you don't receive
some parts at all, there is probably a computer system between EMBL and you
which does not allow transfer of 95-Kbyte files. There is not much we can
do at EMBL in these cases, but you should check with your local computer
specialists whether there are any local limitations at your site.
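Instead of joining the parts by hand in an editor, the same steps can be
sketched in Python. This is only an illustration: it assumes each saved
mail message carries the text "part n of m" somewhere in its Subject line
(the exact wording of the Subject line is an assumption), and the function
name join_parts is hypothetical.

```python
import email
import re

# Matches e.g. "part 2 of 5" anywhere in the Subject line.
PART_RE = re.compile(r"part\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)


def join_parts(raw_messages):
    """Strip mail headers and join message bodies in 'part n of m' order."""
    if not raw_messages:
        raise ValueError("no messages given")

    bodies = {}
    total = None
    for raw in raw_messages:
        msg = email.message_from_string(raw)
        match = PART_RE.search(msg.get("Subject", ""))
        if match is None:
            raise ValueError("no 'part n of m' in Subject line")
        n, m = int(match.group(1)), int(match.group(2))
        if total is None:
            total = m
        elif m != total:
            raise ValueError("messages belong to different files")
        # The payload is the message text after the header block.
        bodies[n] = msg.get_payload()

    # Refuse to join until every part has arrived.
    missing = [n for n in range(1, total + 1) if n not in bodies]
    if missing:
        raise ValueError(f"missing parts: {missing}")
    return "".join(bodies[n] for n in range(1, total + 1))
```

The messages may be passed in any order. The sketch raises an error while
parts are still outstanding, so it can simply be run again once the
remaining mail has arrived.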
<5> Updates to Existing Data Collections
------------------------------------
The following data collections have been updated recently:
NUC - Release 25 November 1990 of the EMBL Nucleotide Sequence
Database
PROT - Release 16 October 1990 of the SWISS-PROT Protein Sequence
Database
EPD - Release 25 November 1990 of Philipp Bucher's Eukaryotic
Promoter Database
REBase - Release 9012, December 1990, of Rich Roberts' restriction
enzyme database.
ECD - Release 5.0 of Manfred Kroeger's E.coli database
Prosite - Release 6.0 of Amos Bairoch's protein pattern database.
Enzyme - Release 3.0 of Amos Bairoch's enzyme database.
RefList - Release 13.0 of Amos Bairoch's SeqAnalRef database.
TFD - Release 2.0 of David Ghosh's transcription factor database
The directory NUC is continually updated with nucleotide sequence data
from EMBL/GenBank/DDBJ.
<6> Updates to Software Collection
------------------------------
Here is a list of new (N) molecular biological programs or updates (U):
DOS:
-----
COSY.UAA (U) Complete package for enzyme kinetics
(update from 6-Nov-1990) (M. Eberhard)
CREGEX.C (N) Utility program to reformat Prosite for use with
L. Kolakowski's PROSEARCH program (J. Leunissen)
Mac:
----
DNATRANSLATOR.HQX (N) HyperCard stack with utilities for phylogenetic
analyses (D.J. Eernisse)
ENDOCYTOSIS.HQX (N) Calculation of parameters of endocytosis reaction
(R.E. Williams)
LOOPVIEWER.HQX (N) Graphical representation of RNA folding (D. Gilbert)
MACLIGAND.HQX (N) Calculation of parameters of ligand binding
(R.E.Williams)
MACMOLECULE.HQX (N) 3D models of biomolecules (E. Myers et al.)
MACPATTERN.HQX (U) Protein pattern searching with Prosite (v1.2.1)
(R. Fuchs)
MULFOLD.HQX (N) RNA folding prediction (M. Zuker)
RIMANAGER.HQX (N) Analysis of genetic mapping experiments with
recombinant mouse strains (K. Manly)
SPEAKQUENCER.HQX (N) Sequence data entry with acoustic feedback
(C. Fritze)
STUFFIT_16.HQX (N) New version of archiver/binhexer (R. Lau)
UNIX:
-----
BLAST.UAA (U) NCBI's fast database searching package
LIBNCBI.UAA
DFA.UAA
CREGEX.C (N) Utility program to reformat Prosite for use with
L. Kolakowski's PROSEARCH program (J. Leunissen)
SIM.UUE (N) Local similarity searching (G. Huang and W. Miller)
TREEALIG.UAA (U) TreeAlign multiple sequence alignment (J. Hein)
VAX/VMS:
--------
CDACCESS.UAA (U) New version 2.03 of ISO driver software
(P.A. Stockwell)
CREGEX.C (N) Utility program to reformat Prosite for use with
L. Kolakowski's PROSEARCH program (J. Leunissen)
FASTEMBL.COM (U) New version 1.1 of DCL shell for EMBL Mail-FASTA
access (E.L. Sonnhammer)
SCRUTINE.UAA (U) v2 of Scrutineer protein database analysis package
(P. Sibbald)
TREEALIG.UAA (U) TreeAlign multiple sequence alignment (J. Hein)
<7> Other updates
-------------
DOC - EMBnet documentation, October 1990 (DOC:EMBNET.DOC)
- Compilation of available servers for molecular
biology, from Michael Gribskov (DOC:SERVER.TXT).
<8> Summary of Directories of the EMBL File Server
-------------------------------------------
DIR [GENERAL]
Summary of directories available on the EMBL File Server:
Databases:
EMBL Nucleotide Sequence Database NUC
(Rel. 25, Nov 90 + new data from EMBL/GenBank/DDBJ)
Eukaryotic Promoter Database (Rel. 25, Nov 90) EPD
SwissProt Protein Database (Rel. 16, Oct 90) PROT
ProSite pattern database (Rel. 6.0, Nov 90) PROSITE
ENZYME database (Rel. 3, Dec 90) ENZYME
Brookhaven Protein Structure Database (Rel. 53, Jul 90) PROTEINDATA
REBASE, Restriction Enzyme Database (Rel. 9012, Dec 90) REBASE
TFD, Transcription Factor Database (Ver 2.0) TFD
The E.coli Database (Rel. 5, Nov 90) ECD
Drosophila Genetic Map Database (Rel. 3.5, Aug 90) DROSOPHILA
Listing of Molecular Biology Databases, LiMB (Rel. 2.0) LIMB
Sequence analysis bibliography (SEQANALREF Rel. 13.0, Dec 90) REFLIST
Software:
Software for MS-DOS computers DOS_SOFTWARE
Software for Apple Macintosh MAC_SOFTWARE
Software for UNIX UNIX_SOFTWARE
Software for VAX/VMS VAX_SOFTWARE
Other software, GenBank Clearinghouse, etc. MISC_SOFTWARE
Miscellaneous:
Technical documents, submission and order forms, etc. DOC
Multiple DNA sequence alignments and consensus sequences ALIGN
Codon Usage tables CODONUSAGE
<9> Getting Started ?
-----------------
The EMBL File Server is a facility available on the EMBL computing system
for external users to request files by electronic mail. The service is free.
For initial information, send standard electronic mail to the address
NETSERV at EMBL.bitnet
containing just the word HELP on a line by itself. No essays please.
For human contact, send electronic mail to NetHelp at EMBL.bitnet.
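A request message of this kind could be composed as in the following
Python sketch, which uses the standard email package. It only builds the
message and does not send it. The "@" form of the server address, the
placeholder sender address and the function name build_request are all
assumptions made for the example.

```python
from email.message import EmailMessage


def build_request(command, to_addr="NETSERV@EMBL.bitnet",
                  from_addr="you@example.org"):
    """Build a mail message whose body is one file server command."""
    if "\n" in command or not command.strip():
        raise ValueError("command must be a single non-empty line")
    msg = EmailMessage()
    msg["To"] = to_addr        # assumed '@' form of "NETSERV at EMBL.bitnet"
    msg["From"] = from_addr    # placeholder; use your own address
    # The body holds just the command on a line by itself.
    msg.set_content(command.strip() + "\n")
    return msg
```

For example, build_request("HELP") gives the initial help request, and
build_request("GET DOC:EMBNET.DOC") asks for the EMBnet documentation.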