EMBL File Server News No. 2

Fri Dec 21 18:09:00 EST 1990

|  EMBL FILE SERVER News                        Number 2, December 21st 1990 |
|                                                                            |
|  European Molecular Biology Laboratory, Data Library & Computer Group,     |
|  Postfach 10.2209, 6900 Heidelberg, Germany.                               |
|  E-mail: NetHelp at EMBL.bitnet   Tel: +49 6221 387258   Fax: +49 6221 387306 |


<1> Introduction
<2> New and updated SWISS-PROT entries
<3> Changes to TFD
<4> Common questions
<5> Updates to existing data collections
<6> Updates to software collection
<7> Other updates
<8> Summary of contents of file server
<9> Getting started ?

<1> Introduction
    The EMBL File Server Newsletter summarises changes to the EMBL File Server.

    This newsletter and older issues are available from the server (eg. GET

<2> New and updated SWISS-PROT entries
    New SWISS-PROT entries and updates to existing entries are now available in
    between regular releases. They are not provided on a daily basis like new
    nucleotide entries, but we intend to make at least one or two sets of
    new/updated entries available each month. Indices are provided for the
    latest full release, and also separate indices for the data new since
    then (see HELP PROT for further details).

<3> Changes to TFD
    The files in the TFD directory (D. Ghosh's relational Transcription Factor
    Database) contained records with far more than 80 characters per line,
    which caused severe problems during mail transfer. We have therefore
    encoded the TFD files in "uuencode" format. The C source code for
    the decoding program UUD (UUD.C) is available for VMS, UNIX and DOS
    machines in the directories VAX_SOFTWARE, UNIX_SOFTWARE and DOS_SOFTWARE.
    See HELP SOFTWARE for more details.
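
    Where no C compiler is at hand for UUD.C, the same decoding step can be
    sketched in Python. This is only an illustration and not the UUD program
    itself: it assumes the mailed text contains a standard "begin <mode>
    <name>" line, uuencoded lines and a closing "end" line. The function
    name uudecode is hypothetical; binascii.a2b_uu is part of the Python
    standard library.

```python
import binascii


def uudecode(text):
    """Decode uuencoded text and return (file_name, data_bytes).

    Assumes the usual layout: a 'begin <mode> <name>' line, encoded
    lines, then a line reading 'end'. Mail headers before 'begin'
    are skipped.
    """
    lines = iter(text.splitlines())
    for line in lines:
        if line.startswith("begin "):
            _, _mode, name = line.split(" ", 2)
            break
    else:
        raise ValueError("no 'begin' line found")

    chunks = []
    for line in lines:
        if line.strip() == "end":
            break
        if not line.strip():
            continue  # blank line, carries no data
        chunks.append(binascii.a2b_uu(line))
    return name, b"".join(chunks)
```

    The returned bytes can then be written to a file under the returned
    name, which restores the original long records.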

<4> Common questions

    Q: How can I search for keywords, species, authors etc. ?

    A: You cannot really do these kinds of searches using the EMBL File
    Server, and we are somewhat reluctant to build in this capability.
    Interactive access to the database is much more likely to provide
    satisfactory query ability. The sequence databases are
    distributed at very low cost on both CD-ROM and magnetic tape, and
    data is also distributed via computer networks to several national

    The EMBL CD-ROM contains flexible query/retrieval software for MSDOS
    systems that is designed to query the sequence databases
    (EMBL/SWISS-PROT) by accession numbers, entry names, free text, authors,
    citations, database cross-references, feature keys etc. The databases
    are also in a format suitable for sequence similarity searches with
    software such as FASTA.

    EMBL is also involved in a project to form a European Molecular
    Biology Network (EMBnet). Several centres have been set up
    to run a national biocomputing service: this includes enabling
    access to the latest sequence data distributed daily to them by
    EMBL. ('GET DOC:EMBNET.DOC' for further details).

    Q: Why have I only received the last part of a file ?

    A: Some mailers refuse to transfer mail files which exceed a certain size
    limit. The BITNET recommendation for the maximum file size is 256 KBytes
    but many mailers have much smaller limits. To accommodate most users,
    the file server automatically splits large files into parts of 95 Kbytes.
    Any standard editor can be used to remove the mail headers and join the
    parts. The individual parts may arrive in random order, but the Subject
    line gives you the necessary information to join them in the correct order
    (part n of m). Large files travel more slowly through the networks than
    small ones, so the last part of a package, which is usually the smallest,
    arrives first in most cases. Depending on the network traffic, it may take
    a few hours or even days until the other parts arrive. If you don't
    receive some parts at all, there is probably a computer system between
    EMBL and you which does not allow transfer of 95 Kbyte files. There is
    not much we can do at EMBL in these cases, but you should check with your
    local computer specialists whether there are any local limitations at
    your site.
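
    The joining step described above can also be automated. The following
    Python sketch is an illustration only: it assumes each part has been
    saved as a complete mail message and that the Subject line contains the
    words "part n of m" in exactly that form (the precise wording the server
    uses is not given here). The names PART_RE and join_parts are
    hypothetical.

```python
import re
from email import message_from_string

# Assumed Subject wording; the newsletter only says "part n of m".
PART_RE = re.compile(r"part\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)


def join_parts(raw_messages):
    """Join split file-server mails into one text, ordered by Subject."""
    parts = {}
    total = None
    for raw in raw_messages:
        msg = message_from_string(raw)
        match = PART_RE.search(msg.get("Subject", ""))
        if match is None:
            continue  # not one part of a split file
        number, total = int(match.group(1)), int(match.group(2))
        parts[number] = msg.get_payload()  # body only, headers dropped

    if total is None:
        raise ValueError("no 'part n of m' subjects found")
    missing = [n for n in range(1, total + 1) if n not in parts]
    if missing:
        raise ValueError(f"still waiting for parts: {missing}")
    return "".join(parts[n] for n in range(1, total + 1))
```

    Because the parts are keyed by their number, the order of arrival does
    not matter, and a missing part is reported rather than silently skipped.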

<5> Updates to Existing Data Collections
    The following data collections have been updated recently:

    NUC         -  Release 25, November 1990, of the EMBL Nucleotide Sequence
                   Database
    PROT        -  Release 16, October 1990, of the SWISS-PROT Protein Sequence
                   Database
    EPD         -  Release 25, November 1990, of Philipp Bucher's Eukaryotic
                   Promoter Database
    REBase      -  Release 9012, December 1990, of Rich Roberts' restriction
                   enzyme database.
    ECD         -  Release 5.0 of Manfred Kroeger's E.coli database
    Prosite     -  Release 6.0 of Amos Bairoch's protein pattern database.
    Enzyme      -  Release 3.0 of Amos Bairoch's enzyme database.
    RefList     -  Release 13.0 of Amos Bairoch's SeqAnalRef database.
    TFD         -  Release 2.0 of David Ghosh's transcription factor database

    The directory NUC is continually updated with nucleotide sequence data
    from EMBL/GenBank/DDBJ.

<6> Updates to Software Collection
    Here is a list of new (N) molecular biological programs or updates (U):


    COSY.UAA            (U) Complete package for enzyme kinetics
                            (update from 6-Nov-1990) (M. Eberhard)

    CREGEX.C            (N) Utility program to reformat Prosite for use with
                            L. Kolakowski's PROSEARCH program (J. Leunissen)


    DNATRANSLATOR.HQX   (N) HyperCard stack with utilities for phylogenetic
                            analyses (D.J. Eernisse)

    ENDOCYTOSIS.HQX     (N) Calculation of parameters of endocytosis reaction
                            (R.E. Williams)

    LOOPVIEWER.HQX      (N) Graphical representation of RNA folding (D. Gilbert)

    MACLIGAND.HQX       (N) Calculation of parameters of ligand binding

    MACMOLECULE.HQX     (N) 3D models of biomolecules (E. Myers et al.)

    MACPATTERN.HQX      (U) Protein pattern searching with Prosite (v1.2.1)
                            (R. Fuchs)

    MULFOLD.HQX         (N) RNA folding prediction (M. Zuker)

    RIMANAGER.HQX       (N) Analysis of genetic mapping experiments with
                            recombinant mouse strains (K. Manly)

    SPEAKQUENCER.HQX    (N) Sequence data entry with acoustic feedback
                            (C. Fritze)

    STUFFIT_16.HQX      (N) New version of archiver/binhexer (R. Lau)


    BLAST.UAA           (U) NCBI's fast database searching package

    CREGEX.C            (N) Utility program to reformat Prosite for use with
                            L. Kolakowski's PROSEARCH program (J. Leunissen)

    SIM.UUE             (N) Local similarity searching (G. Huang and W. Miller)

    TREEALIG.UAA        (U) TreeAlign multiple sequence alignment (J. Hein)


    CDACCESS.UAA        (U) New version 2.03 of ISO driver software
                            (P.A. Stockwell)

    CREGEX.C            (N) Utility program to reformat Prosite for use with
                            L. Kolakowski's PROSEARCH program (J. Leunissen)

    FASTEMBL.COM        (U) New version 1.1 of DCL shell for EMBL Mail-FASTA
                            access (E.L. Sonnhammer)

    SCRUTINE.UAA        (U) v2 of Scrutineer protein database analysis package
                            (P. Sibbald)

    TREEALIG.UAA        (U) TreeAlign multiple sequence alignment (J. Hein)

<7> Other updates

    DOC         -  EMBnet documentation, October 1990 (DOC:EMBNET.DOC)
                -  Compilation of available servers for molecular
                   biology, from Michael Gribskov (DOC:SERVER.TXT).

<8> Summary of Directories of the EMBL File Server


    Summary of directories available on the EMBL File Server:


    EMBL Nucleotide Sequence Database                              NUC
      (Rel. 25, Nov 90 + new data from EMBL/GenBank/DDBJ)
    Eukaryotic Promoter Database (Rel. 25, Nov 90)                 EPD
    SwissProt Protein Database (Rel. 16, Oct 90)                   PROT
    ProSite pattern database (Rel. 6.0, Nov 90)                    PROSITE
    ENZYME database (Rel. 3, Dec 90)                               ENZYME
    Brookhaven Protein Structure Database (Rel. 53, Jul 90)        PROTEINDATA
    REBASE, Restriction Enzyme Database (Rel. 9012, Dec 90)        REBASE
    TFD, Transcription Factor Database (Ver 2.0)                   TFD
    The E.coli Database (Rel. 5, Nov 90)                           ECD
    Drosophila Genetic Map Database (Rel. 3.5, Aug 90)             DROSOPHILA
    Listing of Molecular Biology Databases, LiMB (Rel. 2.0)        LIMB
    Sequence analysis bibliography (SEQANALREF Rel. 13.0, Dec 90)  REFLIST


    Software for MS-DOS computers                                  DOS_SOFTWARE
    Software for Apple Macintosh                                   MAC_SOFTWARE
    Software for UNIX                                              UNIX_SOFTWARE
    Software for VAX/VMS                                           VAX_SOFTWARE
    Other software, GenBank Clearinghouse, etc.                    MISC_SOFTWARE


    Technical documents, submission and order forms, etc.          DOC
    Multiple DNA sequence alignments and consensus sequences       ALIGN
    Codon Usage tables                                             CODONUSAGE

<9> Getting Started ?
    The EMBL File Server is a facility available on the EMBL computing system
    for external users to request files by electronic mail. The service is free.

    For initial information, send standard electronic mail to the address
    NETSERV at EMBL.bitnet
    containing just the word HELP on a line by itself. No essays please.
    For human contact, send electronic mail to NetHelp at EMBL.bitnet.
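
    As an illustration, a request mail of this kind could be composed in
    Python as sketched below. The helper name build_request and the sender
    address are hypothetical, the server address is written with "@" in
    place of "at", and sending one command per line is an assumption based
    on the HELP example above. Actually sending the message is left to the
    local mail system.

```python
from email.message import EmailMessage


def build_request(commands, sender, server):
    """Build a plain-text request mail with one command per line."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = server
    msg.set_content("\n".join(commands) + "\n")
    return msg


# Hypothetical sender address; HELP is the command named above.
request = build_request(["HELP"], "user@example.org", "NETSERV@EMBL.bitnet")
print(request)
```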
