Announcements of PIR Network Request Service

Mon Aug 3 19:19:32 EST 1992

              Announcements of the Protein Identification Resource
                            Network Request Service

1. PATCHX Supplements PIR with Sequences from Other Databases
2. NRL_3D Release 9.1 Has Feature Information from Brookhaven Data Bank
3. Complimentary CD-ROM Available with ATLAS Multidatabase Retrieval Program
4. New USE FORMAT Server Command Provides Versatile Output
5. GenBank and EMBL Database Sections
6. PIR Network Request Service Command Summary

1. PATCHX Supplements PIR with Sequences from Other Databases

The PATCHX database is produced by MIPS at the Max Planck Institute for
Biochemistry, Martinsried, FRG.  It includes all protein sequences that are
not identical with or contained in sequences from PIR1, PIR2 and PIR3
release 32.2, drawn from the following databases:

  Database   Release  Date  Entries  Code  Description

  MIPSOwn    33.0     6-92   1251    D     MIPS preliminary entries
  PIRMOD     33.0     6-92     32    E     MIPS/PIR preliminary entries
  MIPSH      32.2     6-92     65    F     MIPS yeast entries
  NRL_3D      8.0     3-92    247    R     Brookhaven Data Bank Sequences
  MIPSTrn    33.0     6-92   1130    G     MIPS preliminary translations
  EMTrans    30.0     5-92  12756    H     EMBL automatic translations
  SwissProt  21.0     2-92   1618    I     SwissProt entries
  GenPept    71.0     3-92   4603    J     GenBank automatic translations
  Kabat       5.0     3-92   3567    K     Kabat entries
  PSeqIP      5.0     7-88    956    L     NEWAT
                                     M     PSD
                                     N     PGTrans

All sequences that are IDENTICAL within or between databases are present ONCE.
Duplicate sequences and sequences that were completely contained within others
(subsequences) have been eliminated according to the priority (top to bottom)
in the table above.  The number of entries in the table reflects the number of
entries remaining from that database after elimination of duplicates and
subsequences, not the original number of entries.  PATCHX still contains
numerous inexact duplicates: multiple reports of the same protein that differ
by at least one amino acid residue.  Many of these are cited in merged PIR
entries.  The PIR3, MIPSOwn, PIRMOD and MIPSTrn databases contain preliminary
data that should be used with extreme caution.
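
The elimination rule described above can be sketched in Python.  This is
only an illustration under stated assumptions, not the procedure used by
MIPS: the Entry type, the function name and the sample sequences are
hypothetical, and the sketch assumes that longer sequences are examined
first so that every subsequence meets its container, with ties between
identical sequences going to the database listed higher in the table.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    database: str   # source database name, e.g. "SwissProt"
    entry_id: str   # hypothetical identifier, for illustration only
    sequence: str   # one-letter amino acid sequence


def remove_duplicates_and_subsequences(entries, priority):
    """Keep one copy of each sequence and drop contained subsequences.

    `priority` lists database names from highest to lowest priority
    (top to bottom in the table).
    """
    rank = {name: position for position, name in enumerate(priority)}
    # Longest first, so a container is always seen before its subsequences;
    # among equal lengths the higher-priority database comes first.
    ordered = sorted(entries, key=lambda e: (-len(e.sequence), rank[e.database]))
    kept = []
    for entry in ordered:
        # The substring test covers both identical and contained sequences.
        if any(entry.sequence in other.sequence for other in kept):
            continue
        kept.append(entry)
    # Report the survivors in table (priority) order.
    kept.sort(key=lambda e: (rank[e.database], e.entry_id))
    return kept


priority = ["MIPSOwn", "SwissProt", "GenPept", "Kabat"]
entries = [
    Entry("MIPSOwn", "A1", "MKTAYIAKQR"),
    Entry("SwissProt", "S1", "MKTAYIAKQR"),   # identical to A1
    Entry("GenPept", "G1", "TAYIA"),          # contained in A1
    Entry("Kabat", "K1", "MKTAYIAKQK"),       # differs by one residue
]
survivors = remove_duplicates_and_subsequences(entries, priority)
print([e.entry_id for e in survivors])
```

In this toy input the Kabat entry survives because it differs from the
MIPSOwn entry by one residue, which is the kind of inexact duplicate that
remains in PATCHX.  The pairwise comparison is quadratic in the number of
entries and is meant for illustration only.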

The PATCHX database is available through the PIR Network Request Server,
through the PIR On-Line system and on the ATLAS CD-ROM now being distributed.

Friedhelm Pfeiffer at MIPS wishes to thank Reinhard Doelz and Hans
Ullitz-Moeller for their valuable suggestions in the production of this
database.

2. NRL_3D Release 9.1 Has Feature Information from Brookhaven Data Bank

The NRL_3D Database of sequence information extracted from the Brookhaven
Protein Data Bank (PDB) has been upgraded to release 9.1.  This new version
includes feature annotations extracted from PDB HELIX, SHEET, TURN, SITE, and
SSBOND records along with special ATOM and HETATM records.  New algorithms
have been implemented to construct and name chains and fragments, to recognize
non-standard residues and to discard entries with completely unknown sequence.
NRL_3D release 9.1 corresponds to PDB release 60 (May 1992) and contains
1,380 sequences with 229,099 residues.
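
As a rough illustration of the first step in such an extraction, the Python
sketch below groups the lines of a PDB file by record type.  It relies only
on the PDB convention that the record name occupies the first six columns
of each line; the function name and the placeholder input lines are
hypothetical, and the handling of the special ATOM and HETATM records
mentioned above is not attempted.

```python
from collections import defaultdict

# Record types named in the announcement as sources of feature annotations.
FEATURE_RECORDS = ("HELIX", "SHEET", "TURN", "SITE", "SSBOND")


def collect_feature_records(pdb_lines):
    """Group PDB lines by record name, keeping only the feature records."""
    grouped = defaultdict(list)
    for line in pdb_lines:
        record_name = line[:6].strip()   # columns 1-6 hold the record name
        if record_name in FEATURE_RECORDS:
            grouped[record_name].append(line.rstrip("\n"))
    return dict(grouped)


# Placeholder lines: the text after the record name is NOT real PDB data.
sample_lines = [
    "HEADER (placeholder header fields)",
    "HELIX  (placeholder helix fields)",
    "SSBOND (placeholder disulfide fields)",
    "ATOM   (placeholder coordinate fields)",
]
features = collect_feature_records(sample_lines)
for name, lines in features.items():
    print(name, len(lines))
```

Turning each grouped record into a feature annotation would additionally
require the column layout of that record type, which is not given here.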

The inclusion of this feature information in NRL_3D allows PDB entries to be
recovered through the FEATURE command.  For example the commands
will list all entries in the NRL_3D database with a "type I" turn annotated
in their corresponding PDB entry.

Release 9.1 of NRL_3D is available through the PIR Network Request Server,
through the PIR On-Line Access System and by FTP from the University of Houston
server at in the files

Our thanks to Bill Pearson and Dan Davison for their efforts in providing FTP
access to the PIR databases.

3. Complimentary CD-ROM Available with ATLAS Multidatabase Retrieval Program

A preliminary version of the ATLAS CD-ROM is being distributed on a
complimentary basis as an introduction.  Regular distribution of the
ATLAS CD-ROM is expected to begin in the Fall, coordinated with the quarterly
releases of the PIR-International Protein Sequence Database.  To receive a
complimentary ATLAS CD-ROM, please send your name and complete mailing address

The ATLAS CD-ROM contains the Atlas Retrieval System, the PIR-International
Protein Sequence Database, the GenBank Gene Sequence Database, and several
related databases.  The Atlas Retrieval System (ATLAS) is an information
retrieval system specifically designed to access macromolecular sequence
databases.  It provides simultaneous retrieval from all (or a selected subset)
of these databases.  The Atlas program is currently designed to run on PC/DOS
and VAX/VMS computer systems. Support for UNIX and MAC systems will be added.

The development of the ATLAS program was partially supported by NLM LM05206-09,
by NSF BIR-9107540, and by Digital Equipment Corporation.  The ATLAS program is
copyrighted by the National Biomedical Research Foundation.  The ATLAS of
Protein and Genomic Sequences is a trademark of the National Biomedical
Research Foundation.

The ATLAS program was developed from the NBRF eXperimental Query System (XQS)
and is designed along similar lines.  It does not yet contain some of the
utility functions of the XQS program; these will be added later as
portability permits.

VAX/VMS systems currently do not support direct access to ISO 9660 formatted
CD-ROMs.  The ATLAS CD-ROM may be accessed on VAX/VMS systems by two
methods:

(1) There is an ISO 9660 compliant device driver available from Digital
    Equipment Corporation (DEC) that allows direct access to the CD-ROM
    (driver part number YT-GS001-01).  Please contact your DEC sales
    representative for further information.

(2) There is a public domain utility for accessing ISO 9660 CD-ROMs,
    called CD_ACCESS, written by Peter Stockwell, University of Otago,
    New Zealand, that will allow all the files on the CD-ROM to be copied
    to a magnetic disk drive.  This utility can be obtained from the EMBL
    E-mail server (for further information contact DataLib at EMBL-Heidelberg.DE).
    When copying files using CD_ACCESS, be sure to use the /BINARY qualifier
    to the copy command.

4. New USE FORMAT Server Command Provides Versatile Output

The PIR Network Server now provides a command for changing the default format
of PIR-International database entries.  The default format for PIR entries
conforms to CODATA specifications.  To obtain PIR entries in the format
normally presented by PIR database retrieval programs (PSQ, XQS and ATLAS)
use the command
Subsequent GET commands will then return entries in the ATLAS format.
The command
will cause subsequent GET commands to return entries in the default CODATA
format.

5. GenBank and EMBL Database Sections

To facilitate program access, the GenBank and EMBL databases have been broken
into sections.  GenBank is available in three sections, GB, GBSUP and GBNEW,
and EMBL is available in two sections, EMBL and EMBLSUP.  The GBNEW section
contains the GenBank weekly update entries.  The GBSUP and EMBLSUP sections
contain regular entries held in supplements (presently these are the primate
entries).  All these databases are automatically available through all the
commands that can use them.  Particular databases may be selected with the
USE BASES command.  The command
will select all the GenBank databases, and only those databases, for
subsequent database query and retrieval commands.  The command
will select all the nucleotide sequence databases for subsequent query and
retrieval commands.

6. PIR Network Request Service Command Summary

The National Biomedical Research Foundation Protein Identification Resource
network request service is a full-function fileserver and database query
system.  It has been operating since August 1990 and is capable of handling
database queries, sequence searches and sequence submissions, in addition to
fileserver requests.  To use this server, request commands should be sent to
FILESERV at GUNBRF on BITNET.  The FILESERVer recognizes the following commands
sent either in a mail message, or (if the sender is on BITNET) in command
messages or in a file:

  Command        Action
  -------        -----------------------------------------------
  ACCESSION      list entry codes and titles by accession number
  AND            combine QUERY commands with Boolean AND
  AUTHOR         list entry codes and titles by author
  BASES          list accessible databases
  CROSS          list PIR entry codes and titles corresponding to
                 a particular nucleic sequence database entry
  DEPOSIT        deposit entry for database submission
    END DEPOSIT  terminate deposit entry
  FEATURE        list entry codes and titles by feature table entry
  GENE           list entry codes and titles for a gene name
  GET            return entry by entry code
  HELP           return HELP instructions
  HOST           list entry codes and titles by host species
  INDEX          list SENDable files
  JOURNAL        list entry codes and titles by journal citation
  KEYWORD        list entry codes and titles by keyword
  MEMBER         list alignments containing entry code as a member
  NOT            combine QUERY commands with Boolean NOT
  OR             combine QUERY commands with Boolean OR
  QUERY          begin collecting QUERY commands
    END QUERY    terminate collecting commands and execute QUERY
  QUIT           ignore the remaining text (E-mail signature blocks)
  RETURN         change return address for gateway mail
  SEARCH         search for sequence by FASTA procedure
    END SEARCH   terminate sequence for searching
  SEND           send file
  SPECIES        list entry codes and titles by species
  SUGGEST        leave suggestion or correction for PIR staff
    END SUGGEST  terminate suggestion text
  SUPERFAMILY    list entry codes and titles by superfamily name
  TAXONOMY       report taxonomy for scientific or common name
  TITLE          list entry codes and titles by title
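
Before mailing a request it can help to check it against the command table.
The Python sketch below is a hypothetical client-side helper, not part of
the server.  It assumes one command per line with the command word first,
that block commands are closed by the matching END line shown in the table,
and that USE (described in sections 4 and 5 but not listed in the table) is
accepted.  The argument syntax of each command is not specified here, so
arguments are not checked, and the arguments in the sample request are
placeholders.

```python
# Commands that open a block closed by "END <command>".
BLOCK_COMMANDS = {"DEPOSIT", "QUERY", "SEARCH", "SUGGEST"}

# Single-line commands from the table, plus USE from sections 4 and 5.
COMMANDS = {
    "ACCESSION", "AND", "AUTHOR", "BASES", "CROSS", "FEATURE", "GENE",
    "GET", "HELP", "HOST", "INDEX", "JOURNAL", "KEYWORD", "MEMBER", "NOT",
    "OR", "RETURN", "SEND", "SPECIES", "SUPERFAMILY", "TAXONOMY", "TITLE",
    "USE",
}


def check_request(text):
    """Return a list of problems found in a request message body."""
    problems = []
    open_block = None
    for number, line in enumerate(text.splitlines(), start=1):
        words = line.upper().split()
        if not words:
            continue
        if open_block is not None:
            if words[:2] == ["END", open_block]:
                open_block = None
            elif open_block == "QUERY" and words[0] not in COMMANDS:
                # Assumption: only QUERY blocks hold commands; the other
                # blocks hold free text or sequence data.
                problems.append(f"line {number}: unknown command {words[0]}")
            continue
        if words[0] == "QUIT":
            break   # the rest of the message (e.g. a signature) is ignored
        if words[0] in BLOCK_COMMANDS:
            open_block = words[0]
        elif words[0] not in COMMANDS:
            problems.append(f"line {number}: unknown command {words[0]}")
    if open_block is not None:
        problems.append(f"{open_block} block is missing END {open_block}")
    return problems


good_request = "\n".join([
    "HELP",
    "INDEX",
    "SUGGEST",
    "placeholder suggestion text",
    "END SUGGEST",
    "QUIT",
    "signature block that should be ignored",
])
bad_request = "QUERY\nKEYWORD placeholder\nFOO placeholder"

print(check_request(good_request))
print(check_request(bad_request))
```

The first request passes because everything after QUIT is skipped; the
second is flagged for an unknown command and an unterminated QUERY block.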

