Announcements of the Protein Information Resource

POSTMASTER at NBRF.Georgetown.Edu
Fri Aug 15 19:27:34 EST 1997


               Announcements of the Protein Information Resource
                               PIR-International
                                 6 August 1997

Highlights
1. Availability of PIR-International Release 53.00 and Associated Data Sets
2. Unique Features of the PIR-International Protein Sequence Database
3. The NRL_3D Database Updated
4. Format changes anticipated for Release 54.00
5. Ordering the Atlas of Protein and Genomic Sequences CD-ROM 

1.  Availability of PIR-International Release 53.00 and Associated Data Sets

The quarterly (June 30) releases of the PIR-International Protein Sequence 
Database, the NRL_3D database (corresponding to Brookhaven Protein Data Bank 
Release 79), and the ALN Database of Protein Sequence Alignments are available. 

<PRE>
Release Information for PIR-International Data Sets
==============================================================================
Data Set  Release Entries Residues Description
PIR1       53.00   13706   5125282 Section 1. Classified and Annotated Entries
PIR2       53.00   77275  24432750 Section 2. Annotated Entries
PIR3       53.00    3832    868136 Section 3. Unverified Entries
PIR4       53.00     238     43412 Section 4. Unencoded or Untranslated Entries
NRL_3D     21.00   10717   1897913 NRL Protein Sequences in Brookhaven PDB
ALN        16.00    2896           Database of Protein Sequence Alignments

PATCHX     53.00   96102  27691941 Available protein sequences not in PIR
ECOLI       4.10     592   3996197 Escherichia coli DNA Database
RESID      10.00     236           Residues annotated as features in PIR
------------------------------------------------------------------------------

Availability Information for the Data Sets
=============================================================================
Data Set | ATLAS CD-ROM | Magnetic Media | PIR WWW | PIR FTP Site | Online  |
PIR1     |      X       |       X        |    X    |      X       |    X    |
PIR2     |      X       |       X        |    X    |      X       |    X    |
PIR3     |      X       |       X        |    X    |      X       |    X    |
PIR4     |      X       |       X        |    X    |      X       |    X    |
NRL_3D   |      X       |       X        |    X    |      X       |    X    |
ALN      |      X       |                |    X    |              |         |
PATCHX   |      X       |                |         |              |    X    |
ECOLI    |      X       |                |         |              |         |
RESID    |      X       |                |         |              |         |
-----------------------------------------------------------------------------
</PRE>
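
As a quick arithmetic check on the release table above, the following minimal
Python sketch sums the entry and residue counts for the four sections of the
Protein Sequence Database. The figures are copied from the table; the names
RELEASE_53, totals, and mean_length are illustrative only and are not part of
any PIR software.

```python
# Entry and residue counts for Release 53.00, copied from the release table.
# The dictionary name and the helper functions are illustrative assumptions.
RELEASE_53 = {
    "PIR1": (13706, 5125282),
    "PIR2": (77275, 24432750),
    "PIR3": (3832, 868136),
    "PIR4": (238, 43412),
}


def totals(sections):
    """Return (total entries, total residues) summed over all sections."""
    entries = sum(e for e, _ in sections.values())
    residues = sum(r for _, r in sections.values())
    return entries, residues


def mean_length(sections):
    """Return the mean number of residues per entry over all sections."""
    entries, residues = totals(sections)
    return residues / entries


entries, residues = totals(RELEASE_53)
print(f"PIR1-PIR4: {entries} entries, {residues} residues")
print(f"Mean length: {mean_length(RELEASE_53):.1f} residues per entry")
```

The same helpers can be applied to any subset of the table, for example to
compare the mean entry length of PIR1 with that of PIR2.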

The Complex Carbohydrate Structure Database (CCSD) and associated CarbBank 
program for Windows95/NT are also distributed on the Atlas of Protein and 
Genomic Sequences CD-ROM.

The PIR URL: http://www-nbrf.georgetown.edu/pir/
The PIR anonymous ftp site: nbrf.georgetown.edu

2.  Unique Features of the PIR-International Protein Sequence Database

The PIR-International Protein Sequence Database is unique among comprehensive
public domain protein sequence databases in the following respects:

 * Beginning with Release 53.00, essentially all sequence entries are 
   classified into families (see below).

 * The PIR-International Protein Sequence Database contains more citations
   and more up-to-date data than other such databases.

 * Full citations, including the titles of papers cited, are given.

 * The sequence reported in each citation is represented in a manner that
   clearly shows any differences from the sequence shown in the entry and
   allows the reported sequence to be reconstructed automatically.

 * Cross-references to the nucleotide sequence databases are directly
   associated with the citation on which they are based.

 * The most complete and current genetic information is provided, including
   map position, intron positions, and start codon (if different from ATG),
   along with pointers to genome databases.

 * Feature annotations are represented with greater accuracy and consistency
   because of format and terminology restrictions. The current Guide for PIR
   Features Annotations is publicly available.

 * It has consistently adhered to its announced update schedule. The PIR has
   been updated and publicly released 4 times per year for the last 13 years.

 * Public access is provided through our Web site and online system to the 
   interim updates normally prepared on a weekly basis (except during holiday 
   periods and during preparation of quarterly releases).

Dr. Friedhelm Pfeiffer at MIPS has clustered 93% of the sequences in the PIR
database into families whose members have about 50% or more sequence identity. 
Less than 5% of entries in the database are considered unclassifiable, usually
because they are too fragmentary. Only about 2% of entries are not fully
analyzed. Over 10,000 alignments of the families that contain at least two
sequences are available at the MIPS Web Site (http://www.mips.biochem.mpg.de/).
Every family classified in this way has been assigned a permanent ID. About
half the sequences have been further clustered into superfamilies that have
also been assigned permanent IDs. The assignment of permanent IDs to
superfamilies allows users to keep track of superfamilies more easily. The
permanent family and superfamily numbers will shortly be available on the PIR
Web Site by clicking on "Associated information" when viewing an entry.
This information will be available to VAX-VMS users of the Atlas CD-ROM and 
magnetic tapes.
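
To illustrate what a threshold of "about 50% or more sequence identity" means
for a pair of sequences, here is a minimal Python sketch. It is not the
procedure used at MIPS, which is not described here. It assumes the two
sequences are already aligned, with "-" marking gaps, and it assumes that
identity is counted over the columns in which neither sequence has a gap;
other conventions for the denominator exist. The function names and the
example sequences are invented for illustration.

```python
FAMILY_THRESHOLD = 50.0  # percent; taken from "about 50% or more" in the text


def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes a and b are rows of one alignment (equal length, '-' for gaps).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def same_family(a, b, threshold=FAMILY_THRESHOLD):
    """Return True if the pair meets the identity threshold (a simplification)."""
    return percent_identity(a, b) >= threshold


# Invented example: 9 ungapped columns, 7 of them identical.
seq1 = "MKT-AYIAKQ"
seq2 = "MKTGAYLAKR"
print(f"{percent_identity(seq1, seq2):.1f}% identity")
print("same family:", same_family(seq1, seq2))
```

A pairwise test like this does not by itself define families; grouping many
sequences requires a clustering rule on top of it, which this sketch leaves
out.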


3.  The NRL_3D Database Updated

The NRL_3D Database, produced by the PIR since 1989, is a database of protein
sequences with determined structures extracted from the Brookhaven Protein Data
Bank (PDB) coordinate data files.  It provides an interface between the Protein
Sequence Database and the PDB and provides access to the PDB data via
computerized sequence searching and comparison methods.  This release,
corresponding to the January 1997 release of the Brookhaven Protein Data Bank,
was produced using the facilities of the National Cancer Institute Biomedical
Supercomputer Center, Frederick, MD, and we gratefully acknowledge the
cooperation of J.V. Maizel, Jr., S.K. Burt, and G.W. Smythers.


4.  Format changes anticipated for Release 54.00

We anticipate that there will be some format changes in Release 54. If you want
to receive advance notice of the changes, please E-mail a request for the 
PIR Developer's Bulletin to PIRMAIL at NBRF.GEORGETOWN.EDU


5.  Ordering the Atlas of Protein and Genomic Sequences CD-ROM 

In addition to the databases listed previously, the ATLAS CD-ROM also includes 
an installation guide, an ATLAS User's Guide, the FASTA package, and an 
Installation Manual and Tutorial for CarbBank. The ATLAS program, which 
accesses all of the other data sets on the CD-ROM, does not access the 
Complex Carbohydrate Structure Database.

Orders for the ATLAS CD-ROM are accepted by FAX or E-mail, WITHOUT PREPAYMENT
on institutional purchase orders.  For further information in the US and the
Americas, please contact:
                Kathryn Sidman, Technical Services Coordinator
                      Protein Information Resource (PIR)
                National Biomedical Research Foundation (NBRF)
                           3900 Reservoir Rd., NW
                              Washington DC 20007
                             FAX: (202) 687-1662
                            phone: (202) 687-2121
                     E-mail: PIRMAIL at nbrf.georgetown.edu

In Europe contact:
              Martinsried Institute for Protein Sequences (MIPS)
                    Max-Planck-Institute for Biochemistry
                          8033 Martinsried, Germany
                             FAX:  49 89 8578 2655
                            phone: 49 89 8578 2657
                   E-mail: mewes at ehpmic.mips.biochem.mpg.de

In Asia and Oceania contact:
           Japan International Protein Information Database (JIPID)
                         Science University of Tokyo
                        2669 Yamazaki, Noda 278 Japan
                             FAX:  81 47 122 1544 
                            phone: 81 48 124 1501
                       E-mail: Tsugita at JPNSUT31.BITNET

For information about CarbBank contact:
         CarbBank/CCSD
         114 W. Magnolia St.
         Suite 305
         Bellingham, WA  98225
         Phone:     (360) 671-8134
         EMail:     CarbBank at PacificRim.net

PIR is a registered mark of NBRF
------------------------------------------------------------------------
                                 Dr. John S. Garavelli
                                 Associate Director
                                 Protein Information Resource
                                 National Biomedical Research Foundation
                                 Washington, DC  20007
                                 PIRMAIL at NBRF.GEORGETOWN.EDU



