Announcements of the Protein Information Resource
POSTMASTER at NBRF.Georgetown.Edu
Thu Aug 7 19:11:48 EST 1997
Announcements of the Protein Information Resource
PIR-International
6 August 1997
Highlights
1. Availability of PIR-International Release 53.00 and Associated Data Sets
2. Unique Features of the PIR-International Protein Sequence Database
3. The NRL_3D Database Updated
4. Format changes anticipated for Release 54.00
5. Ordering the Atlas of Protein and Genomic Sequences CD-ROM
1. Availability of PIR-International Release 53.00 and Associated Data Sets
The quarterly (June 30) releases of the PIR-International Protein Sequence
Database, the NRL_3D database (corresponding to Brookhaven Protein Data Bank
Release 79), and the ALN Database of Protein Sequence Alignments are available.
Release Information for PIR-International Data Sets
==================================================================================
Data Set  Release  Entries  Residues  Description
PIR1        53.00    13706   5125282  Section 1. Classified and Annotated Entries
PIR2        53.00    77275  24432750  Section 2. Annotated Entries
PIR3        53.00     3832    868136  Section 3. Unverified Entries
PIR4        53.00      238     43412  Section 4. Unencoded or Untranslated Entries
NRL_3D      21.00    10717   1897913  NRL Protein Sequences in Brookhaven PDB
ALN         16.00     2896            Database of Protein Sequence Alignments
PATCHX      53.00    96102  27691941  Available protein sequences not in PIR
ECOLI        4.10      592   3996197  Escherichia coli DNA Database
RESID       10.00      236            Residues annotated as features in PIR
----------------------------------------------------------------------------------
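As a quick check on the figures above, the combined size of the four sections of the Protein Sequence Database (PIR1 through PIR4) can be totaled directly from the table. The short Python sketch below does only that arithmetic; the variable names are illustrative and are not part of any PIR software.

```python
# Entries and residues for PIR1-PIR4, copied from the release table.
sections = {
    "PIR1": (13706, 5125282),
    "PIR2": (77275, 24432750),
    "PIR3": (3832, 868136),
    "PIR4": (238, 43412),
}

# Sum each column across the four sections.
total_entries = sum(entries for entries, _ in sections.values())
total_residues = sum(residues for _, residues in sections.values())

print("PIR1-PIR4 entries: ", total_entries)   # 95051
print("PIR1-PIR4 residues:", total_residues)  # 30469580
```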
Availability Information for the Data Sets
=============================================================================
Data Set | ATLAS CD-ROM | Magnetic Media | PIR WWW | PIR FTP Site | Online |
PIR1     |      X       |       X        |    X    |      X       |   X    |
PIR2     |      X       |       X        |    X    |      X       |   X    |
PIR3     |      X       |       X        |    X    |      X       |   X    |
PIR4     |      X       |       X        |    X    |      X       |   X    |
NRL_3D   |      X       |       X        |    X    |      X       |   X    |
ALN      |      X       |                |    X    |              |        |
PATCHX   |      X       |                |         |              |   X    |
ECOLI    |      X       |                |         |              |        |
RESID    |      X       |                |         |              |        |
-----------------------------------------------------------------------------
The Complex Carbohydrate Structure Database (CCSD) and associated CarbBank
program for Windows95/NT are also distributed on the Atlas of Protein and
Genomic Sequences CD-ROM.
The PIR URL: http://www-nbrf.georgetown.edu/pir/
The PIR anonymous ftp site: nbrf.georgetown.edu
2. Unique Features of the PIR-International Protein Sequence Database
The PIR-International Protein Sequence Database is unique among comprehensive
public domain protein sequence databases in the following respects:
* Beginning with Release 53.00, essentially all sequence entries are
classified into families (see below).
* The PIR-International Protein Sequence Database contains more citations
and more up-to-date data than other comprehensive public domain protein
sequence databases.
* Full citations, including the titles of papers cited, are given.
* The sequence reported in each citation is represented in a manner that
clearly shows any differences from the sequence shown in the entry and
allows the reported sequence to be reconstructed automatically.
* Cross-references to the nucleotide sequence databases are directly
associated with the citation on which they are based.
* The most complete and current genetic information is provided, including
map position, intron positions, and start codon (if different from ATG),
along with pointers to genome databases.
* Feature annotations are represented with greater accuracy and consistency
because of format and terminology restrictions. The current Guide for PIR
Features Annotations is publicly available.
* The database has consistently adhered to its announced update schedule: it
has been updated and publicly released four times per year for the last 13
years.
* Public access is provided through our Web site and online system to the
interim updates normally prepared on a weekly basis (except during holiday
periods and during preparation of quarterly releases).
Dr. Friedhelm Pfeiffer at MIPS has clustered 93% of the sequences in the PIR
database into families whose members have about 50% or more sequence identity.
Fewer than 5% of entries in the database are considered unclassifiable, usually
because they are too fragmentary. Only about 2% of entries have not yet been
fully analyzed. Over 10,000 alignments of the families that contain at least two
sequences are available at the MIPS Web Site: http://www.mips.biochem.mpg.de/
Every family classified in this way has been assigned a permanent ID. About
half the sequences have been further clustered into superfamilies that have
also been assigned permanent IDs. The assignment of permanent IDs to
superfamilies allows users to keep track of superfamilies more easily. The
permanent family and superfamily numbers will shortly be available on the PIR
Web Site by clicking on "Associated information" when viewing an entry.
This information will also be available to VAX-VMS users of the Atlas CD-ROM and
magnetic tapes.
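To illustrate what grouping sequences into families at roughly 50% identity can mean in practice, here is a minimal Python sketch. It is not the procedure used at MIPS, which this announcement does not describe. It assumes the sequences are already aligned to equal length (gaps written as "-"), measures identity over ungapped columns, and links any two sequences at or above the threshold (single-linkage clustering). The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def cluster_families(aligned, threshold=50.0):
    """Single-linkage clustering: join any two sequences at or above threshold."""
    parent = {name: name for name in aligned}

    def find(name):
        # Follow parent links to the cluster representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(aligned, 2):
        if percent_identity(aligned[a], aligned[b]) >= threshold:
            parent[find(a)] = find(b)

    families = {}
    for name in aligned:
        families.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in families.values())


# Toy, made-up aligned sequences (not taken from the PIR database).
toy = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYLAKQR",
    "seqC": "MKSAYLGKHR",
    "seqD": "GGPWLEEVCN",
}
families = cluster_families(toy)
print(families)
```

With single linkage, a sequence joins a family if it reaches the threshold against any one member, so families can contain pairs below 50% identity; a stricter scheme would require the threshold against every member.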
3. The NRL_3D Database Updated
The NRL_3D Database, produced by the PIR since 1989, is a database of protein
sequences with determined structures extracted from the Brookhaven Protein Data
Bank (PDB) coordinate data files. It provides an interface between the Protein
Sequence Database and the PDB and provides access to the PDB data via
computerized sequence searching and comparison methods. This release,
corresponding to the January 1997 release of the Brookhaven Protein Data Bank,
was produced using the facilities of the National Cancer Institute Biomedical
Supercomputer Center, Frederick, MD, and we gratefully acknowledge the
cooperation of J.V. Maizel, Jr., S.K. Burt, and G.W. Smythers.
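As a rough illustration of extracting sequences from PDB coordinate files, the Python sketch below reads the SEQRES records of a PDB-format file and converts the three-letter residue names to one-letter codes, one sequence per chain. This is an assumption about one possible approach, not the procedure used to build NRL_3D, which is not described here. The function names are hypothetical, and residue names outside the 20 standard amino acids are written as "X".

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def seqres_to_sequences(pdb_text):
    """Return {chain_id: one-letter sequence} built from SEQRES records."""
    chains = {}
    for line in pdb_text.splitlines():
        if not line.startswith("SEQRES"):
            continue
        line = line.ljust(80)              # pad short lines before slicing
        chain_id = line[11]                # column 12; may be blank in old files
        residues = line[19:70].split()     # columns 20-70 hold residue names
        codes = [THREE_TO_ONE.get(name, "X") for name in residues]
        chains.setdefault(chain_id, []).extend(codes)
    return {chain: "".join(codes) for chain, codes in chains.items()}


def seqres_line(serial, chain, total, residues):
    """Build a SEQRES record with fields in the fixed PDB columns (demo helper)."""
    return "SEQRES %3d %s %4d  %s" % (serial, chain, total, " ".join(residues))


# Made-up records for demonstration only.
sample = "\n".join([
    "HEADER    TOY EXAMPLE",
    seqres_line(1, "A", 5, ["MET", "LYS", "THR", "ALA", "TYR"]),
    seqres_line(1, "B", 2, ["GLY", "MSE"]),
])
sequences = seqres_to_sequences(sample)
print(sequences)
```

Fixed-column slicing is used instead of whitespace splitting because the chain identifier can be blank, which would shift the fields in a split-based parser.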
4. Format changes anticipated for Release 54.00
We anticipate that there will be some format changes in Release 54. If you want
to receive advance notice of the changes, please E-mail a request for the
PIR Developer's Bulletin to PIRMAIL at NBRF.GEORGETOWN.EDU
5. Ordering the Atlas of Protein and Genomic Sequences CD-ROM
In addition to the databases listed previously, the ATLAS CD-ROM also includes
an installation guide, an ATLAS User's Guide, the FASTA package, and an
Installation Manual and Tutorial for CarbBank. The ATLAS program, which
accesses all of the other data sets on the CD-ROM, does not access the
Complex Carbohydrate Structure Database.
Orders for the ATLAS CD-ROM are accepted WITHOUT PREPAYMENT on institutional
purchase orders, by FAX or E-mail. For further information in the US and the
Americas, please contact:
Kathryn Sidman, Technical Services Coordinator
Protein Information Resource (PIR)
National Biomedical Research Foundation (NBRF)
3900 Reservoir Rd., NW
Washington DC 20007
FAX: (202) 687-1662
phone: (202) 687-2121
E-mail: PIRMAIL at nbrf.georgetown.edu
In Europe contact:
Martinsried Institute for Protein Sequences (MIPS)
Max-Planck-Institute for Biochemistry
8033 Martinsried, Germany
FAX: 49 89 8578 2655
phone: 49 89 8578 2657
E-mail: mewes at ehpmic.mips.biochem.mpg.de
In Asia and Oceania contact:
Japan International Protein Information Database (JIPID)
Science University of Tokyo
2669 Yamazaki, Noda 278 Japan
FAX: 81 47 122 1544
phone: 81 48 124 1501
E-mail: Tsugita at JPNSUT31.BITNET
For information about CarbBank contact:
CarbBank/CCSD
114 W. Magnolia St.
Suite 305
Bellingham, WA 98225
Phone: (360) 671-8134
EMail: CarbBank at PacificRim.net
PIR is a registered mark of NBRF
------------------------------------------------------------------------
Dr. John S. Garavelli
Associate Director
Protein Information Resource
National Biomedical Research Foundation
Washington, DC 20007
PIRMAIL at NBRF.GEORGETOWN.EDU