Announcements of PIR Network Request Service
POSTMASTER at NBRF.Georgetown.Edu
Tue Apr 13 14:51:33 EST 1993
Announcements of the Protein Information Resource
Network Request Service
Highlights
1. Summaries for PIR-International Release 36, NRL_3D Release 12, ALN Release 4
2. Summary of Database Developments in Release 36.00
3. Second Technical Development Bulletin Available
4. Answer to an FAQ and Changes in PIR Network Request Server Commands
5. PIR Network Request Server Command Summary
Announcements
1. Summaries for PIR-International Release 36, NRL_3D Release 12, ALN Release 4
Release 36.00 of the PIR-International database, Release 12.00 of the NRL_3D
database (corresponding to Brookhaven Protein Data Bank Release 63), and
Release 4.00 of the ALN database of protein sequence alignments are now
available through the PIR On-line system and the Network Request Server.
The PIR1, PIR2, PIR3 and NRL_3D databases have been distributed on tape;
the CD-ROM with those databases and ALN is in production. An appropriate
announcement will be made when the CD-ROM is distributed.
Database  Release  Sequences   Residues  Description
PIR1      36.00       11,252  3,903,802  Classified and Annotated Entries
PIR2      36.00       27,383  7,560,531  Annotated Entries
PIR3      36.00       13,622  4,021,433  Unverified Entries
NRL_3D    12.00        1,630    283,635  Sequences in Brookhaven PDB
ALN        4.00          956  (Entries)  Protein Sequence Alignments
Growth of the PIR databases is documented in the file DBGROWTH.LIS available
through the Network Request Server. The following files are also available
through the Server:
PADD.LIS      PIR1 entries added since Release 35.00
PREV.LIS      PIR1 entries with revised sequences since Release 35.00
SUPERFAM.LIS  superfamilies recorded in PIR1 and PIR2
KEYWORDS.LIS  keywords employed in PIR1 and PIR2
FEATURES.LIS  features cataloged in PIR1 and PIR2
JOURNALS.LIS  recognized journal abbreviations
ALNBASE.LIS   a description of the ALN database
ALNTITLE.LIS  titles in the ALN database
NRLTITLE.LIS  titles in the NRL_3D Database
To obtain these and other files from the PIR Network Request Server, follow the
instructions in the last section of these announcements.
2. Summary of Database Developments in Release 36.00
The three sections, PIR1, PIR2 and PIR3, of the PIR-International Database now
have a uniform format description. Previously some PIR3 entries appeared with
an asterisk in the title and included the comment
*This entry is not verified.
For each such entry, either the issue has been resolved or the comment has
been removed and replaced with a new "Status:" comment. The comment
Status: preliminary
indicates that the sequence and reference information have been verified or
extracted automatically from another database. However, the entry may not
have been reviewed subsequently by a PIR-International staff scientist.
Information extracted from the NCBI Backbone Sequence Database is included in
the PIR-International Database. Initially, some information extracted from
the NCBI data set may not conform to previous PIR standards or conventions.
The database codes "NCBIN:" and "NCBIP:" now appear, indicating, respectively,
a cross-reference to a nucleic sequence and a cross-reference to a protein
sequence (or conceptual translation) from the NCBI Backbone Database. Such
cross-references are followed by the comment
Note: sequence extracted from NCBI backbone
In entries extracted from other databases the comment "Status: preliminary"
additionally indicates that the sequence has not been checked by
PIR-International personnel.
The new molecule type "nucleic acid" has been introduced for those entries
where the molecule type could not be determined.
As a result of collaborations with the Human Genome Data Base (GDB) Center at the
Johns Hopkins University Welch Medical Library, the human sequence entries in
the PIR-International Database have gene names cross-referenced to GDB gene
symbols. In the "Gene name" information the database code "GDB:" before a gene
symbol indicates this cross-reference. Human entries with gene names not
preceded by "GDB:" and those without gene names will be matched in an on-going
joint effort with the Human Genome Data Base.
As a result of collaborations with the National Center for Biotechnology
Information (NCBI), the bibliographic references in the PIR-International
Database have been extensively cross-referenced with the National Library of
Medicine MedLine UIDs. In the "Reference number" information the database
code "MUID:" followed by a reference number indicates the MedLine UID.
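The database codes introduced in this section ("NCBIN:", "NCBIP:", "GDB:" and "MUID:") could be picked out of entry text with a short script. The following is only a minimal sketch in Python: the function name, the sample lines and the identifier values in them are hypothetical, and the sketch does not reproduce the actual PIR entry layout, which is specified in the Technical Development Bulletin.

```python
import re

# Database codes named in this announcement. The sample text further down is
# hypothetical and is not taken from a real PIR entry.
CODES = ("NCBIN", "NCBIP", "GDB", "MUID")
PATTERN = re.compile(r"\b(" + "|".join(CODES) + r"):\s*([A-Za-z0-9_]+)")


def find_cross_references(text):
    """Return (code, value) pairs in the order they appear in text."""
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(text)]


# Hypothetical lines, made up for illustration only.
sample = "Gene name: GDB:ABC1\nReference number: A12345; MUID:93000001"
print(find_cross_references(sample))
```

The pattern assumes that a code is followed directly by a single alphanumeric token; entries that use a different layout would need a different expression.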
3. Second Technical Development Bulletin Available
The second PIR-International Technical Development Bulletin is available in
the file PIRTECH.LIS that can be sent by the PIR Network Request Server or
picked up by anonymous FTP from the UH Gene-Server, ftp.bchs.uh.edu, IP address
129.7.2.43. This electronic bulletin provides detailed specifications of the
database format and serves as an "early warning system" for software developers
and others who are concerned about changes in the format and standards for the
PIR databases. If you are interested in the technical aspects of these
database changes and would like to be placed on the mailing list for the
Technical Bulletin, send a brief electronic mail note to POSTMAST at GUNBRF on
BITNET or to POSTMASTER at NBRF.Georgetown.Edu on Internet.
4. Answer to an FAQ and Changes in PIR Network Request Server Commands
One frequently asked question goes something like
> What is the longest (or shortest) known human (or some other species)
> protein sequence?
Fragments, free amino acids and isopeptides should be eliminated from the
contest for shortest. Then there would have to be a caveat that it might
not be certain whether a particular di- or tripeptide is genetically coded.
Also, for various reasons there is an inherent bias that may limit the
shortest to 3 rather than 2 residues. The shortest human sequence in the
PIR databases is:
length  code    title
     3  GKHU    Growth-modulating peptide - Human
The current longest sequences for various eukaryotes are:
length  code    title
  6805  S20901  Titin - Rabbit
  6048  S07571  Twitchin - Caenorhabditis elegans
  5147  A41087  Cadherin-related tumor suppressor precursor - Fruit fly
                (Drosophila melanogaster)
  5032  A35041  Ryanodine receptor - Human
The PIR Network Request Server will now allow the PIR protein sequence
databases to be queried on the basis of length. The commands
USE LOWER nnn
and
USE UPPER nnn
will set the sequence length lower and upper limits. For example, the
commands
USE LOWER 1300
USE UPPER 1600
will restrict the selection to sequences of 1300 through 1600 residues.
The default unrestricted limits can be restored by using the commands
USE LOWER *
USE UPPER *
The USE LOWER, USE UPPER, USE BEFORE, USE AFTER and USE FORMAT commands
are applicable only to the PIR1, PIR2, PIR3 and NRL_3D databases; these
commands cannot be used with the ALN, GenBank and EMBL databases.
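For users who generate requests from a script, the length-limit commands described above could be produced as follows. This is a sketch only: the helper name length_limit_commands is hypothetical, and the validation it performs reflects an assumption about sensible input rather than documented server behaviour.

```python
def _format_limit(value):
    """Render a limit; None stands for the unrestricted default '*'."""
    return "*" if value is None else str(value)


def length_limit_commands(lower=None, upper=None):
    """Build the USE LOWER / USE UPPER command lines for a request."""
    for value in (lower, upper):
        if value is not None and value < 1:
            raise ValueError("length limits must be positive integers")
    if lower is not None and upper is not None and lower > upper:
        raise ValueError("lower limit exceeds upper limit")
    return [
        "USE LOWER " + _format_limit(lower),
        "USE UPPER " + _format_limit(upper),
    ]


# The example from the text: sequences of 1300 through 1600 residues.
print("\n".join(length_limit_commands(1300, 1600)))
# Restoring the unrestricted defaults.
print("\n".join(length_limit_commands()))
```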
It is now anticipated that with Release 37.00, the "Host" information will
be eliminated in the PIR databases. When this happens, the HOST command for
the PIR Network Request Server will be disabled.
5. PIR Network Request Server Command Summary
The National Biomedical Research Foundation Protein Information Resource
Network Request Server is a full-function fileserver and database query system.
Operating since August 1990, it is capable of handling database queries,
sequence searches and sequence submissions, in addition to fileserver requests.
To use this server, request commands should be sent to
FILESERV at GUNBRF on BITNET or
FILESERV at NBRF.Georgetown.EDU on Internet.
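A request message for the server might be assembled with Python's standard email library, as sketched below. Several points are assumptions rather than documented behaviour: that " at " in the address above stands for "@", that commands are placed one per line in the message body, and that the sender address shown is a placeholder. The sketch only builds the message; handing it to a mail system is left out.

```python
from email.message import EmailMessage

# Assumes " at " in the announcement stands for "@".
SERVER_ADDRESS = "FILESERV@NBRF.Georgetown.EDU"


def build_request(sender, commands):
    """Build a mail message whose body holds one server command per line."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER_ADDRESS
    msg.set_content("\n".join(commands) + "\n")
    return msg


# HELP and INDEX are commands listed in the command summary;
# the sender address is a placeholder.
request = build_request("user@example.org", ["HELP", "INDEX"])
print(request["To"])
print(request.get_content())
```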
The server recognizes the following commands sent either in a mail message
or (if the sender is on BITNET) in a command message or a file:
Command       Action
-------       -----------------------------------------------
ACCESSION     list entry codes and titles by accession number
AND           combine QUERY commands with Boolean AND
AUTHOR        list entry codes and titles by author
BASES         list accessible databases
CROSS         list PIR entry codes and titles corresponding to
              a particular nucleic sequence database entry
DEPOSIT       deposit entry for database submission
END DEPOSIT   terminate deposit entry
FEATURE       list entry codes and titles by feature table entry
GENE          list entry codes and titles for a gene name
GET           return entry by entry code
HELP          return HELP instructions
HOST          list entry codes and titles by host species
INDEX         list SENDable files
JOURNAL       list entry codes and titles by journal citation
KEYWORD       list entry codes and titles by keyword
MEMBER        list alignments containing entry c