PDB atom names no longer begin with atom type!
Mark Israel
misrael at scripps.edu
Tue Nov 22 19:41:17 EST 1994
The Protein Data Bank appears to be "evolving" the PDB format so that
the atom name field no longer necessarily begins with the atom type
(the element symbol). It seems to me that this change will break a lot
of software without being very beneficial. What do others think?
Below is some correspondence on the subject (posted with the authors'
permission). I would encourage people with comments to send them
directly to the Protein Data Bank (pdb at chm.chm.bnl.gov).
misrael at scripps.edu Mark Israel
-----------------------------------------------------------------------
Subject: NAD atom naming
From: Dave Schuller <schuller at indigo2.biomol.uci.edu>
To: pdb at chm.chm.bnl.gov
Date: Mon, 14 Nov 1994 12:35:40 -0800 (PST)
Message-Id: <Pine.3.89.9411141227.C28376-0100000 at indigo2.biomol.uci.edu>
Dear Protein Data Bank:
I shall soon be submitting coordinates for a protein structure which
includes an NAD (Nicotinamide Adenine Dinucleotide) cofactor. I wish to
raise the issue of the naming of atoms in this molecule.
I have a copy of the Protein Data Bank Atomic Coordinate and
Bibliographic Entry Format Description. The cover date says February
1992. This document lays out specifications for all PDB format records,
and atom names for all common residues and cofactors, including NAD.
The atom names for NAD, on p. 35, are in conflict with the stated rules,
and out of line with all other listed entries. Specifically, the first
two characters of the atom name are supposed to comprise the element
symbol, right-justified. Instead, the listed names for atoms in the
nicotinamide half of NAD begin with N, and those in the adenine half
begin with A. The element symbol is the second character of the atom
name.
This means programs which expect to find the element type in columns
13-14 are parsing the NAD as being half nitrogen, and half actinium or
unknown. As you might imagine, this causes problems both in bonding
radius, and in mass for FFT calculations. I know of at least two popular
molecular graphics programs, Setor and Xtalview, which have problems of
this nature.
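The failure described above can be sketched as follows. This is only an
illustration, not the logic of Setor, Xtalview, or any other real program.
The helper names, the shortened element table, and the fallback to the
first character are all assumptions, and the atom names other than AN3 are
hypothetical examples of the N-prefix and A-prefix scheme.

```python
# Illustrative subset of element symbols; a real program would use the full table.
KNOWN_ELEMENTS = {"H", "C", "N", "O", "P", "S", "AC"}


def make_record(serial: int, atom_name: str) -> str:
    """Build the first 16 columns of a HETATM record (hypothetical helper).

    Columns 1-6 hold the record name, 7-11 the serial number, and
    13-16 the atom name, padded on the right to four characters.
    """
    return f"HETATM{serial:5d} {atom_name:<4}"


def guess_element(record: str) -> str:
    """Guess the element from columns 13-14, as a naive reader might."""
    symbol = record[12:14].strip().upper()  # columns 13-14, 1-indexed
    if symbol in KNOWN_ELEMENTS:
        return symbol
    # Assumed fallback: try the first character alone.
    first = symbol[:1]
    return first if first in KNOWN_ELEMENTS else "unknown"


# " C4" follows the stated rule; the other three use the prefix scheme.
for name in (" C4", "NC4", "AN3", "AC4"):
    print(repr(name), "->", guess_element(make_record(1, name)))
```

Under these assumptions, NC4 is read as nitrogen, AN3 as unknown, and AC4
as actinium, while only the right-justified " C4" is read as the carbon it
is meant to be.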
In practice, I have seen several different atom naming schemes for NAD. I
believe the lack of a good, usable standard has contributed to this.
Would it be possible for the PDB to come up with a better atom naming
scheme for NAD which is consistent with its own guidelines?
Thank you,
Dave Schuller
schuller at indigo2.biomol.uci.edu
Dept. of Molecular Biology
University of California - Irvine
-----------------------------------------------------------------------
Subject: Protein Data Bank atom naming
From: Arthur Forman <forman at raptor.pdb.bnl.gov>
To: schuller at indigo2.biomol.uci.edu
Date: Thu, 17 Nov 1994 09:05:44 -0500
Dear Dr Schuller,
I have to agree that the atom naming rules are not always
strictly obeyed, and in NAD for example there are atom names such as AN3.
The allotted field space is confining, especially with larger molecules,
so we at least ensure that, if the first position of the field is used
together with an element whose symbol has only one character, the resulting
two-character combination is not the symbol of any element. Thus there is no
element "AN", and the second character, "N", denotes nitrogen. "AC" was meant
in the same way, with the element being carbon (of adenine), but Ac is in fact
an element symbol; this was an oversight. I am not aware of any
other violations of this type and fortunately we do not as yet have any HET
groups with actinium. With the large number of hydrogens possible in some
molecules (we do get these) we do need to make use of the first field. Once
the atoms are named, the HETATM description for a particular group is uniform
across the whole data set.
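The rule described in this paragraph (a prefix letter plus a one-character
element symbol must not spell a real element symbol) can be checked
mechanically. The following is a minimal sketch, not anything the PDB is
known to use. The function name is hypothetical, the table lists the
symbols through lawrencium, and the names other than AN3 are hypothetical
examples.

```python
# Element symbols for atomic numbers 1 through 103, upper-cased for comparison.
ELEMENT_SYMBOLS = set("""
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar
K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr
Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe
Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu
Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn
Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr
""".upper().split())


def collides_with_element(atom_name: str) -> bool:
    """Return True if the first two characters spell an element symbol.

    Assumes a left-justified name of at least two characters whose first
    character is a prefix letter and whose second is the element symbol.
    """
    return atom_name[:2].upper() in ELEMENT_SYMBOLS


for name in ("AN3", "NC4", "AC4"):
    verdict = "collides" if collides_with_element(name) else "is safe"
    print(name, verdict)
```

With this table, AN3 and NC4 pass the check, and AC4 is flagged because
"AC" matches the symbol for actinium.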
We are in the process of updating our format description and
changes we make will have to be applied to existing as well as new entries.
Needless to say these will affect other existing software packages so we are
proceeding slowly. At present we do have to remain consistent with what we
already have; but your point is well taken.
Yours sincerely,
Arthur Forman
-----------------------------------------------------------------------