Refs for CUBING..... (long reply)

Phil Jeffrey phil at
Mon Apr 3 17:39:38 EST 1995

In article <3lhmdt$ji5 at> dcohen at (Dawn Cohen) writes:

>> Does anyone have pointers to an algorithm known as "CUBING", which I
>> am told is good for calculating distances?

[since nobody else has posted a response in nearly a week, here's by 2c]

Don't have any references about it, but here's some info on the type of
algorithm used.  Basically it's just hashing in 3-D.

The easiest (not really cubing) method is simply to sort your coordinates on
X,Y and Z.  Then for each target atom, you can determine an upper and lower
bound for the X,Y,Z of any contacting atom since atomic contacts are always
less than 5 Angstrom (say).  So you would only need to calculate the accurate
atomic distances for atoms that lie in the range X +/- 5A, Y +/- 5A, Z +/- 5A.
Although you do a floating point difference test on each atomic coordinate,
this does cut down the CPU somewhat.

Full-blown cubing is a refinement of that, where each atom is given an integer
key for each direction, computed by something like:
        ncubex = NINT( (x-xcen)/xcube )
where xcen and xcube are set to something reasonable (CofG of molecule, 5 Ang.
or so cube side).  So each atom has 3 integers describing it's position. Sort
into NX,NY,NZ order, store the indices for the start of each new cube, and
stops, and you have a method to scan the fewest number of atoms possible for
distance checking.  If your target atom has coords (X,Y,Z) and cube coords of
(NX,NY,NZ), the you would search the cubes with values of 
(NX +/- 1, NY +/- 1, NZ +/- 1).  If you kept a record of exactly where each
cube starts and stops, you search over the minimum number of atoms,
corresponding to 27 cubes'-worth of atoms.  Cube size should be at least the
maximum contact distance you want to consider, or larger, depending on the
number of cubes you want to maintain.

>>   Also, are there any pointers on algorithms commonly used for
>>   calculating contacts or distances between sets of atoms.  (e.g.,
>>   given a crystal structure, calculate all the contacts between
>>   solvents and protein, that are between 2.6 and 3.2 A.)

The algorithm is very simple, and my implementation of it doesn't even use
cubing since it runs in a few minutes on an SGI.  The choice of algorithm
seems to vary a lot: some people use an arbitrary cutoff of 4.0 or 4.5 Angstrom,
but I use the method of Sheriff/Hendrickson/Smith (1987) J. Mol. Biol. 197,
273-296; where they define a contact as lying between 0.9-1.11 times the
sum of the atomic VdeW radii.  However, I use extended atom radii, since these
seem more appropriate than the atomic radii used in the original method.  This
slight modification on the original method is mentioned in Sheriff (1993)
Immunomethods 3, 191; but I don't recall anything radical in the way of new
methods employed in the latter paper. Just the new radii.

I use the method because it makes more sense to me than an arbitrary cutoff,
but the upper limit is a grey area for contacts (and hydrogen bonds) so I
wouldn't claim to have the One True Answer on this.  I can, however, supply
a reasonably dumb program for this purpose that runs in an acceptably short
time for a single calculation, given two PDB files. The CCP4 program DISTANG 
will probably do a better job, but I'm not aware of the algorithm that it

*** if anybody has any insights as to better ways to calculate contacts I'd
*** like to hear them, either on here or via email (phil at


Phil Jeffrey

| Phil Jeffrey                                  |                             |
| X-ray/Computer Manager, Crystallography Lab   | If you lie to the compiler, |
| Memorial Sloan-Kettering Cancer Center, NYC   | it will get its revenge     |
| phil at, p-jeffrey at |     - Henry Spencer         |
| Ph: (212) 639 2189   Fax: (212) 717 3066      |                             |

More information about the Xtal-log mailing list