# Coordinates accuracy

Zbyszek Otwinowski zbyszek at mix.swmed.edu
Sun Mar 2 21:50:53 EST 1997

```
Kay Diederichs wrote:
>
> Vincent Escalier (Vincent.Escalier at snv.jussieu.fr) wrote:
> : Hello,
> :
> :  I'd like to have a measure of accuracy of the coordinates in a PDB
> : file. I need a quantitative measure, and resolution doesn't make it.
> : I would need something like 'the x-coordinate of atom #1234 is
> : equal to 0.2345 +- 0.05 A '. Is it possible to have such an
> : information ?
> :
>
> a rule of thumb combining resolution, refinement
> success and individual temperature factors might be:
>
> the estimated coordinate error [in Angstroem] can be approximated by
> 0.1*(R_free/20)*resolution*sqrt(B/20)
>
> where R_free is in percent, resolution in Angstroem and B denotes the
> individual temperature factor of the atom under consideration (from the
> PDB file).
>
> The above formula has to be taken with many grains of salt - e.g. it requires
> that the structure is correct (there are examples of wrong structures in the
> PDB). So don't take the resulting numbers too seriously! Rules of thumb can
> be quite misleading (but may be justified in some cases).
>

I would like to comment that coordinate error has (at least) seven
meanings in protein crystallography.

1.) rms (root mean square) deviation of the atom position in the set of
structures present in the crystal lattice.

2.) error (uncertainty) of atomic position due to data not being
measured precisely. Standard error propagation rules can be used
to estimate this error; however, as this calculation requires
full-matrix refinement, it is rarely done.

3.) error (uncertainty) of atomic position due to the atomic model
not being exactly correct. Current atomic models do not describe
solvent structure or conformational variability well. Due to the
non-linear character of refinement, an error in the model (even in
parts of the molecule other than the one being investigated) produces
error in the atom coordinates.

4.) sum of 1 and 2

5.) sum of 2 and 3

6.) sum of 1 and 3

7.) sum of 1 and 2 and 3

The problem with any formula is that it may not be clear which
one of the seven possibilities it refers to.

Depending on the nature of the problem being studied, the relevant error
is one of the seven mentioned above.

The restrained nature of protein refinement makes the error estimate
very sensitive to the restraints being used. PDB data are generated
with widely different restraints. Typically, temperature factors
are restrained. The sqrt(B) factor is applicable only in
the case when temperature factors are not restrained.

> : I would need something like 'the x-coordinate of atom #1234 is
> : equal to 0.2345 +- 0.05 A '. Is it possible to have such an
> : information ?

I would say that one can make only a very rough estimate of coordinate
error. In particular, I would avoid any inferences based
on precise values of coordinate error. Either a structural analysis is
robust (not dependent on the error model) or it has very little validity.

> : I'd like to have a measure of accuracy of the coordinates in a PDB
> : file. I need a quantitative measure, and resolution doesn't make it.

One should also be aware that data deposited in the PDB are of very
different quality. Be careful with any conclusions based on a quantitative
measure of atomic accuracy in the PDB.

--
Zbyszek Otwinowski                     |        zbyszek at chop.swmed.edu
University of Texas                    |        tel : (214)-648-5098
Southwestern Medical Center            |        fax : (214)-648-5095
5323 Harry Hines Boulevard  Dallas, Texas  75235-9038

```
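
For readers who want to try the rule of thumb quoted in the message above, 0.1*(R_free/20)*resolution*sqrt(B/20), here is a minimal Python sketch. The function name, argument names, and input values are hypothetical and chosen only for illustration; they do not come from the original post. The caveats in the reply still apply: the formula does not say which of the seven meanings of "coordinate error" it estimates, and the sqrt(B) term is applicable only when temperature factors were not restrained.

```python
import math


def rough_coordinate_error(r_free_percent, resolution_A, b_factor_A2):
    """Rule-of-thumb coordinate error in Angstrom.

    Implements 0.1 * (R_free / 20) * resolution * sqrt(B / 20), with
    R_free in percent, resolution in Angstrom, and B the individual
    temperature factor of the atom as listed in the PDB file.
    """
    if r_free_percent < 0 or resolution_A <= 0 or b_factor_A2 < 0:
        raise ValueError("R_free and B must be >= 0, resolution must be > 0")
    return (0.1 * (r_free_percent / 20.0) * resolution_A
            * math.sqrt(b_factor_A2 / 20.0))


# Hypothetical inputs, not taken from any real structure.
estimate = rough_coordinate_error(r_free_percent=25.0,
                                  resolution_A=2.0,
                                  b_factor_A2=20.0)
print(f"rough coordinate error: {estimate:.2f} A")
```

With these made-up inputs the estimate is about 0.25 A; quadrupling B to 80 doubles it, since the dependence on B is a square root. The result is at best a rough guide, not a per-atom standard uncertainty.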