protein data bank

Steven Brenner seb1005 at
Wed May 26 18:10:46 EST 1993

In article <1993May19.120301.14166 at Informatik.TU-Muenchen.DE>, Christoph.Niedermeier at (Christoph Niedermeier) writes:

> I would like to get lists of typical globular and membrane
> proteins which are available in the Brookhaven Protein data bank.
> Each protein considered should represent a large class of similar
> proteins. I want to do statistics on some structural and electrostatic
> features e. g. on the ratio of polar and charged side chain groups.
> Large proteins (>250 residues) are preferred.

You might want to have a look at:

U. Hobohm, M. Scharf, R. Schneider, C. Sander, "Selection Of
Representative Protein Data Sets." _Protein Science_ 1:3 409-417
(1992).  HOBO9201090:Y.

They present a method which uses a simple graph-theory algorithm to
narrow down the database to a set of distinct proteins.  The methods
presented do have a number of problems, but they are useful if you
need a list of individual representative proteins.
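
To give a feel for the kind of graph-based selection involved, here is
a minimal sketch in Python.  It is one simple greedy approach and not
necessarily the exact procedure in the paper: proteins are nodes, an
edge joins any two judged "similar", and the most-connected protein is
dropped repeatedly until no similar pair remains.  The function name,
the identifiers A-D and the similarity pairs are all made up for
illustration; how you decide that two proteins are similar is left
open.

```python
def select_representatives(names, similar_pairs):
    """Greedily drop the most-connected protein until no similar pair is left.

    names         -- iterable of protein identifiers (hypothetical here)
    similar_pairs -- iterable of (a, b) tuples judged "similar" by some
                     criterion chosen by the caller (an assumption)
    """
    neighbors = {name: set() for name in names}
    for a, b in similar_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    remaining = set(neighbors)
    while remaining:
        # Protein with the most similar neighbours still in the set;
        # sorting first makes tie-breaking deterministic (alphabetical).
        worst = max(sorted(remaining),
                    key=lambda n: len(neighbors[n] & remaining))
        if not neighbors[worst] & remaining:
            break  # no similar pairs left
        remaining.discard(worst)
    return sorted(remaining)


# Made-up example: B is similar to both A and C, D is unrelated.
print(select_representatives(["A", "B", "C", "D"],
                             [("A", "B"), ("B", "C")]))
# → ['A', 'C', 'D']
```

Dropping the most-connected protein first tends to keep more
representatives than picking them in arbitrary order, which is why B
is removed here rather than both A and C.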

A list of current proteins in their representative set is available by
ftp; I believe the address is in the paper.


However, for collecting statistics, you most likely wish to "merge"
similar proteins together rather than eliminate all but one
representative of each type of protein.  Otherwise, you end up
throwing away well-nigh 90% of your data!
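
As a rough illustration of what "merging" could mean, here is a
minimal Python sketch of the simplest scheme I can think of: give each
protein a weight of 1/(size of its family), so that every family
contributes equally to the statistic while all of its members are
still used.  This is only a toy.  The function name, the choice of
Asp, Glu, Lys and Arg as the "charged" residues, the sequences and the
family assignments are all assumptions made up for the example.

```python
CHARGED = set("DEKR")  # assumption: Asp, Glu, Lys, Arg count as charged


def weighted_charged_fraction(sequences, families):
    """Mean fraction of charged residues, with each family weighted equally.

    sequences -- dict mapping protein name -> one-letter sequence
    families  -- list of lists of names; each inner list is one group of
                 similar proteins (how they were grouped is up to the caller)
    """
    total = 0.0
    total_weight = 0.0
    for family in families:
        weight = 1.0 / len(family)  # each family sums to a weight of 1
        for name in family:
            seq = sequences[name]
            if not seq:
                continue  # skip empty sequences rather than divide by zero
            fraction = sum(res in CHARGED for res in seq) / len(seq)
            total += weight * fraction
            total_weight += weight
    return total / total_weight


# Made-up data: A and B are similar to each other, C stands alone.
sequences = {"A": "DEKR", "B": "DEAA", "C": "AAAA"}
families = [["A", "B"], ["C"]]
print(weighted_charged_fraction(sequences, families))
# → 0.375
```

An unweighted mean over the same three proteins would give 0.5, since
the two similar proteins A and B would be counted twice over.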

I have developed a sophisticated system for carrying out this sort of
weighting in the generation of statistics about protein structures.
If you would be interested in using it, please contact me at the
address below. 

Hope that this is of some help.


Steven E. Brenner               |  Internet    seb1005 at
Department of Biochemistry      |  JANET       seb1005 at
University of Cambridge         |  Laboratory  +44 223 333671
Tennis Court Road               |  Home        +44 223 314964
Cambridge CB2 1QW, UK           |  Lab Fax     +44 223 333345
