protein data bank
seb1005 at bio.cam.ac.uk
Wed May 26 18:10:46 EST 1993
In article <1993May19.120301.14166 at Informatik.TU-Muenchen.DE>, Christoph.Niedermeier at hegel.physik.uni-muenchen.de (Christoph Niedermeier) writes:
> I would like to get lists of typical globular and membrane
> proteins which are available in the Brookhaven Protein data bank.
> Each protein considered should represent a large class of similar
> proteins. I want to do statistics on some structural and electrostatic
> features, e.g. on the ratio of polar and charged side chain groups.
> Large proteins (>250 residues) are preferred.
You might want to have a look at:
U. Hobohm, M. Scharf, R. Schneider, C. Sander, "Selection of
Representative Protein Data Sets," _Protein Science_ 1(3):409-417 (1992).
They present a method which uses a simple graph-theory algorithm to
narrow down the database to a set of distinct proteins. The methods
presented do have a number of problems, but they are useful if you need
a list of individual representative proteins.
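As a rough illustration of what graph-based pruning of this kind can look
like, here is a minimal Python sketch. It is not taken from the paper and
may differ from the procedure described there: it treats proteins as nodes,
joins two proteins by an edge when they are judged similar, and repeatedly
drops the most-connected protein until no similar pair remains. The function
name, the protein labels, and the idea that the similar pairs have already
been computed by some sequence comparison are all assumptions made for the
example.

```python
def select_representatives(names, similar_pairs):
    """Greedy pruning of a similarity graph (illustrative sketch only).

    names         -- iterable of protein identifiers (hypothetical labels)
    similar_pairs -- iterable of (a, b) pairs already judged "similar"
    Returns a sorted list of proteins in which no two are similar.
    """
    # Build an adjacency map: protein -> set of similar proteins.
    neighbours = {name: set() for name in names}
    for a, b in similar_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    while True:
        # Pick the protein with the most remaining similar neighbours.
        # Iterating in sorted order makes tie-breaking deterministic.
        worst = max(sorted(neighbours),
                    key=lambda n: len(neighbours[n]),
                    default=None)
        if worst is None or not neighbours[worst]:
            break  # no similar pairs are left
        # Remove it and detach it from its neighbours.
        for other in neighbours.pop(worst):
            neighbours[other].discard(worst)

    return sorted(neighbours)


if __name__ == "__main__":
    # Toy input: B is similar to both A and C; D is unrelated.
    print(select_representatives(["A", "B", "C", "D"],
                                 [("A", "B"), ("B", "C")]))
```

In the toy input, removing B alone is enough to leave a set with no similar
pairs, which is why the most-connected protein is dropped first.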
A list of current proteins in their representative set is available by
ftp; I believe the address is in the paper.
However, for collecting statistics, you will most likely want to "merge"
similar proteins together rather than eliminate all but one
representative of a certain type of protein. Otherwise, you end up
throwing away well-nigh 90% of your data!
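One simple way to "merge" rather than discard is to down-weight each protein
by the size of its family, so that every family contributes equally to the
final statistic. The Python sketch below shows that idea only; it is an
assumed scheme for illustration, not a description of any particular
existing system. The family groupings, the made-up sequences, and the choice
of D, E, K and R as the "charged" residues are all assumptions.

```python
CHARGED = set("DEKR")  # assumption: Asp, Glu, Lys, Arg counted as charged


def charged_fraction(sequence):
    """Fraction of residues in a one-letter sequence that are charged."""
    return sum(1 for residue in sequence if residue in CHARGED) / len(sequence)


def family_weights(families):
    """Give each protein weight 1/(family size), so each family sums to 1."""
    weights = {}
    for members in families:
        for name in members:
            weights[name] = 1.0 / len(members)
    return weights


def weighted_mean(values, weights):
    """Weighted average of per-protein values (dict: name -> value)."""
    total_weight = sum(weights[name] for name in values)
    return sum(values[name] * weights[name] for name in values) / total_weight


if __name__ == "__main__":
    # Hypothetical data: three near-identical proteins and one unrelated one.
    families = [["A", "B", "C"], ["D"]]
    sequences = {                 # made-up toy sequences, not real proteins
        "A": "DKAAAAAAAA",        # 2 charged of 10
        "B": "DKAAAAAAAA",
        "C": "DKAAAAAAAA",
        "D": "DEKRAAAAAA",        # 4 charged of 10
    }
    values = {name: charged_fraction(seq) for name, seq in sequences.items()}
    print(weighted_mean(values, family_weights(families)))
```

With these toy values the plain average over four proteins would be pulled
toward the three-member family, while the weighted average treats the two
families equally and still uses every structure.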
I have developed a sophisticated system for carrying out this sort of
weighting in the generation of statistics about protein structures.
If you would be interested in using it, please contact me at the
address below.
Hope that this is of some help.
Steven E. Brenner | Internet seb1005 at mbfs.bio.cam.ac.uk
Department of Biochemistry | JANET seb1005 at uk.ac.cam.bio.mbfs
University of Cambridge | Laboratory +44 223 333671
Tennis Court Road | Home +44 223 314964
Cambridge CB2 1QW, UK | Lab Fax +44 223 333345