Protein folding & computational biology

Louis Hom lhom at OCF.Berkeley.EDU
Wed Jan 10 12:00:30 EST 2001

In article <3A5BC1BB.4663055 at>, bjhenry1  <bjhenry1 at> wrote:
>Some of the more well known ones include IBM's "Blue Gene" project and
>the recent "Folding at home" project.  I also found there's another lesser
>known project called "Folderol".
>From what I've read so far (about the former two at least), is that they
>are distinguishing themselves (aside from one using massive parallelism
>vs the other using distributed computing) on how much "resolution" they
>plan to use to carry out their simulations...

I don't currently run Folding at Home (even though I was asking about it back 
in 11/99) since I'm on a Mac, but I think I can answer this and some of 
the other questions in the thread . . .

By "resolution", I think they're talking about the level of structural 
detail that they're going after (i.e., "resolution" as it's used in 
microscopy and crystallography).  At low resolution, you get to see the 
general form of the protein (lobes, cavities, etc.); at higher resolution 
you see the backbone of the protein and how it's shaped into helices and 
strands; and at even higher resolution you can see the relative positions 
of the amino acid side chains.

Why work on protein folding (beyond the fact that it's a fascinating
challenge)?  The short answer is that it may allow us to engineer and 
design useful proteins and ligands (small molecules that specifically bind 
particular proteins) for medical and industrial purposes.

The longer answer:  It has been shown over and over that most proteins can
be completely unfolded (e.g., by changing pH) and can refold all by
themselves into a single distinct structure when conditions are returned
to normal.  This means that all of the information on how to fold that
particular protein is contained in its amino acid sequence;  no outside
information or assistance is required.  But our current ability to
understand that "information" is limited by our ability to test our

In a given protein structure (either folded or misfolded) there are many
interactions between the amino acids and between the amino acids and the
solvent.  Physics tells us which interactions are favorable and which are
unfavorable, as well as how strong these interactions are.  There are also
energetic contributions from the solvent and how it's structured around
the protein.  These are fairly basic calculations, but there are a lot of
them -- a lot of them for each possible structure (folded or misfolded)  
that we want to assess.  And if you start with a random coil of your
protein and you want to figure out its final structure, you're going to 
have to consider a lot of different structures (right and wrong) along the 

Historically, people have tried to reduce the number of calculations
involved in order to make the data more manageable, for example by 
grouping atoms together to behave as a single body.  There are lots of 
researchers, using different strategies and operating under slightly 
different assumptions.  And so far, nobody has figured out which data are 
critical, and which ones are less important, to the point of being able to 
take an amino acid sequence and precisely predict the three dimensional 
structure of the protein.
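
To make the "grouping atoms together" idea concrete, here is a minimal
Python sketch.  It is only an illustration, not the method of any of the
projects mentioned in this thread:  the function names (coarse_grain,
pair_count), the one-bead-per-residue grouping, the unweighted centroid,
and the toy coordinates are all my own assumptions.  The point is just
how quickly the number of pairwise interactions drops once atoms are
lumped into single bodies.

```python
from collections import defaultdict


def coarse_grain(atoms):
    """Collapse atoms into one bead per residue (hypothetical scheme).

    `atoms` is a list of (residue_index, x, y, z) tuples.  Each bead is
    placed at the unweighted centroid of its residue's atoms; real
    coarse-grained models choose bead positions in other ways too.
    """
    groups = defaultdict(list)
    for res, x, y, z in atoms:
        groups[res].append((x, y, z))

    beads = []
    for res in sorted(groups):
        pts = groups[res]
        n = len(pts)
        cx = sum(p[0] for p in pts) / n
        cy = sum(p[1] for p in pts) / n
        cz = sum(p[2] for p in pts) / n
        beads.append((res, cx, cy, cz))
    return beads


def pair_count(n):
    """Number of distinct pairs among n bodies: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Toy input: 3 residues with 4 atoms each, made-up coordinates.
atoms = [(r, float(r) + 0.1 * a, 0.0, 0.0) for r in range(3) for a in range(4)]
beads = coarse_grain(atoms)

print(pair_count(len(atoms)), "atom pairs ->",
      pair_count(len(beads)), "bead pairs")
```

With 12 atoms there are 66 pairs to evaluate; with 3 beads there are 3.
The price is that whatever detail lived inside each group is gone, which
is exactly the "which data are critical" question.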

By increasing the computing/calculating power, it's possible to more 
precisely assess a greater number of potential structures, hopefully 
increasing the likelihood that you'll correctly predict the final 
structure.  And then you can work on streamlining your folding algorithm 
-- seeing what's most important to consider, what's less important -- 
which in itself will likely be a very informative process.  
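
As a rough sketch of what "assessing potential structures" can look
like, the Python below scores candidate structures by summing a pairwise
energy term and keeps the lowest-scoring one.  Everything here is an
assumption for illustration:  I'm using a plain Lennard-Jones 12-6 term
as a stand-in for the real physics (no solvent, no electrostatics, no
bonded terms), and the names lj_energy and best_structure and the two
toy candidates are made up.  It is not how Blue Gene or Folding at Home
do it.

```python
import math
from itertools import combinations


def lj_energy(coords, epsilon=1.0, sigma=1.0):
    """Sum a Lennard-Jones 12-6 energy over every pair of points.

    `coords` is a list of (x, y, z) tuples.  epsilon and sigma are in
    arbitrary units; this is a stand-in for a real force field.
    """
    total = 0.0
    for a, b in combinations(coords, 2):
        r = math.dist(a, b)           # Euclidean distance (Python 3.8+)
        sr6 = (sigma / r) ** 6
        total += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return total


def best_structure(candidates):
    """Return the name of the candidate with the lowest total energy."""
    return min(candidates, key=lambda name: lj_energy(candidates[name]))


# Two made-up three-body candidates.  In "compact" every pair sits at the
# Lennard-Jones minimum distance, 2**(1/6) * sigma; in "extended" the
# points are strung out along a line, 3 units apart.
d = 2 ** (1 / 6)
candidates = {
    "compact": [(0.0, 0.0, 0.0), (d, 0.0, 0.0),
                (d / 2, d * math.sqrt(3) / 2, 0.0)],
    "extended": [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
}

for name, coords in candidates.items():
    print(name, round(lj_energy(coords), 4))
print("lowest energy:", best_structure(candidates))
```

The loop over pairs is the "fairly basic calculations, but a lot of
them" part:  its cost grows with the square of the number of bodies, and
it has to be repeated for every candidate structure you want to compare.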

It's not that we can't currently determine the structures of proteins --
crystallography and NMR and sometimes mass spec allow us to do that pretty
well already.  But we can't _predict_ structures from amino acid
sequences, and with all of the genomic data surfacing, prediction will become an
increasingly important capability.  The current methods of structure
determination take a lot of time.  Structure prediction can help us
understand how the proteins interact with each other, how they act on
small molecules, how they might be modified to serve some other function
or in some other environment, and how their function might be blocked.

Lou Hom >K'93
lhom at
