Sequence/Structure Comparisons

Scott Presnell srp at cgl.ucsf.edu
Sun Sep 19 21:27:49 EST 1993


smb18 at mbfs.bio.cam.ac.uk (Simon Brocklehurst) writes:

>Barbara Moore writes:
>>"(1) How much better is a sequence/structure comparison than aligning
>>(possibly multiple) sequences to a known-structure sequence?
>>
>>The claim of the sequence/structure matchers is that they can catch
>>structural similarities between proteins whose sequences are very
>>different, and therefore a sequence alignment approach wouldn't work.
>>Most groups tend to give as examples one or a few such matches.
>>Examples: Bowie/Luthy/Eisenberg find such matches between CRP
>>and the cAMP-dependent protein kinase family, and between actin and
>>HSC70.  Jones/Taylor/Thornton find C-phycocyanin like globins.
>>

>  OK, can I clarify my question. How much better is a sequence/structure
>  comparison than doing the following:

>          1)    Take the sequence of the unknown
>          2)    Align all other sequences that can be detected
>                as being homologous
>          3)    Do a "state-of-the-art" secondary structure prediction
>                e.g. the Sander "PHD" approach
>          4)    See what folds are known that could fit the predicted
>                secondary structure


>  A simple (but not very good!) example should illustrate this:

>           Say we do a 2ndry structure prediction that gives
>           four helices.  If we look at all known structures
>           and find that the only time four helices occur is
>           in a 4-helix bundle then we can guess at the fold.


>   Any comments?  

Yes.  While the Eisenberg tools do seem to pull out some similarities, in
our experience they have limits.  In the particular case of four helix
bundles, the Eisenberg profilesearch tool will not pull out one four helix
bundle based on the profile of another.  For instance, you cannot use hGH
to profile search and recognize hIL-2, hIL-4, hEpo etc. within the Z-score
limits suggested in the paper.  In principle, we would like to be able to
find hIL-4 with hGH: they have very similar topologies, but their
sequences are not very similar.
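
To make "within the Z-score limits" concrete: a profile search scores every
database sequence against the profile, and a hit is judged by how many
standard deviations its score lies above the database mean.  A minimal
Python sketch, assuming raw scores are already in hand; the function names
and the cutoff are hypothetical and are not taken from the Eisenberg paper
or its software:

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Convert raw profile-match scores to Z-scores over the whole set."""
    mu = mean(raw_scores.values())
    sigma = pstdev(raw_scores.values())
    if sigma == 0:
        # All scores identical: nothing stands out from the background.
        return {name: 0.0 for name in raw_scores}
    return {name: (score - mu) / sigma for name, score in raw_scores.items()}

def hits(raw_scores, cutoff):
    """Names whose Z-score meets the (hypothetical) cutoff, sorted by name."""
    z = z_scores(raw_scores)
    return sorted(name for name, value in z.items() if value >= cutoff)
```

The failure described above corresponds to the other bundles never
clearing the cutoff when the profile is built from hGH.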

Further, in our use of the profile tools from the Eisenberg group, we were
not able to distinguish between left-handed (correct) and right-handed
(incorrect) forms of hIL-4 model structures when compared to the real
structure (both seemed equally likely to represent the actual structure).

Given the example (4 helix bundles), you might be better off trying the
four step method above, especially if you had a known homologue.  However,
given the recent progress in the a/b barrel recognition work by Eisenberg's
group, I would say why not try both and see what happens.
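
Step 4) of the quoted method can be sketched in a few lines of Python.
This is only an illustration: the per-residue H/E/C string encoding, the
minimum run length, and the two-entry fold library are assumptions made
for the example, not a real fold database.

```python
from itertools import groupby

def count_elements(ss, min_len=4):
    """Count helices (H) and strands (E) in a string such as 'CCHHHHHCCEEEE'.

    Runs shorter than min_len residues are ignored as prediction noise.
    """
    counts = {"H": 0, "E": 0}
    for state, run in groupby(ss):
        if state in counts and len(list(run)) >= min_len:
            counts[state] += 1
    return counts["H"], counts["E"]

# Hypothetical toy library: fold name -> (number of helices, number of strands)
FOLD_LIBRARY = {
    "four-helix bundle": (4, 0),
    "a/b barrel": (8, 8),
}

def candidate_folds(ss, library=FOLD_LIBRARY):
    """Return the folds whose element counts match the predicted string."""
    key = count_elements(ss)
    return [name for name, composition in library.items() if composition == key]
```

Matching on element counts alone says nothing about topology or
handedness, which is where the real difficulty lies.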

	- Scott

P.S.: While this strays from the subject, I would point out that step 4)
listed above is the gotcha.  Guessing at the fold is fine, but that
doesn't lead to specific coordinates.  If you have a structure for a very
similar sequence, you're in fat city, but if you don't there's still work
to be done.  I guessed that hIL-4 was a four helix bundle, but that was the
easy part.  In the case of hIL-4, given a perfect prediction of secondary
structure and using 0th order constraining logic about the manner in which
helices interact, there are some 1.5 million ways to have three pairs of
helices form a bundle in a geometrically reasonable fashion.  After
applying known distance constraints and various other filters, about 1,500
structures are left, but it is still "an exercise to the reader" to figure
out which is correct.
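
The generate-and-filter idea can be sketched as follows.  This is a toy
enumeration, not the procedure actually used for hIL-4: the assumption
that each helix pair takes one of a fixed number of discrete packing
states, and the function names, are placeholders for illustration.

```python
from itertools import product

def enumerate_bundles(states_per_pair, n_pairs=3):
    """Yield every combination of discrete packing states for the helix pairs."""
    return product(range(states_per_pair), repeat=n_pairs)

def apply_filters(candidates, filters):
    """Yield only the candidates that pass every filter predicate."""
    for candidate in candidates:
        if all(check(candidate) for check in filters):
            yield candidate
```

Each filter is a function taking one candidate and returning True or
False; a distance constraint would be one such predicate.  The candidate
count grows as the cube of the number of states per pair, so the filters
have to do most of the work.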
 
-- 

	- Scott Presnell (srp at cgl.ucsf.edu)


