David Maddison wrote:
> So has anyone written about possible methods of coping with
> such problems?
As far as I know, there is *NO* extant formal algorithm
for distinguishing monophyletic from polyphyletic groups
of highly-divergent-with-trace-similarity protein sequences.
A closely related problem is that of telling whether motifs
in disparate proteins are all from a common progenitor or
evolved convergently: this, also, has no formal algorithm
that I know of.
Here are some things you might try, though:
  - See if the proteins have similar hydrophobic profiles
    in the regions that you think are conserved.
  - See if the proteins are predicted by the PHD neural
    network to have similar secondary structures in [blah blah].
  - Try doing multiple alignment with MACAW, which has
    very powerful methods of detecting regions of similarity (by
    Gibbs sampling) and a really good color graphic interface that
    allows intuitive assessment of which regions of an alignment
    are good, and which are poor.
A method for comparing hydrophobic profiles was published by
Sweet and Eisenberg in PNAS around 1984; PHD and MACAW were
described in multiple papers over the last few years.
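
To give a rough idea of what comparing hydrophobic profiles can
look like, here is a minimal Python sketch. It is not the published
Sweet and Eisenberg procedure: I am assuming the Kyte-Doolittle
hydropathy scale, a sliding window of 7 residues, and a plain
Pearson correlation as the similarity score, and the function names
are made up for illustration. It also assumes you have already cut
out two candidate regions of equal length with no gaps.

```python
from math import sqrt

# Kyte-Doolittle hydropathy values (an assumed choice of scale;
# other hydrophobicity scales could be swapped in here).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Return the sliding-window mean hydropathy along seq."""
    values = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    if len(values) < window:
        raise ValueError("sequence is shorter than the window")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys):
        raise ValueError("profiles must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a flat profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def profile_similarity(seq_a, seq_b, window=7):
    """Correlate the hydropathy profiles of two ungapped regions."""
    return pearson(
        hydropathy_profile(seq_a, window),
        hydropathy_profile(seq_b, window),
    )
```

A score near 1 means the two regions rise and fall in hydropathy
together; a score near 0 means no obvious relationship. A high score
by itself does not tell you whether the similarity reflects common
ancestry or convergence.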
If you'd like more details of these, e-mail me at
schwarz at seqaxp.caltech.edu
Good luck!
--Erich Schwarz