I'm interested in improving empirically based amino acid
substitution models for use in phylogenetic analysis.
One approach that may improve upon existing models (such as
the Dayhoff models used by programs like PROTDIST and
PROTML) is to determine whether the propensity
of an amino acid (say X) to change to another amino acid
(Y) depends upon the structural context in which it is
found. For instance, it may be the case that the
transition X<-->Y is very frequent in a secondary
structural element like an alpha helix, but is very
rare in a beta-sheet or a beta-turn. Similarly,
a hydrophobic amino acid (let's call it Z) may
frequently change to another hydrophobic amino acid (say Q;
keep in mind that I'm not using the one-letter amino acid code)
in a core region of a protein, while a change to a charged
or "hydrophilic" amino acid in this region may be very rare.
The dynamics of change are likely to be very different
in a solvent-accessible region of a protein.
Thus it is possible that "averaged" substitution models
like that of Dayhoff may be very poor approximations of
the actual site-by-site frequencies, simply because they
are averages of many dissimilar models.
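As a toy illustration of the averaging problem, the sketch below
mixes two context-specific matrices. The numbers are invented, a
three-state alphabet stands in for the 20 amino acids, and the
names `helix`, `sheet` and `average` are hypothetical.

```python
# Toy illustration only: these probabilities are invented, not estimated
# from any data set. States 0, 1, 2 stand in for three amino acids.
helix = [[0.90, 0.09, 0.01],
         [0.09, 0.90, 0.01],
         [0.01, 0.01, 0.98]]
sheet = [[0.90, 0.01, 0.09],
         [0.01, 0.98, 0.01],
         [0.09, 0.01, 0.90]]

def average(a, b, weight_a=0.5):
    """Element-wise weighted mean of two row-stochastic matrices."""
    return [[weight_a * x + (1.0 - weight_a) * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

averaged = average(helix, sheet)

# The change 0 -> 1 has probability 0.09 in the helix matrix and 0.01 in
# the sheet matrix; the averaged matrix assigns a value in between, which
# matches neither context.
print(helix[0][1], sheet[0][1], averaged[0][1])
```

Under these assumed values the averaged matrix describes neither
context well, which is the concern raised above.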
If this is the case, it should be of interest
to develop structure-based amino acid substitution matrices
for a whole variety of proteins for which one of the homologues
has been crystallized (making the assumption that the structures
have not changed too much). Doing this may allow us to develop
general substitution matrices for structural elements and to
test whether the different matrices are significantly
different from one another, and in what way (this may in turn
help those interested in protein engineering).
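A minimal sketch of the counting step behind such matrices,
assuming a pairwise alignment of two homologues and a per-column
structure string taken from the crystallized one. The function
name `count_by_context`, the labels "H" (helix) and "E" (strand),
and the example sequences are all hypothetical.

```python
from collections import Counter, defaultdict

def count_by_context(seq_a, seq_b, structure):
    """Tally unordered residue pairs separately for each structural label.

    seq_a, seq_b: aligned sequences of equal length ('-' marks a gap).
    structure:    one label per alignment column (assumed annotation).
    """
    if not (len(seq_a) == len(seq_b) == len(structure)):
        raise ValueError("alignment rows and structure string must have equal length")
    counts = defaultdict(Counter)
    for a, b, label in zip(seq_a, seq_b, structure):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        pair = tuple(sorted((a, b)))  # treat X<-->Y as undirected
        counts[label][pair] += 1
    return counts

# Invented example data, for illustration only.
seq_a = "ALKVEG"
seq_b = "AIKLDG"
structure = "HHHEEE"

counts = count_by_context(seq_a, seq_b, structure)
for label in sorted(counts):
    print(label, dict(counts[label]))
```

The per-context count tables produced this way could then be
normalized into matrices, or compared with a contingency-table
test to ask whether they differ significantly.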
The application of these models to protein phylogeny would
only require the user to provide structural information in addition
to an alignment.
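In a program, this could amount to choosing the matrix for each
alignment column from its structural label. The sketch below is
only that: `pair_log_prob` and the `matrices` layout are
hypothetical, the probabilities are invented, and it sums
log P(b | a) per column rather than computing a full tree
likelihood.

```python
import math

def pair_log_prob(seq_a, seq_b, structure, matrices):
    """Sum of log P(b | a) over columns, using the matrix for each column's label.

    matrices: dict mapping a structural label to a nested dict P[a][b]
              (assumed layout, for illustration).
    """
    total = 0.0
    for a, b, label in zip(seq_a, seq_b, structure):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        total += math.log(matrices[label][a][b])
    return total

# Invented two-letter matrices: changes are rare in "H", common in "E".
matrices = {
    "H": {"A": {"A": 0.9, "V": 0.1}, "V": {"A": 0.1, "V": 0.9}},
    "E": {"A": {"A": 0.6, "V": 0.4}, "V": {"A": 0.4, "V": 0.6}},
}

log_prob = pair_log_prob("AAVV", "AVVA", "HHEE", matrices)
print(log_prob)
```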
Is someone already doing this? Does anyone know of references
to any such investigation?
I'd be interested to hear what those involved in the development of
phylogenetic methods have to say about this idea.
Cheers
Andrew Roger
Dept. of Biochemistry
Halifax, N.S.
B3H 4H7
aroger at ac.dal.ca