Protein sequence saturation
btf at t10.lanl.gov
Tue Mar 31 14:28:19 EST 1998
> Dear Brian,
> Thank you for your message.
> I was talking about saturations in terms of observed differences between
> protein sequences. For example, plotting observed percent differences
> (identities or similarities?) vs evolutionary distances one can see a
> plateau at high values of distances (over 250 PAMs). This is the way I
> know to assess saturation between sequences. I would like to know about
> other methods.
With protein sequences alone, there should be no way to
tell. A protein such as Elongation Factor 2 (which is absolutely
critical for cell survival and interacts with many ribosomal
proteins and other proteins) has a very high degree of sequence
conservation because many sites are critical for function. Thus
this protein might be 80 or 90% identical between two species,
even though their last common ancestor lived hundreds of millions
of years ago. Other proteins are more free to change
their amino acid sequence.
With the DNA sequence, we can look at silent site
mutations and see if they have been saturated. It must
be kept in mind that even at full saturation, there may be
much more than 25% of the sites identical. Factors which
can cause this include codon preference, conservation
of the CG:AT ratio, mRNA secondary structure, and elimination of
CpG dinucleotides.
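To put a rough number on the "more than 25%" point: if two saturated
sites are treated as independent draws from the same base composition,
the chance that they match is the sum of the squared base frequencies.
The sketch below assumes exactly that simple model, and the skewed
frequencies are made up for illustration. It ignores codon preference,
mRNA structure and CpG effects, which would raise identity further.

```python
def expected_identity_at_saturation(freqs):
    """Chance that two independent draws from one base composition match.

    freqs maps each base to its frequency (or count); values are
    normalised here, so they need not sum to exactly 1.
    """
    total = sum(freqs.values())
    return sum((f / total) ** 2 for f in freqs.values())


# Equal base frequencies give the familiar 25% floor.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

# Hypothetical GC-rich silent sites (illustrative numbers only).
skewed = {"A": 0.10, "C": 0.40, "G": 0.40, "T": 0.10}

print(round(expected_identity_at_saturation(uniform), 3))
print(round(expected_identity_at_saturation(skewed), 3))
```

With the made-up GC-rich composition the expected identity comes out
near 34%, so a silent-site identity of that order would not by itself
show that the sites are unsaturated.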
The synonymous:nonsynonymous mutation ratio can be used
as a measure of selective pressure on a protein.
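As a crude illustration of the synonymous versus nonsynonymous idea,
the sketch below walks two codon-aligned coding sequences and counts
the codon pairs that differ without changing the amino acid and those
that do change it. The function name and the example sequences are
hypothetical. These are raw counts only; a proper estimate also
normalises by the number of synonymous and nonsynonymous sites and
corrects for multiple hits, which this does not attempt.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def count_codon_changes(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codon pairs.

    Both sequences must be in frame, the same length, and aligned codon
    by codon. Codons with gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# Toy example: Met-Ala-Lys-Gly versus Met-Ala-Arg-Gly.
print(count_codon_changes("ATGGCTAAAGGT", "ATGGCCAGAGGA"))
```

In the toy pair, two codons differ silently and one changes Lys to Arg.
A strongly conserved protein would be expected to show many silent
changes relative to replacement changes.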
With only protein sequences, one might guesstimate whether
the sequences shared a common ancestor so long ago that all
codons have been hit with mutations many times, by looking
for regions that are no longer similar in sequence. Even if
some regions of the protein are 80 or 90% identical due to
selective pressure to maintain the sequence, there may be other
regions that are free to evolve. One must be careful, though:
recombination that brings in an exon from an unrelated gene could
cause the same thing.
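One way to look for such regions is to slide a window along a pairwise
protein alignment and report percent identity in each window. This is
only a sketch: the function name, window size and step are arbitrary
choices for illustration, and the example sequences are invented. It
assumes the two sequences are already aligned, with "-" for gaps.

```python
def windowed_identity(aln_a, aln_b, window=20, step=10):
    """Return (start, identity) for each window of a pairwise alignment.

    Identity is the fraction of matching residues among the columns of
    the window where neither sequence has a gap, or None if every
    column in the window contains a gap.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    results = []
    last_start = max(len(aln_a) - window, 0)
    for start in range(0, last_start + 1, step):
        columns = [
            (x, y)
            for x, y in zip(aln_a[start:start + window],
                            aln_b[start:start + window])
            if x != "-" and y != "-"
        ]
        if columns:
            identity = sum(x == y for x, y in columns) / len(columns)
        else:
            identity = None
        results.append((start, identity))
    return results


# Invented example: a conserved first half, a fully diverged second half.
seq_a = "AAAAAAAAAA" + "CDEFGHIKLM"
seq_b = "AAAAAAAAAA" + "WWWWWWWWWW"
for start, identity in windowed_identity(seq_a, seq_b, window=10, step=10):
    print(start, identity)
```

Windows that sit near the identity expected for unrelated sequences,
next to windows at 80 or 90%, are the candidates for freely evolving
regions, subject to the caveat about recombination above.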