Protein composition convergence

A Mitchell am16 at gpu.srv.ualberta.ca
Wed Sep 22 19:59:11 EST 1999


Peter Foster has a recent paper in JME showing that analyses of protein
sequences can be affected by base composition biases in the DNA sequence.
I believe he now works in the same building as you (or does he only start
in October?).  Speak to him!!!



                                                \   /
Andrew Mitchell                        _____     \ /     _____
Department of Biological Sciences     /     `)_  O^O  _(`     \
CW-405 Biological Sciences Building  /         \( = )/         \
University of Alberta               (           ( = )           )
Edmonton, T6G 2E9                    <---------//_=_\\--------->
Canada                                \       / |___| \       /
                                       \     /  |___|  \     /
Phone: (780) 492-0587                   *___~   |___|   ~___*
Fax: (780) 492-9234                              \_/
E-mail: am16 at ualberta.ca                          U


James McInerney <james.o.mcinerney at may.ie> wrote:
> Dear all,

> Traditional dogma suggests that we should use protein sequences for
> inferring relationships from molecular sequences in those instances when
> the underlying DNA sequences might be suffering from convergence due to
> mutational bias.

> The suggestion is that protein sequences suffer very little from
> compositional convergence.  I am wondering how true this is.  If we
> think about the classification of amino acids (aromatic, small polar
> etc.) then there are only a limited number of _allowable_ substitutions
> at any one site (I am of course using this term _allowable_ in a loose
> way).  In other words, the substitution space for a particular amino
> acid is much smaller than 19 (20 including indels) other character states.

> So, what about convergence in protein-coding sequences?  Is it rampant? 
> Is it as extensive as (for instance) thermophilic convergence in
> ribosomal RNA sequences?

> In reality, if an aromatic amino acid is needed at a particular
> location, then the replacement of phenylalanine by tryptophan or
> tyrosine is much more likely than replacement by any other residue.
> Homoplastic changes at such a site are therefore probably more likely
> than at the nucleotide level, where there are three alternatives
> rather than (_effectively_) two!
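
A minimal sketch of the counting argument above, in Python. The class
table is a hypothetical, simplified grouping chosen for illustration
(it is not taken from James's message, and published classifications
differ); the function name is likewise made up for this example.

```python
# Hypothetical, simplified amino acid classes (one-letter codes).
# This grouping is an assumption for illustration; published schemes differ.
AA_CLASSES = {
    "aromatic": "FWY",
    "aliphatic": "AVLIM",
    "polar": "STNQ",
    "acidic": "DE",
    "basic": "KRH",
    "special": "CGP",
}


def within_class_alternatives(residue):
    """Return (class name, list of the other residues in the same class)."""
    for name, members in AA_CLASSES.items():
        if residue in members:
            return name, [aa for aa in members if aa != residue]
    raise ValueError(f"unknown residue: {residue!r}")


if __name__ == "__main__":
    # Compare the within-class alternatives with the 19 possible replacements.
    for aa in "FDL":
        name, others = within_class_alternatives(aa)
        print(aa, name, others, f"{len(others)} of 19 possible replacements")
```

Under this grouping phenylalanine has only two within-class
alternatives (W and Y), which is the "effectively two" case described
above.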

> So, stepping off my soapbox for a second, does anybody agree with this
> comment, or is it completely wrong?  I have inferred amino acid
> compositional trees, and it is often possible to generate very different
> trees on the basis of composition and on the basis of, say, parsimony or
> likelihood analysis of the characters.  So homoplastic amino acid
> compositional change does exist.  But does it affect phylogeny
> reconstruction?
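
One way a compositional comparison of this kind could be set up is
sketched below in Python. The message does not say how the
compositional trees were built, so this is an assumption: each sequence
is reduced to its 20 amino acid frequencies and pairs are compared by
Euclidean distance. The function names and the toy sequences are
hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20 amino acid frequencies of seq (gaps and X are ignored)."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def composition_distance(seq_a, seq_b):
    """Euclidean distance between two composition vectors (an assumed metric)."""
    freq_a, freq_b = composition(seq_a), composition(seq_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(freq_a, freq_b)))


def distance_matrix(seqs):
    """Pairwise composition distances for a dict mapping name -> sequence."""
    return {
        (m, n): composition_distance(seqs[m], seqs[n])
        for m, n in combinations(sorted(seqs), 2)
    }


if __name__ == "__main__":
    # Toy sequences, invented for this example only.
    toy = {"taxonA": "MKKLLAAG", "taxonB": "MKKLIAAG", "taxonC": "MGGPPGGS"}
    for pair, dist in distance_matrix(toy).items():
        print(pair, round(dist, 3))
```

A matrix like this can be passed to any distance-based tree method, and
the resulting tree compared with one inferred from the aligned
characters. Because the distance ignores residue order entirely, any
grouping it recovers reflects composition alone.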

> Do we have any good studies of amino acid compositional convergence?
> That is, studies of protein similarity that is due not to recency of
> common ancestry but to compositional convergence (or parallelism, or
> reversal, or any homoplastic event you care to name)?

> Any input is gratefully received.

> James

> -- 
>              Dr. James O. McInerney,
> Dept. Biology,                       Dept. Zoology,
> Natl. Univ. Ireland,                 The Natural History Museum,
> Maynooth,                  and       Cromwell Road,
> Co. Kildare, Ireland                 London SW7 5BD, UK.
> Phone +353 1 708 3860                +44 171 938 9163
> Fax   +353 1 708 3845                +44 171 938 9158
> email james.o.mcinerney at may.ie       j.mcinerney at nhm.ac.uk
> http://www.may.ie/academic/biology/jmbioinformatics.shtml
> ---

More information about the Mol-evol mailing list
