James,
Peter Foster has a recent paper in JME showing that analyses of protein
sequences can be affected by base composition biases in the DNA sequence.
I believe he now works in the same building as you (or does he only start
in October?). Speak to him!!!
Andrew
=================================================================
Andrew Mitchell
Department of Biological Sciences
CW-405 Biological Sciences Building
University of Alberta
Edmonton, T6G 2E9
Canada

Phone: (780) 492-0587
Fax: (780) 492-9234
E-mail: am16 at ualberta.ca
=================================================================
James McInerney <james.o.mcinerney at may.ie> wrote:
> Dear all,
> Traditional dogma suggests that we should use protein sequences for
> inferring relationships from molecular sequences in those instances when
> the underlying DNA sequences might be suffering from convergence due to
> mutational bias.
> The suggestion is that protein sequences suffer very little from
> compositional convergence. I am wondering how true this is. If we
> think about the classification of amino acids (aromatic, small polar
> etc.) then there are only a limited number of _allowable_ substitutions
> at any one site (I am of course using this term _allowable_ in a loose
> way). In other words, the substitution space for a particular amino
> acid is much smaller than the 19 other character states (20 including indels).
> So, what about convergence in protein-coding sequences? Is it rampant?
> Is it as extensive as (for instance) thermophilic convergence in
> ribosomal RNA sequences?
> In reality, if an aromatic amino acid is needed at a particular
> location, then the replacement of phenylalanine by tryptophan or
> tyrosine is much more likely than any other change. Homoplastic
> changes at this site are therefore probably more likely than at the
> nucleotide level, where there are three alternatives rather than
> (_effectively_) two!
> So, stepping off my soapbox for a second, does anybody agree with this
> comment, or is it completely wrong? I have inferred amino acid
> compositional trees, and it is often possible to generate very different
> trees on the basis of composition and on the basis of, say, parsimony or
> likelihood analysis of the characters. So homoplastic amino acid
> compositional change does exist. But does it affect
> phylogeny reconstruction?
> Do we have any good studies of amino acid compositional convergence?
> That is, protein similarity that is due not to recentness of common
> ancestry but to compositional convergence (or parallelism, or reversal,
> or any homoplastic event you like to name)?
> Any input is gratefully received.
> James
> --
> Dr. James O. McInerney,
> Dept. Biology,                        Dept. Zoology,
> Natl. Univ. Ireland,                  The Natural History Museum,
> Maynooth,                     and     Cromwell Road,
> Co. Kildare, Ireland                  London SW7 5BD, UK.
> Phone +353 1 708 3860                 +44 171 938 9163
> Fax   +353 1 708 3845                 +44 171 938 9158
> email james.o.mcinerney at may.ie    j.mcinerney at nhm.ac.uk
> http://www.may.ie/academic/biology/jmbioinformatics.shtml
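
As a rough illustration of the comparison James describes (trees built from amino acid composition versus trees built from the characters themselves), the Python sketch below computes two pairwise distances for a set of aligned protein sequences: one from composition alone and one from site-by-site differences. The function names, the choice of Euclidean distance between frequency vectors, and the toy sequences are all assumptions made for illustration. They are not taken from this thread or from any particular phylogenetics package.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the 20 amino acid frequencies of seq, ignoring gaps and unknowns."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def composition_distance(seq_a, seq_b):
    """Euclidean distance between two composition vectors (an assumed measure)."""
    freq_a, freq_b = composition(seq_a), composition(seq_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(freq_a, freq_b)))


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites that differ, skipping gapped or unknown sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in AMINO_ACIDS and b in AMINO_ACIDS
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy alignment, invented purely for illustration.
    toy = {
        "taxon1": "MKVLAAGIVK",
        "taxon2": "MKVLGAGIVR",
        "taxon3": "KVIGAKMLAV",
    }
    for (name_a, seq_a), (name_b, seq_b) in combinations(toy.items(), 2):
        print(
            name_a,
            name_b,
            "composition:", round(composition_distance(seq_a, seq_b), 3),
            "p-distance:", round(p_distance(seq_a, seq_b), 3),
        )
```

Distance matrices built from each measure could then be passed to the same clustering method. Where the two resulting trees disagree, or where the character-based tree groups taxa that are merely similar in composition, compositional convergence is one candidate explanation worth testing.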