Why look at G+C content?

Nicolas Nicolas
Fri Mar 8 02:41:22 EST 1996


In article 674 at phelix.umd.edu, moths at Glue.umd.edu (Andrew Mitchell) writes:
> As we are all aware, base composition biases can seriously affect 
> phylogenetic analyses of DNA sequence data.   I have seen many papers in 
> which such biases are assessed by examining the G+C content of 
> sequences.  If this value is approximately 50% then authors conclude 
> there is no base composition bias.  However, that 50% G+C could break 
> down further into 45% G, 5% C, 10% A and 40% T - extreme composition 
> bias.  So why the fixation with G+C content?  Is it simply a hangover 
> from the days before DNA sequencing, or did I miss something?
> 
> Andrew Mitchell
> 


I agree with M. Kuhner and W. Gallin about Chargaff's rules. For a 
theoretical discussion of this subject, see the papers by Lobry (J. Mol. Evol. 
and Mol. Biol. Evol. 1994-95). 
Speaking as a practitioner, I can add that in actual sequences G% and C% are
highly correlated. You sometimes encounter a sequence with an unusually high
A, C, G or T content, but most of the variability in base composition is
well described by GC%. That is why many (but not all) evolutionary models 
assume A=T and C=G within a given DNA strand: the three free parameters 
needed for an arbitrary base composition can be seen as too high a cost 
when GC% alone captures most of the variability.
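
To make the point concrete, here is a minimal Python sketch (not taken from any
of the papers cited above) that computes the four base frequencies, GC%, and
the within-strand skews (G-C)/(G+C) and (A-T)/(A+T). The function names are
hypothetical, and the skew measure is an illustrative addition rather than
something proposed in this thread. The test sequence is an artificial one built
to match the 45% G, 5% C, 10% A, 40% T example from the quoted question.

```python
from collections import Counter


def base_composition(seq):
    """Return the fraction of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: counts[b] / total for b in "ACGT"}


def gc_content(freqs):
    """GC fraction: the single parameter left when A=T and C=G are assumed."""
    return freqs["G"] + freqs["C"]


def skew(x, y):
    """(x - y) / (x + y); close to 0 when the two frequencies are equal."""
    return (x - y) / (x + y) if (x + y) > 0 else 0.0


# Artificial 100-base sequence with the composition from the quoted question
# (an assumption for illustration, not real data).
seq = "G" * 45 + "C" * 5 + "A" * 10 + "T" * 40

freqs = base_composition(seq)
gc = gc_content(freqs)
gc_skew = skew(freqs["G"], freqs["C"])
at_skew = skew(freqs["A"], freqs["T"])

print(f"GC fraction: {gc:.2f}")
print(f"GC skew:     {gc_skew:.2f}")
print(f"AT skew:     {at_skew:.2f}")
```

For this artificial sequence GC% looks unremarkable while both skews are far
from zero. When the skews are close to zero, as the correlation between G% and
C% suggests is usual in real sequences, GC% alone is a reasonable summary.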

Nicolas.




More information about the Mol-evol mailing list