Why look at G+C content?

Daniel Weinreich dmw at MCZ.HARVARD.EDU
Fri Mar 8 08:02:15 EST 1996

mkkuhner at genetics.washington.edu, wgallin at gpu.srv.ualberta.ca and
galtier at acnuc.univ-lyon1.fr all wrote approximately:

> In article <4hmte8$674 at phelix.umd.edu> moths at Glue.umd.edu (Andrew Mitchell) writes:
> >As we are all aware, base composition biases can seriously affect 
> >phylogenetic analyses of DNA sequence data.   I have seen many papers in 
> >which such biases are assessed by examining the G+C content of 
> >sequences.  If this value is approximately 50% then authors conclude 
> >there is no base composition bias.  However, that 50% G+C could break 
> >down further into 45% G, 5% C, 10% A and 40% T - extreme composition 
> >bias.  So why the fixation with G+C content?  Is it simply a hangover 
> >from the days before DNA sequencing, or did I miss something?
> Chargaff's corollary to Watson-Crick base pairing requires that G=C and 
> A=T.

Dear Andrew,

I'm with you and must respectfully disagree with Mary, Warren and Nicolas.
Certainly G=C and A=T across the double-stranded genome, so long as
Watson-Crick base pairing holds.  But our analyses focus on only one
strand (generally the coding strand, though apart from translation to
amino acids that choice is arbitrary), whose A/C/G/T composition can in
principle be anything.
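
To make the one-strand versus two-strand distinction concrete, here is a
minimal Python sketch.  The 20-base sequence is made up for illustration
(it is not real data); it is chosen to reproduce Andrew's hypothetical
45% G, 5% C, 10% A, 40% T on a single strand.  The function names are my
own.

```python
from collections import Counter

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def base_counts(seq):
    """Count A, C, G and T in a sequence (other symbols are ignored)."""
    counts = Counter(seq)
    return {base: counts.get(base, 0) for base in "ACGT"}

# Made-up single strand: 9 G, 8 T, 2 A, 1 C (G+C = 50%).
strand = "GGGGGGGGGTTTTTTTTAAC"

one = base_counts(strand)                                # one strand only
both = base_counts(strand + reverse_complement(strand))  # both strands pooled

print("one strand  :", one)
print("both strands:", both)
```

Pooling both strands forces G=C and A=T, as base pairing requires, while
the single strand that we actually analyze stays heavily skewed.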

Consider primate mtDNA.  For some reason, protein-coding sequences have
roughly equal A and C content, much reduced T content, and almost no G.
For example, among 6 mtDNA genes from 5 primates, mean percent
compositions are A = 37%, C = 39%, T = 19% and G = 5%.  That is a G+C
content of 44%, not far from 50%, despite nearly eight times as much C
as G.
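
As a rough illustration of why G+C alone can mislead, the sketch below
summarizes a single-strand composition by its G+C fraction together with
two simple skew statistics, (G-C)/(G+C) and (A-T)/(A+T).  The choice of
these skew statistics and the function name are my own, not taken from
any particular program.  The inputs are Andrew's hypothetical
composition and the primate mtDNA means quoted above.

```python
def summarize(percentages):
    """Return (G+C fraction, GC skew, AT skew) for one strand.

    `percentages` maps each of A, C, G, T to a count or percentage.
    Assumes G+C and A+T are both nonzero.
    """
    total = sum(percentages[b] for b in "ACGT")
    p = {b: percentages[b] / total for b in "ACGT"}
    gc = p["G"] + p["C"]
    gc_skew = (p["G"] - p["C"]) / (p["G"] + p["C"])
    at_skew = (p["A"] - p["T"]) / (p["A"] + p["T"])
    return gc, gc_skew, at_skew

# Andrew's hypothetical composition from the quoted post.
hypothetical = {"A": 10, "C": 5, "G": 45, "T": 40}
# Mean composition of the primate mtDNA genes quoted above.
primate_mtdna = {"A": 37, "C": 39, "G": 5, "T": 19}

hyp = summarize(hypothetical)
mt = summarize(primate_mtdna)

for name, (gc, gc_skew, at_skew) in [("hypothetical", hyp),
                                     ("primate mtDNA", mt)]:
    print(f"{name:14s} G+C = {gc:.2f}  GC skew = {gc_skew:+.2f}  "
          f"AT skew = {at_skew:+.2f}")
```

Both compositions have a G+C fraction near one half, yet their skews are
far from zero, which is exactly the bias that a G+C check cannot see.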

And I think your original point is valid: most of our favorite programs
(whether for phylogeny reconstruction or substitution rate estimation)
are quite sensitive to the underlying base frequencies ON ONE STRAND.  I
believe that's the point of, for example, Kondo et al., JME 36:517 and
Perna and Kocher, MBE 12:359.

Or maybe I missed something!

Daniel M. Weinreich			email: dmw at mcz.harvard.edu
Harvard University 			usmail: 26 Oxford Street
Museum of Comparative Zoology			Cambridge, MA 02138
voice: (617) 495-1954			fax: (617) 495-5846
