Why look at G+C content?

Keith Robison robison at nucleus.harvard.edu
Fri Mar 8 14:27:51 EST 1996


Mary K. Kuhner (mkkuhner at phylo.genetics.washington.edu) wrote:
: In article <4hmte8$674 at phelix.umd.edu> moths at Glue.umd.edu (Andrew Mitchell) writes:
: >As we are all aware, base composition biases can seriously affect 
: >phylogenetic analyses of DNA sequence data.   I have seen many papers in 
: >which such biases are assessed by examining the G+C content of 
: >sequences.  If this value is approximately 50% then authors conclude 
: >there is no base composition bias.  However, that 50% G+C could break 
: >down further into 45% G, 5% C, 10% A and 40% T - extreme composition 
: >bias.  So why the fixation with G+C content?  Is it simply a hangover 
: >from the days before DNA sequencing, or did I miss something?

: The genome as a whole, if it is base-paired, must have G=C and A=T,
: so the only mechanism that could produce G<>C would be one that was
: specific to the coding strand (i.e. the sequences you are looking at
: have more G than C; their complements on the non-coding strand have more
: C than G).  

: Most forms of mutation seem unlikely to know which strand is
: which, though I suppose a mutation mechanism related to transcription and
: thus acting on the transcribed strand only is possible.  (Something like
: "If you transcribe through a C it may turn to G" would eventually lead
: to a low proportion of C relative to G on the coding strand.)

One possible source of a strand-biased mutational preference is
the mechanism of DNA replication itself.  Lagging-strand synthesis
proceeds differently from leading-strand synthesis (discontinuously,
as Okazaki fragments, rather than continuously), so the two modes
of synthesis may well have different mutational preferences.
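
To make the arithmetic in the original question concrete, here is a
minimal sketch in Python.  It is only an illustration: the function
names and the toy sequence are made up for this example, and the skew
measure (G - C) / (G + C) is one common way to summarize strand
asymmetry, not something taken from the posts above.  It shows that a
sequence can sit at exactly 50% G+C while being heavily skewed toward
G, and that the complementary strand carries the opposite skew, as
base pairing requires.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def base_fractions(seq):
    """Return the fraction of A, C, G and T in seq (other symbols are ignored)."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: counts[b] / total for b in "ACGT"}


def gc_content(seq):
    """Fraction of bases that are G or C."""
    f = base_fractions(seq)
    return f["G"] + f["C"]


def gc_skew(seq):
    """(G - C) / (G + C): 0 means G = C, positive means an excess of G."""
    f = base_fractions(seq)
    gc = f["G"] + f["C"]
    if gc == 0:
        raise ValueError("sequence contains no G or C")
    return (f["G"] - f["C"]) / gc


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Hypothetical toy sequence with the composition from the question:
# 45% G, 5% C, 10% A, 40% T.
toy = "G" * 45 + "C" * 5 + "A" * 10 + "T" * 40

print(f"G+C content:               {gc_content(toy):.2f}")
print(f"GC skew, this strand:      {gc_skew(toy):.2f}")
print(f"GC skew, opposite strand:  {gc_skew(reverse_complement(toy)):.2f}")
```

The G+C content comes out at one half even though G outnumbers C nine
to one, and the skew simply changes sign on the opposite strand, so
the two strands taken together still have G = C.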

Keith Robison
Harvard University
Department of Molecular & Cellular Biology
Department of Genetics 

robison at mito.harvard.edu 





