Why look at G+C content?

Daniel Weinreich dmw at MCZ.HARVARD.EDU
Mon Mar 11 12:33:50 EST 1996


The original poster on this thread (moths at Glue.umd.edu) is now having
problems with his news reader.  Probably because he liked my answer the
best ;-), he asked me to post this (slightly caustic) follow-up. 

Dan.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Daniel M. Weinreich			email: dmw at mcz.harvard.edu
Harvard University 			usmail: 26 Oxford Street
Museum of Comparative Zoology			Cambridge, MA 02138
voice: (617) 495-1954			fax: (617) 495-5846
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


--- Begin Forwarded Message ----------------------

No, I didn't miss Chargaff's rule.  It's rather elementary!  It's also
quite irrelevant in this case:  phylogenetic analyses are performed on
single strands, not double strands (where obviously the amount of G in
BOTH strands will equal the amount of C in BOTH strands).  You apparently
missed the point that in a SINGLE strand of DNA, the C-content is totally
independent of the G-content.  Why state the G+C content of a single
strand when it would be so much more informative to state the C-content
and G-content separately?  You are only losing information.  After
all, the only reason for examining base content in a phylogenetic analysis
is to assess whether the transformation probabilities between the character
states (A, C, G and T, *not* G-C and A-T) are about equal or not.

So, back to my original question: why do people state G+C content in a
phylogenetic analysis of DNA sequence data?  I suggested this could be a
hangover from the days before DNA sequencing, when the G+C content of double
strands could be determined by other means.  Some people continue to
report base composition as % G+C, but since single-stranded DNA does not
undergo Watson-Crick base-pairing with itself, the G+C and A+T partition is
arbitrary.  One might as well have chosen to report G+A content or G+T
content, but there is less information in any one of these partitions
than in the individual base frequencies themselves.
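
To make the point concrete, here is a rough sketch in Python.  The two
strands are invented purely for illustration (they are not real data), and
the helper names base_frequencies and pooled are hypothetical.  The strands
have the same G+C content but very different G and C frequencies, and they
stop looking alike as soon as a different partition (G+A or G+T) is chosen.

```python
from collections import Counter

def base_frequencies(seq):
    """Return the fraction of A, C, G and T in one single-stranded sequence."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")  # ignore any non-ACGT symbols
    return {b: counts[b] / total for b in "ACGT"}

def pooled(freqs, bases):
    """Pool the frequencies of the given bases, e.g. "GC" for G+C content."""
    return sum(freqs[b] for b in bases)

# Two hypothetical single strands, made up for this example only.
strand_1 = "GGGGGGCCAAAATTTTTTTT"  # G-rich, C-poor
strand_2 = "GGCCCCCCAAAATTTTTTTT"  # C-rich, G-poor

f1 = base_frequencies(strand_1)
f2 = base_frequencies(strand_2)

# The individual frequencies differ between the strands...
print("strand 1:", f1)
print("strand 2:", f2)

# ...but the G+C partition hides that, while G+A and G+T do not.
for partition in ("GC", "GA", "GT"):
    print("+".join(partition), "content:",
          round(pooled(f1, partition), 2),
          round(pooled(f2, partition), 2))
```

Any pooled figure can be recomputed from the four base frequencies, but
the four frequencies cannot be recovered from a pooled figure, which is
the sense in which reporting % G+C alone loses information.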

Andrew Mitchell
Department of Entomology
University of Maryland




