CpG islands and rodents.

Brian Foley brianf at med.uvm.edu
Sun Jan 23 17:51:13 EST 1994

	In the December 1993 issue of the Proceedings of the National Academy
of Sciences, Vol. 90, pp. 11995-11999, F. Antequera and A. Bird report on
the loss of a CpG island in the mouse zeta-globin gene relative
to the human zeta-globin gene.
	I could not believe the data, so I compared the human,
goat, horse and mouse zeta-globin genes.  It is true: the mouse
has lost a GC-rich region of the second intron and replaced it with
a shorter, AT-rich intron.  It does not look like the result of point
mutations, but like a complete intron replacement.  Pretty weird.  The
goat, horse and human genes are all GC-rich in the second intron and it
is tough to align them.
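	For anyone who wants to repeat this kind of check, here is a rough
Python sketch of the two numbers I would look at for an intron: the GC
fraction and the observed/expected CpG ratio, taken here as
(#CpG * length) / (#C * #G).  That formula is a common convention and is
my assumption, not necessarily the measure used in the paper.  The two
sequences are invented for illustration and are not real zeta-globin
sequence.

```python
def gc_fraction(seq):
    """Fraction of A/C/G/T bases that are G or C (other symbols ignored)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


# Made-up fragments, for illustration only.
gc_rich = "GCGCGGCGCCGCGGGCGCGC"
at_rich = "ATTTAAATGCATATTTAATA"

for name, seq in (("GC-rich", gc_rich), ("AT-rich", at_rich)):
    print(name, round(gc_fraction(seq), 2), round(cpg_obs_exp(seq), 2))
```

	On real introns one would probably compute these in a sliding window
rather than over the whole sequence at once.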
	Looking at the introns, I'd say that these genomes have been
almost saturated with mutations, both point mutations and insertions/
deletions.  So I would expect that the coding regions were also
saturated with mutations, and then those that were less functional were
selected against and lost.
	However, if I look at "silent" sites in the third position
of each codon, there are very few mutations.  Only a few of these
silent sites appear to have been "hit" by mutation.
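	The count I have in mind can be sketched as follows.  This is only a
minimal illustration: it assumes two coding sequences that are already
aligned, in frame, gap-free and made only of A, C, G and T, and it uses
the standard genetic code.  The function name and the toy sequences are
mine.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def third_position_summary(cds_a, cds_b):
    """Count third-position differences between two aligned coding sequences.

    A difference is called "silent" when only the third base differs
    and both codons encode the same amino acid.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3:
        raise ValueError("sequences must have equal length, a multiple of 3")
    counts = {"codons": 0, "third_pos_diffs": 0, "silent_third_pos_diffs": 0}
    for i in range(0, len(cds_a), 3):
        a, b = cds_a[i:i + 3].upper(), cds_b[i:i + 3].upper()
        counts["codons"] += 1
        if a[2] != b[2]:
            counts["third_pos_diffs"] += 1
            if a[:2] == b[:2] and CODON_TABLE[a] == CODON_TABLE[b]:
                counts["silent_third_pos_diffs"] += 1
    return counts


# Toy example: CTT->CTC is silent (Leu), AAA->AAT is not (Lys->Asn).
print(third_position_summary("CTTGGAAAA", "CTCGGAAAT"))
```

	Dividing the silent count by the number of codons where a silent
change is possible would give a rate that can be compared with the
intron figures.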
	Hypothesis 1:  "silent" sites are not truly silent; codon
preference is enough to select for CUU=Leu over CUC=Leu.  If true,
it's a wonder anything lives under such stringent selection.
	Hypothesis 2: introns mutate faster than exons.  It is
hard to explain how the DNA repair apparatus could be targeted to recognize
the exons.  Harder to explain how damaging agents could target introns.
The most likely explanation is that the repetitive nature of the DNA in
introns (lots of AGAGAGA, CCCCCC, AAAAATTTTT, etc.) makes it hard
for the polymerase to avoid errors.
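	One rough way to put a number on "repetitive" is to ask what fraction
of a sequence sits in simple runs.  The sketch below is my own
illustration: the thresholds (a single base repeated at least 5 times, or
a dinucleotide repeated at least 3 times) are arbitrary assumptions, and
the measure says nothing by itself about polymerase error rates.

```python
import re

# A single base repeated >= 5 times, or a dinucleotide repeated >= 3 times.
# These thresholds are arbitrary choices for illustration.
SIMPLE_REPEAT = re.compile(r"([ACGT])\1{4,}|([ACGT]{2})\2{2,}")


def simple_repeat_fraction(seq):
    """Fraction of the sequence covered by simple mono/dinucleotide runs."""
    seq = seq.upper()
    if not seq:
        return 0.0
    covered = sum(m.end() - m.start() for m in SIMPLE_REPEAT.finditer(seq))
    return covered / len(seq)


# Toy strings built from the motifs mentioned above, not real intron data.
print(simple_repeat_fraction("AGAGAGACCCCCCAAAAATTTTT"))
print(simple_repeat_fraction("ACGTCATG"))
```

	Comparing this fraction between the introns and exons of the same
gene would at least show whether the introns really are more repetitive.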

	Any ideas?  

	Most of the literature on mutation rates that I can find deals with
exons only, and often with proteins rather than DNA.  I would think that
someone would have built a table that says "mouse vs human: mean of 4 base
differences per 100 in exons, high of 40 per 100, low of 0 per 100, std
deviation of 2 per 100.  Chimpanzee vs human: 1 base difference per 100
in exons, ..." and so on.  It would also report intron mutation rates for
the very closely related species where they can still be scored.
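	A row of such a table could be computed along these lines.  This is
a sketch under stated assumptions: the sequences are already aligned,
gaps are written as "-", differences are counted uncorrected (so
multiple hits at one site are underestimated at high divergence), and
the gene names and sequences are invented.

```python
from statistics import mean, pstdev


def diffs_per_100(seq_a, seq_b):
    """Mismatches per 100 aligned sites, skipping columns that contain a gap."""
    sites = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return 100.0 * sum(a != b for a, b in sites) / len(sites)


def summarize(alignments):
    """Mean, low, high and std deviation across genes for one species pair."""
    rates = [diffs_per_100(a, b) for a, b in alignments.values()]
    return {"mean": mean(rates), "low": min(rates),
            "high": max(rates), "sd": pstdev(rates)}


# Invented toy alignments with hypothetical gene names.
toy = {
    "geneA": ("ACGTACGTAC", "ACGTACGTAC"),  # 0 differences in 10 sites
    "geneB": ("ACGTACGTAC", "ACGAACGTTC"),  # 2 differences in 10 sites
}
print(summarize(toy))
```

	Running the same summary separately on exon and intron alignments
would give the two columns I am looking for.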
	It is easy to find data on amino acid replacements; why is it
so hard to find data on DNA changes?  Are the tables out there and I
am just missing them?

*  Brian Foley               *     If we knew what we were doing   *
*  Molecular Genetics Dept.  *     it wouldn't be called research  *
*  University of Vermont     *                                     *
