In article <9109070212.AA19281 at bambi.ccs.fau.edu> tomh at BAMBI.CCS.FAU.EDU
>>I still wonder about totally neutral mutations. What would be an example?
>How do we know a large proportion of bases are silent?
Actually, neutral mutations are quite common. If we take a look at the
same gene from mice and men, we see regions that are virtually identical,
flanked by regions whose sequences look essentially random with respect
to each other. The identical regions include the coding regions of genes
and their respective promoter/enhancer regions.
For example, look at the 5' regions of the human and mouse HMG-14
genes (capital letters mark stretches of sequence that are 100%
identical between the two species):
Human ggatccccagcactttgggaggccgaggcgggaggatcgcttgagcccaggagtcggaga 60
Mouse gaattccctgtcctaaatacaagttttatggtaggcaaagacatatatgataattcttac 60
Human ccatgctgtgcaacatagtgaaaccccctctctacaaaaaatacaaaagttagctgggca 120
Mouse aattgagcagcagtaatacggctgcgggtgggactcttgtacccagcccgctatctgcat 120
Human aaatggtgccctgtggtcccagctactcgggaggccggagtgggaggttcgctggagccg 180
Mouse tcttgaggtatactcaatattaactgccttccataaacaaaaggtgcaatctggaagtcg 180
Human aaggggtcgaggctgcaatgagccgtgatcgcaccactgcactccagcctggacgacaca 240
Mouse gagctagacccgacgccgctgcgcgaaggactgggtctgtaaagggccgatgtgattcct 240
Human acgagaccctgtctcaaataaaaggaaggaaggaaggaagttacacagaaaggccgcgtc 300
Mouse gcacacagttgaacacaggacggaggggtccatgcagtggggagggaagagccgagacag 300
Human gcgtctccgtcccacgccctcctgcagcgcctgcgcaccaggcccgcttcacgcaggcct 360
Mouse ctgcaagacctaaaacagacagacatagttacctaaaaataggctagtaaaggcgcgtag 360
Human gcgaagctggagcccctggatagcctttcttgccgacagaggcgggagaaatttgctact 420
Mouse gacagcctgagtctgcgcgaccggcccgcgcgcgcctgcgcactgctgctgcgcctgtcg 420
Human tcctgtataccttatccttctcccttcccagtctaagatacgaactataaatgttcgaac 480
Mouse ctcccgggcgaatggcaaatcacgcctaggcactgctgaacccgaggggtccgcgggtgg 480
Human ccaattcaccccggagaggggccagataccagtggcctgaaggcgcccaggtatccagaa 540
Mouse actccgccgggcgggcgcctggggtccggagcattgtgggaaagcgatgctgaggccacg 540
Human gaattgtgggtggggacccgcggtcgtgacgtgcgtccgccaATCAGCGCGCAGaCCGCA 600
Mouse tgacacgcccccactt--------------------------ATCAGCGCGCAGgCCGCA 574
Human C-TTTGCgcTCGGCTTCAAACtAccgtgagccggagcgcactgggaccccgcccccttcg 659
Mouse CtTTTGCtgTCGGCTTCAAACaAactacctagcggggtcgggcgactccggcccgcccct 634
Human cctgggtctggggccccgcgagacggcggaaagGGGTGGGGGcgCccGGGGgGGcGgGAg 719
Mouse cggccggggtcgcgcccgca-------------GGGTGGGGGgcCagGGGGcGGgGcGAt 681
Human gggggcggggtgcgggatcGAGTGACgGCcCGCCtcaCCTATTCcGGgcgc-GGGCTGag 778
Mouse gcggcg-------------GAGTGACaGCtCGCCgtgCCTATTCgGGagcaaGGGCTGgc 728
Human tcccgt-------------------AGCCAATGGgcgggggtgGGGGGCGGCCCGGCcGG 819
Mouse tggcgcgtacgcggtgggcctagacAGCCAATGGaggct----GGGGGCGGCCCGGCtGG 784
Human CGGGGAGGgggagccgcggccgggaCGCgGGGGGagGAGGaGGCGGGCTCCcAATCCGGT 879
Mouse CGGGGAGG-----------------CGC-GGGGGgaGAGGgGGCGGGCTCC-AATCCGGT 825
Human TCCATCCGGTTCTCCCACCGCCCCCGCtgtGGGTCtCAGCAGcTCGggcggcgggaggag 939
Mouse TCCATCCGGTTCTCCCACCGCCCCCGCgacGGGTCgCAGCAGtTCGtgtggtggtggcgg 885
Human tggcagcggcaaggcagcccagtttcgcgaaggctgtcggcgcgccgcggcccgcaggca 999
Mouse cggcggcttggcagtgcggctcctcggtgacagatccgaca------------------- 926
Human cccggcacgcgccttccccgcaggcacccggcacgcgccttcccCGCCGCCACGATGCCC 1059
Mouse ------------------------cgcacgcgtctcccaccccgCGCCGCCACGATGCCC 962
Human AAGAGGAAGGTGAGcggcggccgcggcccgcacacgccccctggagccgccgccggcccc 1119
Mouse AAGAGGAAGGTGAGtcggcggggccgcggcgccgcagggttcgggtctgaggggctctgg 1022
Human cgccggccccgcgaggcccaggccccgttgcacccacggtggcgacgggcccgggaggcg 1179
Mouse aatcttcgcggg------------------------------------------------ 1034
Human cttggagaccggcgggcgggcaggcgagcgctcggcggccgcgggggcggcgttctggaa 1239
Mouse ------------------------------------------------------------ 1034
Human cgtttggcggccgggggagctgagggggctattcgaacGgGgcGGCGGgaaGcCGTGACG 1299
Mouse --------------------------------------GtGcgGGCGGctgGgCGTGACG 1056
Human TCACgcGGCCggGCATTGTTCTCggggccgggcgggcccgcgagtcctgggactgcggcc 1359
Mouse TCACcgGGCC--GCATTGTTCTCcgctgtgctttct------------------------ 1090
Human cgcctctattcgtgcgtctccgtctcc---GCAGGTcAGCtccGCcGAaGGcGCCGCCAA 1416
Mouse ---------------------------cctGCAGGTtAGC---GCgGAtGGaGCCGCCAA 1120
Human GGaaGAGGTGAGTGCGGGcCtTCtGcgGGGggtggtgggtttcccgtgagccgctggcct 1476
Mouse GGcgGAGGTGAGTGCGGGgCcTCgGgtGGGccgggcggatcggggcgggcggtggggtgc 1180
Human gccttctcttctcgctgactctcctttttctttctccaAGCCCAAGaGgaGaTCgGCGcG 1536
Mouse cgctcacatgcgctgctcacccg--tctttctctccgcAGCCCAAGcGccGcTCcGCGaG 1238
Human GtTGTCaGCtGTAAGTAaaGCGagccccgtaaccgttcgttttccgcgggtcgtcccggg 1596
Mouse GcTGTCgGCcGTAAGTAccGCGctcggtccgggccgggacgggagcgagcgggccgggcc 1299
In the two sequences we find long spans of apparently non-homologous sequence
(for example, human bases 1-580) with sequence similarity of <50%, flanking
blocks of sequence with much greater (>90%) similarity. The conserved blocks
represent the promoter elements and the coding regions of the gene. The
non-similar regions correspond to flanking regions and non-coding introns.
As you can see, the regions of DNA that code for the protein are highly
conserved, while the differences that have accumulated in the other regions
are the totally silent mutations you were asking for an illustration of. This
pattern of sequence similarity between human and mouse HMG-14 is typical.
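If you want to put rough numbers on individual rows of the alignment, here is
a minimal Python sketch. The function name percent_identity and the choice to
skip gap columns (rather than count them as mismatches) are my own assumptions
for illustration, not a standard definition; the four strings are rows copied
from the alignment above (human/mouse rows ending at 60, and the rows ending
at human 939 / mouse 885).

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned columns with the same base in both rows.

    Case is ignored (the alignment uses case only as a visual marker).
    Columns where either row has a gap are skipped; that is an assumption
    made for this sketch, and other conventions count gaps as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned rows must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # gap column: excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0  # nothing comparable in this stretch
    return 100.0 * matches / compared


# First alignment row: far upstream flanking sequence.
FLANK_HUMAN = "ggatccccagcactttgggaggccgaggcgggaggatcgcttgagcccaggagtcggaga"
FLANK_MOUSE = "gaattccctgtcctaaatacaagttttatggtaggcaaagacatatatgataattcttac"

# Row ending at human 939 / mouse 885: contains a long conserved block.
BLOCK_HUMAN = "TCCATCCGGTTCTCCCACCGCCCCCGCtgtGGGTCtCAGCAGcTCGggcggcgggaggag"
BLOCK_MOUSE = "TCCATCCGGTTCTCCCACCGCCCCCGCgacGGGTCgCAGCAGtTCGtgtggtggtggcgg"

if __name__ == "__main__":
    print("flanking row : %.1f%%" % percent_identity(FLANK_HUMAN, FLANK_MOUSE))
    print("conserved row: %.1f%%" % percent_identity(BLOCK_HUMAN, BLOCK_MOUSE))
```

Note that a whole 60-base row mixes conserved blocks with diverged bases, so
the per-row figure for the second pair comes out lower than the identity
within the capitalized blocks themselves.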
Although you might be concerned about comparing mice and men, such comparisons
are the best way of determining which regions of DNA are important and are
therefore constrained in the mutations they can tolerate.
>>Tom Holroyd
>Center for Complex Systems
>Florida Atlantic University
>tomh at bambi.ccs.fau.edu
--
Donald A. Lehn, Ph.D. Phone: (301) 496-2885
Bldg.37 Rm 3D20 FAX: (301) 496-8419
National Cancer Institute / NIH Email: donnel at helix.nih.gov
Bethesda MD 20892