Joe Felsenstein (joe at evolution.genetics.washington.edu) writes:
(Re the "C paradox")
>It used to be a paradox, before we knew about junk DNA, transposons, etc.
>There was far more DNA in genomes than we could account for by genes.
>That was the paradox. There was also a population-genetic version of the
>paradox too. Calculations of the mutational load showed that if all that
>DNA was information-bearing, given reasonable per-base mutation rates,
>we'd all be dead.
>The discovery of all the different kinds of noncoding DNA resolved the
>paradox. Interestingly, the mutation-load argument is still relevant and
>is a fairly powerful objection to most of our DNA being meaningful, in
>the sense that it must be in that particular sequence.
I think I probably down-played the relevance of the 'C paradox' in
my earlier post, by describing it as "just an interesting
phenomenon". It is actually a *very* interesting phenomenon for
the reasons Joe gives above. So I'm going to come back on what Jeff
Mattox said about my reply:
>Shane McKee <shane at reservoir.win-uk.net> wrote:
>>... it is a fallacy to assume that "more DNA"
>>means "More complex DNA". Much of salamander DNA is in the form of
>>simple repeats. In this sense, the sentence "To be or not to be"
>>is more complex than "asasasas dfdfdfdf ghghghghgh hjhjhjhjhj
>>klklklk" for example. From all the crap in a salamander genome,
>>you could compress the information required to "make" a salamander
>>into a much smaller space than that to "make" a human.
>Tell me more about those repeats. Where are they in the DNA (between
>genes, introns, exons)? How long, how many?
There are stacks of them, varying from two to hundreds of
basepairs, found scattered liberally all over the genome - within
introns, between genes - you name it. A few may even get
translated (e.g. in Huntington's disease - correct me if I'm a bit
behind the times here, folks).
>Why do you say "asasasas" is more complex than "to" (other than being
>longer) or a sequence like "gjepffos"?
You can write "asasasas" as "4(as)" whereas "to" is "to".
Therefore, in this example, the former requires more information to
code it meaningfully. Thus it is more complex.
>(I guess we are using the whole
>alphabet here instead of just actg -- I'd prefer to compare DNA sequences
>like "actactactact" vs. "tgacgtggacta". In that light, why is DNA
>"aaaaaa" more complex than, say, "actgtc"?)
I don't think it is: 6(a) is less complex (in information terms)
than 1(actgtc) - it takes less information to code it.
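
To make the "6(a)" versus "1(actgtc)" notation concrete, here is a
minimal sketch in Python. It only looks for the shortest unit that
repeats exactly to give the whole string, which is a toy stand-in for
real compression and not a proper measure of information content. The
function names are made up for this illustration.

```python
def repeat_code(seq):
    """Return (count, unit) for the shortest unit whose exact
    repetition reproduces seq. Toy measure only."""
    n = len(seq)
    for p in range(1, n + 1):
        # A unit of length p works only if p divides n and
        # repeating the first p characters rebuilds the string.
        if n % p == 0 and seq[:p] * (n // p) == seq:
            return n // p, seq[:p]
    return 0, ""  # only reached for the empty string


def describe(seq):
    """Format the code in the count(unit) notation used above."""
    count, unit = repeat_code(seq)
    return f"{count}({unit})"


for s in ["asasasas", "aaaaaa", "actgtc", "actactactact", "tgacgtggacta"]:
    print(s, "->", describe(s))
```

The repeats collapse to a short unit plus a count, while "actgtc" and
"tgacgtggacta" have to be spelled out in full.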
Part of the problem is that DNA coding is not a tight system -
there is a heck of a lot of redundancy in it. This makes it more
resilient to damaging mutation, and it also means that the
information in human DNA (which is about 750MB in the haploid
genome) is probably compressible to about 20MB of raw information
necessary to build a human. It might compress even more - what do
you think, Joe?
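
For anyone wondering where a figure like 750MB comes from, here is a
back-of-envelope sketch. It assumes a haploid genome of roughly 3
billion base pairs (my assumption, not a measured value) and 2 bits
per base, since there are four possible bases. The 20MB figure is
just the guess made above, not something the code derives.

```python
BASES_HAPLOID = 3_000_000_000  # assumption: ~3 billion bp, haploid
BITS_PER_BASE = 2              # 4 possible bases -> 2 bits each


def raw_megabytes(n_bases, bits_per_base=BITS_PER_BASE):
    """Uncompressed size in MB (1 MB taken as 10**6 bytes)."""
    return n_bases * bits_per_base / 8 / 1_000_000


raw = raw_megabytes(BASES_HAPLOID)
guess_mb = 20  # the guessed "raw information" figure from the text
print(f"raw size: {raw:.0f} MB")
print(f"implied compression ratio: {raw / guess_mb:.1f} to 1")
```

On those assumptions the guess amounts to squeezing the genome down
by a factor of a few tens.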
>The paradox is that the amount of DNA or the number of genes do not
>correlate with "complexity." Genetic complexity and physiological
>complexity are not tightly coupled.
Well, you've got a point here, but bear in mind what you said
in a previous post:
> ...there is little relationship between the genetic
>complexity of a genome and the organism for which it codes.
>Salamanders, for example, have 50 times more DNA than humans.
But you've just said that amount of DNA does NOT correlate with
genomic complexity, so the above statement is a wee bit
meaningless. (Or is that what you said?) One way of looking at it is
this: What percentage of a salamander's genome can you screw around
with, and still get a good salamander out the other end? My
contention is that it's a heck of a lot more than in a human. Is
there an index of this for different species? It would probably
give a far better estimate of genomic complexity than the mere
amount of DNA.
Shane McKee, Belfast, Northern Ireland, United Kingdom
===========Give us back our ceasefire. Now.===========