Oh foolish supporters of genome sequencing

frist at ccu.umanitoba.ca
Tue Feb 12 13:08:03 EST 1991

In article <5695 at husc6.harvard.edu> Ellington at Frodo.MGH.Harvard.EDU (Deaddog) writes:
>I am sure that I will see justification for the project at some point, 
>but so far none has appeared.  I am truly mystified:  given the ability to 
>clone genes by a variety of methods (probing with oligos based on protein
>sequences, complementation, hybridization with homologous sequences,
>panning, subtraction cloning, etc.) I do not see a need to sequence
>the human genome.  I do not see what information will be gained that cannot
>be garnered by other, more directed means.  In terms of genetic disease, 
>for example, the search for a given gene would still seem to be a directed 
....deleted stuff about genome project competing for funds w/other science
... deleted stuff about sequencing at random
>do, and we don't have to worry about the 95% of the genome that is essentially 
>junk; therefore, the sequence of X's genome is just a way to get at a glut of 
>information that we can immediately interpret"), these arguments do not seem
>to apply to humans.

But how do you KNOW it's junk? There are still lots of chromosomal
functions that repetitive DNA could serve, but that we wouldn't have been
able to detect for lack of a global picture. Such functions might include
higher-order packaging of chromatin during chromosome condensation,
facilitation of chiasma formation, attachment to the nuclear matrix, or
the definition of functional "domains" [Bodnar, 1988]. It should be obvious
now that gene expression is _not_ just a question of having the right
promoter or enhancer. These are clearly only one component of a much more
complex mechanism for genome function that is made possible by the
deliberate organization of the chromatin in the interphase nucleus.

The importance of repetitive DNA is further underscored by evolutionary
studies such as those done by Narayan with the plant genera Clarkia,
Nicotiana, Lathyrus and Allium [Narayan, 1982; Narayan, 1985]. Basically,
Narayan has demonstrated that, in a wide range of genera examined,
interspecific differences in genome size occur in discrete increments
rather than as continuous variation. Similarly, Flavell and colleagues
have used hybridization kinetics to demonstrate that specific subsets of
interspersed repetitive sequences can be selectively lost or gained as
speciation occurs [Flavell et al., 1977]. I am a firm believer in "junk
DNA" and I do think that a lot of the genome is junk. But not all of that
95%, or whatever figure you wish to use.

The Genome Project is an exploratory mission, like Darwin's voyage on the
H.M.S. Beagle. Although Darwin was perhaps perceived as an unimportant crewman
on this exploratory trip, the wealth of data obtained by Darwin served as his
'database' for thinking about biology from a global perspective. At first,
his work was merely 'bookkeeping'.  Who really cared how many types of
finch were on each island?  Nonetheless, it was these years of observation
that made it possible for Darwin to obtain the global perspective necessary
for the formulation of the most fundamental theory in biology: the theory
of evolution.

>Even in this form, though, it is not sequencing "the genome" that is 
>important, but sequencing "all the genes that we know are important."  It
>seems as though anthrocentricity is driving this project, rather than a true
>quest for scientific knowledge.

Again, how do you KNOW which genes are important? At the present time, the
sequence databases are largely biased towards highly expressed genes,
because those are the ones most likely to be detected and cloned. These are
important, in the sense that the cell needs thousands of copies of their 
transcripts. But it is often the case that <25% of the mass of the mRNA
represents >95% of the sequence diversity.  There are literally thousands
of genes for which only 1-10 copies of the transcript are present in the
cell [Okamuro and Goldberg, 1989]. It is only by obtaining a clear view of
what the cell does with these rare transcripts that we will have a complete
understanding of how gene expression results in a differentiated organism.

>Consider:  I remember when the sequence of all of SV40 was first 
>determined; I also remember when the chloroplast genome was completed.  What
>new lines of research have been opened up by this information?  New lines
>mind you, that wouldn't have been available had not the sequence of the whole
>genome been available.  For example, I am grateful for the sequences of
>additional Group I introns from the chloroplast genome, but a more diverse
>range of sequences has become available from directed searches of many
>different genomes.

It took Darwin more than 20 years after his voyage to understand the
meaning of the data he had collected, and publish Origin of Species. Getting
the data is the easy part, and will take a finite time. Understanding it
will take generations.  I like to compare molecular biology to planetary
astronomy.  There is so much data from space probes (e.g. Ranger, Surveyor,
Mariner, Viking, Voyager, Pioneer) that it is reasonable to do an entire
PhD thesis without ever touching a telescope. Sadly, much of this data is
virtually inaccessible because NASA has not been able to do anything with
it other than store the tapes in vaults. The advances that our database
managers are making with biological sequence data should serve as a source
of optimism that the same thing will not happen to newly obtained sequence
data.

>And if you defer to thinking about the sequence of the human genome as a 
>tool, then it is a very, very expensive tool.  And again I would suggest 
>that finding the mechanism of one gene which causes MS is worth much more 
>than knowing  the sequences of all the genes together (when you don't know 
>what the genes do and still have to go back and find out).

Yes, it is cheaper to clone one gene than to sequence the human genome. But
there are several hundred known genetic diseases in humans. With the
entire genome sequenced, we will have the genes behind all of them. It
seems likely to me that sequencing the genome will be cost-effective
compared to hundreds of separate projects to clone individual genes.

Flavell, R.B., Rimpau, J. and Smith, D.B. (1977) Repeated sequence DNA
relationships in four cereal genomes. Chromosoma 63:205-222.

Narayan, R.K.J. (1985) Discontinuous DNA variation in the evolution of
plant species. (Indian) J. Genet. 64:101-109.

Narayan, R.K.J. (1982) Discontinuous DNA variation in the evolution of
plant species: the genus Lathyrus. Evolution 36:877-891.

Okamuro, J.K. and Goldberg, R.B. (1989) Regulation of plant gene expression.
In Stumpf, P.K. and Conn, E.E. (eds.) The Biochemistry of Plants, Vol. 15,
"Molecular Biology". Academic Press.

Query: Is Deaddog (Non-woof) really playing devil's advocate here,
challenging supporters of the genome project to justify it in a better way
than has been done up to now?  

Further query: Was Darwin doing 'real science' during his voyage on the
Beagle, or did that come only later, as he tilled the 'vegetable mould' in
his garden, pondering the meaning of 'certain facts in the distribution of
organic beings inhabiting South America'?

Brian Fristensky                | What can literature do against the pitiless 
Department of Plant Science     | onslaught of naked violence? Let us not for-
University of Manitoba          | get that violence does not and cannot flourish
Winnipeg, MB R3T 2N2  CANADA    | by itself; it is inevitably intertwined with
frist at ccu.umanitoba.ca          | LYING... Lies can stand up against much in
Office phone:   204-474-6085    | the world, but not against art.
FAX:            204-275-5128    |     Alexander Solzhenitsyn, NOBEL LECTURE 
