Prion Digest V1 #6
Chris Swanson, Moderator
prion at STOLAF.EDU
Sat Apr 20 12:31:20 EST 1991
Prion Digest Sat, 20 Apr 91 Volume 1 : Issue 6
Origins of Viruses
Date: Thu, 18 Apr 91 14:13:21 -0500
From: swansonc at stolaf.edu
Our mail system seems to have dropped any messages for prion,
prion-request, or prion-archive on the floor yesterday (17 and 18 Apr).
As such, I lost all of these messages. Unless I replied to you
already, please re-send any messages.
One I remember in particular was from a news account (netnews or
something like that) from, I believe, nwu.edu (I could be wrong here)
flashing by. If you know of such an account that made a subscription
request this last week, please request them to re-send it with my
-Chris Swanson (Prion Digest Moderator)
Date: Wed, 17 Apr 91 12:29:10 -0400
From: Daniel Enxing <djex at ll.mit.edu>
Subject: Origins of Viruses
To: prion at stolaf.edu
This message is intended for your new Prion & Virus List.
The following is an attached File item from cc:Mail. It contains
eight bit information which had to be encoded to ensure successful
transmission through various mail systems. To decode the file use the UUDECODE
[ I uudecoded this text and found that the only 8-bit characters were ]
[ ^M's and the standard MS-DOS ^Z EOF at the end. I removed these and ]
[ replaced the uuencode text with the resulting clear text. ]
[ - Chris Swanson, Prion List Moderator ]
-- Plain text follows this line --
ARE VIRUSES RENEGADES?
Many years ago, in an attempt to understand how a metazoan cell could
possibly do the miraculous things that it does, I came across
John Platt's 'book model' of the cellular machinery ('Horizons in
Biochemistry', Academic Press, New York, N.Y. 1962) which starts
"The expression of genetic information in cells and
whole organisms is like the reading out of a complex
instruction manual, but the analogy extends to more
detail than is generally realized. The information
is linearly arranged in "words" that are "read out"
sequentially in time. There is one copying mechanism
(DNA polymerase) for reprinting the whole book, and
another (RNA polymerase) for selective read-out into
cell chemistry. The read-out is by "paragraphs"
(genes) and by "pages" (operons) that can either be
"closed" (repressed) or "opened" (induced), according
to contingent "instructions" (repressor-corepressor
complexes) from "references" (regulator genes) on
earlier pages or in "books" of adjacent tissues."
Although I admired this paper greatly, it was plain to me that,
with respect to eukaryotic processes, the model evaded a very
fundamental question, namely, "How are the 'pages' turned?".
In my quest for mechanisms that could serve this purpose I was
driven to the conclusion that cellular processes absolutely have
to be 'real-time' processes. One of the unexpected fringe
benefits of a 'real-time' model was that it suggests roles for
some of the so-called "junk code" in the genome.
I wrote a draft paper outlining the idea, but never got around
to having it published. The ideas, apparently, were too outre
at the time, because not many people (and even fewer biologists)
understood much about information processing -- and those who
knew anything about real-time systems were even sparser on the
ground. It may well be that, as Victor Hugo said, 'there is nothing as
powerful as an idea whose time has come' -- but in my experience
'there is nothing as impotent as an idea whose time has not yet come'.
Your announcement about a network for prions and viruses started
me thinking again, along slightly different lines.
The exquisitely orchestrated, very precise, processes carried out
inside the nucleus of a eukaryotic cell demand that the milieu
be very tightly controlled. The chances of a random bit of DNA,
introduced somehow into the nucleus, ever managing to get itself
inserted into the genome in such a way that it is replicated seem
to me to be negligible.
Yet, we know that viruses achieve this routinely. How can this be?
It soon polymerized on me that I already had a plausible
answer within my grasp and that with very minor modifications (to
my original paper, not to the model) one could derive a logically
satisfying (to me, at least) explanation of the origin of viruses.
I propose that the answer might be that a virus is not (and
never was) just a random bit of DNA. It is able to seize control
of the cellular replicating machinery because it has (coded
within itself) "inside information" -- derived, no doubt, from a
renegade ancestor, in a direct line -- who was once a member
in good standing of the organization and so could have been privy
to the detailed information necessary to success in this venture.
The ideas expressed here are entirely original. Any feedback would
be welcome. My network address is:
<djex at LL.MIT.EDU>
JUNK CODE & VIRUSES
In higher organisms the nuclear DNA is complexed with proteins
and some RNA, known collectively as 'chromatin'. Not quite all
the information in the cell is inherent in the chromatin; the
organelles (mitochondria &c.) have to be taken into account too.
For the purposes of this treatment, though, we shall accept
without question the "dogma" that 'all the information needed to
complete the organism, as well as the information that must be
used by the developing organism to commence its interactions with
its environment are inherent in the chromatin complex.'
Information can only exist in a context -- it cannot exist 'in
vacuo'. For information to be of use, it must be retrievable if
and when needed. The role of a library, as a repository of
information (an information base) can be fulfilled only if any
given item of information in it is retrievable on demand. If all
the books in the Library of Congress were to be thrown
haphazardly into a warehouse (or stored tightly packed in crates)
it would no longer be a library and the information per se would
effectively cease to exist.
There is an enormous amount of information in the mammalian
genome. This information is at the disposal of the cellular
machinery. The cell must, however, be able to gain access to
whatever specific information it needs whenever it needs it.
Each of us comes into being from a single cell, a fertilized
ovum. Every time the egg divides, each daughter cell inherits a
complete copy of the genome -- it inherits a portfolio of genes.
Its nucleus contains all the genes which, encoded in DNA, specify
all the different cells in the adult body. At some specific time
during development of the organism each cell specializes; it
becomes a liver cell or a kidney cell or a neuron, say. From
that time on, all its daughter cells will be of the same kind.
The formation of a specialized cell does not result from loss of
genetic material; rather it follows from a change in the reading
of the whole genome -- 'selective gene expression', as it is called.
Once a cell has differentiated, its metabolic behavior is also
determined. Even though the code in a liver cell, for
example, contains the 'programs' also used by a working kidney
cell, these may normally never be invoked by the cell. Only the
code that governs the metabolism of the liver cell can be allowed
to be expressed without serious deleterious consequences.
The nucleus must, therefore, embody specific regulatory
mechanisms capable of activating and deactivating particular
regions of the genome for RNA transcription and protein synthesis,
depending on the instantaneous state of the cell. The emergence
of the cellular machinery conferring the ability to express,
selectively, different regions of the genome (i.e. different
code) is what enabled metazoans to arise and evolve.
Some knowledge of regulatory functions in prokaryotes has been
gleaned (e.g. the lac operon), but the mechanisms by which the
selection of genetic potential in the eukaryotic cell is
accomplished are still largely unknown and represent one of the
most challenging problems in modern biology. In 1971, in an
editorial in 'Nature', it was declared that "the structure of the
eukaryotic chromosome is the vital issue that must be resolved
before research today in cell biology can produce a coherent set
of concepts instead of a mass of unrelated data." Almost 20 years
later, as far as I can tell, the problem is still largely unresolved.
It is my contention that prions and viruses represent part of the
regulatory machinery that 'escaped' and mutated. If this
conjecture were to be established as fact, it would offer the
possibility that prion- and virus-like artifacts might be used as
'probes' to elucidate the cellular machinery and give us greater
insight into their depredations within the cell.
A CYBERNETIC MODEL OF CELLULAR PROCESSES
It is generally accepted that information is encoded in the
sequences of nucleotides that constitute the DNA in a eukaryotic
cell. The details of the triplet code are now well known, and the
process of transcription, during which the encoded information is
precisely copied into complementary strands of RNA that
direct the synthesis of specific proteins, is well understood.
The codes for proteins constitute only a part of the genome. One
of the most awkward facts to account for when analyzing the heredity
of higher organisms is their great excess of DNA; the amount varies
with the species, of course, but there always seems to be far more
in the genetic material than can be accounted for by the sum of the
codons needed for protein production.
Some stretches of the 'redundant' code are thought to be regulators
which govern the production of protein (analogous to operons in
prokaryotes). In addition, there is a large amount of repetitive
code which seems to serve no apparent purpose. Some biologists
refer to this component as "junk code."
In attacking the problem of regulation, the first question is one
of strategy: how should one attempt to resolve the issue?
Since information is the currency of genetic transactions, it
seems natural to try to consider the problem from an information-
processing point of view.
Nature (if I may be permitted an anthropomorphic metaphor) is a
tinkerer, not a designer. The structures that we uncover are
"Rube Goldberg" contraptions -- superbly engineered and optimised
through the agency of natural selection, but kludges
nevertheless. Experience with computer systems, which are orders
of magnitude simpler, shows that it is supremely difficult to
fathom the logic behind such 'ad hoc' constructions "from the
bottom up". Is there perhaps a different approach with greater
Starting with some rather basic assumptions, a case will be made
for selecting a particular information-processing structure. A
model of this structure will be described and some consequences
will be drawn from the given model. Finally, it will be shown
that the cell supports processes similar to those required by the model.
(1) The physical information structure which resides in the
genomic DNA is LINEAR (or, at most, closed in the form
of a ring); IT IS NOT BRANCHED OR STRUCTURED IN ANY
(2) Processing of the information in the DNA does not start
at some (global) 'beginning' and proceed sequentially
(endlessly) from there on. That is to say, even though
a particular stretch of code is expressible (locally)
sequentially, for a specific protein, the transcription
site for the next product need not necessarily be
adjacent to it.
(3) Nor is it random.
(4) The program embodied in the genome must be responsive
to patterns of input 'signals' from three levels:
These assumptions imply that the logic of the process is BRANCHED
even though the code for the process is LINEAR.
THE CELLULAR PROCESSES HAVE TO BE UNDER THE CONTROL OF THE
LOGICAL EQUIVALENT OF AN INHERENT EXECUTIVE 'PROCESS CONTROL
SYSTEM' WHICH MONITORS INTERNAL AND EXTERNAL SIGNALS, RECOGNISING
AND RESPONDING TO SPECIFIC PATTERNS OF STIMULI ACCORDING TO THE
CURRENT STATE OR CONTEXT OF THE SYSTEM.
FURTHERMORE, THIS EXECUTIVE SYSTEM IS A 'REAL-TIME SYSTEM', IN
THAT IT MUST BE CAPABLE OF IMMEDIATELY SUSPENDING ONGOING
ACTIVITY AND SWITCHING TO A NEW MODE OF OPERATION -- UNDER
CONTROL OF SOME INTERNALLY-EMBODIED PROTOCOL -- WHEN A NEW
STIMULUS IS DEEMED TO HAVE A HIGHER PRIORITY THAN THE PROCESS
CURRENTLY BEING EXECUTED.
An example of a real-time reaction might be the response of a
cell to adrenalin.
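The real-time requirement stated above can be illustrated with a toy scheduler. This is only a sketch under stated assumptions: the function name run_executive, the stimulus names, and the convention that a lower number means a more urgent priority are all hypothetical, and nothing here is claimed about the actual cellular mechanism.

```python
import heapq


def run_executive(stimuli):
    """Simulate a preemptive executive, one time step at a time.

    stimuli: list of (arrival_step, priority, name, duration) tuples.
    A lower priority number means a more urgent stimulus (an assumption
    of this sketch). Returns the name of the process run at each step.
    """
    pending = sorted(stimuli)  # ordered by arrival step
    ready = []                 # heap of (priority, arrival, name, remaining)
    log = []
    step = 0
    i = 0
    while i < len(pending) or ready:
        # Admit every stimulus that has arrived by this step.
        while i < len(pending) and pending[i][0] <= step:
            arrival, priority, name, duration = pending[i]
            heapq.heappush(ready, (priority, arrival, name, duration))
            i += 1
        if ready:
            # Always run the most urgent process; anything less urgent
            # that was running is thereby suspended, not discarded.
            priority, arrival, name, remaining = heapq.heappop(ready)
            log.append(name)
            if remaining > 1:
                heapq.heappush(ready, (priority, arrival, name, remaining - 1))
        else:
            log.append("idle")
        step += 1
    return log


trace = run_executive([
    (0, 5, "housekeeping", 4),        # routine, low-priority activity
    (1, 1, "adrenalin_response", 2),  # urgent stimulus arriving later
])
print(trace)
```

In this run the routine process is suspended when the urgent stimulus arrives and resumes only after the urgent response has finished, which is the behaviour the capitalized statement asks of the executive.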
If we accept Herbert Simon's argument that biological systems
have to be hierarchical because there has not been enough time
for any other kind of system to evolve, then 'execution' of the
'program' embodied in the chromatin could be represented by a
branching structure or hierarchical 'tree'.
Such an executive system would be capable of mapping a linearly-
ordered physical information structure into a logically-ordered
branched 'time-series' of processes. To do this, it must be able
to 'address' specific segments of the program (code) as needed.
For this hypothesis to be viable, it is necessary to show that
there exist mechanisms capable of accomplishing this feat.
Because of the addressing structure (the memory organization) and
the sequential nature of programs in a digital computer, the
executive can invoke a specific process at will by 'pointing' to
it -- transferring control to it by reading the appropriate
address into the program counter. The genome does not appear to
have any such addressing structure; it is, so to speak,
'diffuse'. One way of achieving the desired effect in a diffuse
structure would be to seal off all code except that for the
process called for at the moment.
This seems to be the method that actually evolved. The coiling
and supercoiling of the paired DNA strands are the means whereby
only certain sites are allowed to be active at any time. Code
that is not meant to be expressed at that time is 'hidden'
within the coils -- only code meant to be expressible in the
given context is exposed by the local uncoiling of the DNA
strands. Such an arrangement would call for 'filler', to keep
unexpressible code far enough away to be inaccessible, and this
filler may be an important component of the so-called 'junk'
It is immediately obvious that a liver cell, for example, could
become specialized by sealing off forever (by phosphorylation?)
all code not specific to the metabolism and replication of liver
cells. The executive system could orchestrate cell activity by
opening and closing sections of code selectively. An agent that
interfered with the seals and allowed 'outlaw' code to be
expressed during replication, might cause tumors or cancerous
cells to develop.
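The 'sealing' scheme described above can be sketched in a few lines. Everything in this sketch is an illustrative assumption: the class Genome, its methods, and the segment names are hypothetical stand-ins for the idea of a linear code in which only exposed segments can be read.

```python
class Genome:
    """Toy model: a linear sequence of named code segments."""

    def __init__(self, segments):
        # A dict preserves insertion order, standing in for the linear DNA.
        self.segments = dict(segments)
        self.sealed = set()   # permanently closed by differentiation
        self.exposed = set()  # currently uncoiled and readable

    def differentiate(self, allowed):
        """Seal off, for good, every segment not named in `allowed`."""
        self.sealed = set(self.segments) - set(allowed)
        self.exposed -= self.sealed

    def expose(self, name):
        """Uncoil one segment and coil up everything else."""
        if name not in self.segments:
            raise KeyError(name)
        if name in self.sealed:
            raise PermissionError(f"segment {name!r} is sealed")
        self.exposed = {name}

    def express(self):
        """Read out only the exposed code, in linear order."""
        return [code for name, code in self.segments.items()
                if name in self.exposed]


genome = Genome([
    ("liver_enzymes", "L"),
    ("kidney_transport", "K"),
    ("replication", "R"),
])
genome.differentiate(["liver_enzymes", "replication"])  # becomes a 'liver cell'
genome.expose("liver_enzymes")
print(genome.express())  # only the liver code is readable
```

An agent that 'interfered with the seals' would correspond, in this sketch, to something that removed names from the sealed set, after which expose() would no longer refuse the 'outlaw' segment.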
Because of the real-time nature of the system, the executive
process itself always has to be 'resident' (i.e. available) to
avoid the condition that programmers refer to as the 'deadly
embrace'. To illustrate this condition, consider the analogous
problem of executing a very large program in a computer with a
disk drive but very limited random access memory. Each
successive program segment has to be read in as needed. The disk
driver -- that process which actually causes the data to be read
from the disk into memory -- always has to be resident in memory.
If it were to be inadvertently swapped out to disk, a 'deadly
embrace' would result because now there would be no way to read
in the next segment. By the same token, the 'code' for processes
that, when expressed, cause the currently-open segment of the genome
to wind up and which open the next appropriate segment, always has
to be available when needed. Operationally, this means that
every lowest common denominator of open code has to contain its
own copy of the executive process, so there will have to be a
multiplicity of copies of the executive system distributed
throughout the system.
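The disk-driver analogy above can be made concrete with a small simulation. It is a sketch only: the class OverlayMachine, the helper run, and the segment names are hypothetical, and the "machine" is reduced to a set of names held in memory.

```python
class OverlayMachine:
    """Toy machine that must swap program segments in from disk."""

    LOADER = "loader"  # stands in for the disk driver / executive code

    def __init__(self, disk_segments):
        self.disk = set(disk_segments) | {self.LOADER}
        self.memory = {self.LOADER}  # the loader starts out resident

    def swap(self, out_name, in_name):
        """Evict out_name (may be None) and read in_name from disk."""
        if self.LOADER not in self.memory:
            # Nothing left in memory knows how to read the disk.
            raise RuntimeError("stuck: the loader is not resident")
        if in_name not in self.disk:
            raise KeyError(in_name)
        self.memory.discard(out_name)
        self.memory.add(in_name)


def run(machine, plan):
    """Apply (evict, load) steps; report what was loaded and whether
    the whole plan completed."""
    loaded = []
    for out_name, in_name in plan:
        try:
            machine.swap(out_name, in_name)
        except RuntimeError:
            return loaded, False
        loaded.append(in_name)
    return loaded, True


good = OverlayMachine(["seg1", "seg2", "seg3"])
print(run(good, [(None, "seg1"), ("seg1", "seg2"), ("seg2", "seg3")]))

bad = OverlayMachine(["seg1", "seg2", "seg3"])
# The second step inadvertently evicts the loader itself.
print(run(bad, [(None, "seg1"), ("loader", "seg2"), ("seg2", "seg3")]))
```

The second run halts as soon as the loader has been swapped out, which is the situation the text argues the genome avoids by keeping a copy of the executive code in every open segment.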
A 'bug' in the program (a mutation, that is) in the sequence
coding for a protein may or may not be lethal. If it is not lethal,
it might be neutral or even beneficial -- or it might have delayed
effects, causing complications later (e.g. sickle-cell anemia).
Executive systems (judging from experience with computer
systems) are far less tolerant of aberrations. An error in the
executive process is most likely to be lethal; so one could
expect the code to be highly conservative. This means that the
multiple copies of the executive system are likely to be very
similar, providing another source of repetitive non-protein-
The executive process itself, furthermore, may not be monolithic,
but may itself need to be distributed. This probably would
entail a good deal of filler, adding to the non-functional
Five (of eight) histones have been isolated and sequenced from
a wide spectrum of eukaryotic species, suggesting that their structure 'froze
over' very early in the history of eukaryotic organisms. One might
expect that the executive 'machinery' -- playing, as it does,
such a fundamental role in cellular function -- would have had to
have come into being equally early on. Indeed, the histones,
needed for coiling the DNA, are very much part and parcel of the
same regulatory processes. It follows that the regulatory machinery
would be equally widespread and at least as conservative. A search
for such invariant processes of repetitive DNA would pinpoint the
sections of the genome that represent the executive system and
isolate them for further study.
Segregation of function in membrane-limited nuclei,
mitochondria and plastids is another hall-mark of eukaryotic
organisms. The separation of genes for complex organellar
elements may be a general principle of organelle and eukaryotic
biology. In the interests of efficiency, probably, these
organelles (e.g. mitochondria) have had some of their functions
(and the associated code) taken over by processes in the nucleus.
Mitochondria, for example, no longer make their own membranes.
Why, then, have they retained any code at all? A plausible
answer is given by this model.
An organelle may have to carry out some functions that cannot be
subservient to the current process in the nucleus. That is to
say, it has to carry out its process irrespective of what is
happening in the nucleus (respiration, for example) at the time.
By executing its own code, independently, it becomes an
'asynchronous' (satellite) processor, performing its appointed
function irrespective of the instantaneous state of contemporary
It seems likely that the logical tree in the DNA has three major
branches, each one controlling a specific function:
The Principle of Parsimony suggests that the developmental and
the replicative processes might share some common code.
The executive program is more than just a switching network -- it
is a dynamic process. It contains information it uses (and
modifies) to determine its pathways depending on the
Certain predictions can be made from this model:
o The chromatin will contain a large amount of repetitive
code, some of which (filler code) may seem non-
o Some of this repetitive code is functionally equivalent
to an executive process control system
- which is highly conservative, and therefore,
may seem 'primitive'
- will be found (modulo minor variants) across
a very wide spectrum (if not all) eukaryotic
o some of the non-histone chromosomal proteins are not
for export, but are generated solely for control
purposes within the nucleus
- A DNA sequence which generated such a protein, if it
escaped, might be the precursor of a virus.
o some intranuclear RNA may play a similar role. The
role of reverse transcriptase is to make this possible.
- An RNA sequence which generated such a protein, if it
escaped, might be the precursor of a retrovirus.
This model allows for evolutionary change in anatomy and way of
life to be based on changes in the information controlling the
expression of genes as well as point mutations in protein-
producing genes. So it is possible for species such as humans
and chimpanzees to differ so substantially in anatomic detail and
way of life and yet have proteins that are 99% similar.
in care of <DJEX at LL.MIT.EDU>
The "Prion Digest" is a Usenet distributed e-mail list, compiled from
postings to it, and distributed weekly (current plan is for early Sat.
While the main goal of the digest is to provide a resource for
researchers working with prions and interested bystanders, all are
welcome. All articles posted will be included in the next digest. If
a poster feels that his posting is of an urgent nature, it may be
distributed sooner than the regular digest. If you want to post an
"urgent" message send it to the prion-request address, not the prion
All requests regarding administrivia (subscriptions, cancellations,
comments, etc.) should be mailed to the moderator
<prion-request at acc.stolaf.edu>. All postings to the digest should be
directed to <prion at acc.stolaf.edu>.
There are archives of all back issues available via anonymous ftp from
beowulf at acc.stolaf.edu (22.214.171.124) in the pub/prion directory. If
you do not have ftp access, please write <prion-archive at acc.stolaf.edu>
and back issues will be mailed to you.
-- Chris Swanson (Prion Digest Moderator) <swansonc at acc.stolaf.edu>
End of Prion Digest