HSP70 evolution

L.A. Moran lamoran at gpu.utcc.utoronto.ca
Mon Jan 16 13:30:50 EST 1995


Two different versions of HSP70 dendrograms have been proposed. The tree
shown in Figure A is favored by Gupta and his colleagues (see Gupta and
Golding, 1993; Gupta and Singh, 1994; Falah and Gupta, 1994) while the
topology proposed in Figure B is shown in Boorstein et al. (1994) and Rensing
and Maier (1994). My own dendrograms and those constructed by Sharon Shtang 
in my lab (Shtang, 1994) are like tree B.


          |---- BiP                            |------ BiP
        |-|                                 |--|
        | |---- hsc70/hsp70                 |  |------ hsc70/hsp70
     |--|                                   |
     |  | |---- organelles                --|    |---- organelles
     |  |-|                                 |  |-|
   --|    |---- gram neg.                   |  | |---- gram neg.
     |                                      |--|
     |  |------ archaebacteria                 | |---- archaebacteria
     |--|                                      |-|
        |------ gram pos.                        |---- gram pos. 

              A                                      B

I have simplified these dendrograms by not showing that the HSP70's from
mitochondria and chloroplasts cluster *within* the gram negative group and
by suggesting that the archaebacterial sequences form a monophyletic group.
It is very important to keep in mind that the data on archaebacterial dnaK
sequences shows clearly that there is no such category as "archaebacteria"
that is distinct from gram positive bacteria. This fact must be explained
by any theory that tries to rationalize the HSP70 trees with those from
ribosomal RNA.

The two trees are profoundly different. Tree B suggests that the ancestral
HSP70 gene was present in an organism that gave rise to all prokaryotes on 
the one hand and all eukaryotes on the other. In other words, the primary
division of HSP70 genes is eukaryotes/prokaryotes.

Tree A suggests that eukaryotic HSP70 genes are more closely related to
those from gram negative bacteria and that the primary division of HSP70
genes is gram neg.+eukaryotes/gram pos.+archaebacteria. Tree A is explained
by proposing that the ancestral HSP70 gene resembled that seen in modern
archaebacteria and gram positive bacteria and that the eukaryotic HSP70
genes are derived from a gram negative bacterium that fused with a primitive
archaebacterial cell to form the first eukaryote.
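
The difference between the two rooted topologies can be made concrete with a
small sketch. This is only an illustration, not part of the original analysis:
the nested tuples are a hand transcription of the simplified figures above,
and the names TREE_A, TREE_B, PROKARYOTES, leaves, clades and primary_division
are hypothetical helpers. "organelles" is counted with the prokaryotes here
because the organelle sequences cluster within the gram negative group.

```python
# Hand-transcribed versions of the simplified dendrograms A and B.
TREE_A = ((("BiP", "hsc70/hsp70"), ("organelles", "gram neg.")),
          ("archaebacteria", "gram pos."))
TREE_B = (("BiP", "hsc70/hsp70"),
          (("organelles", "gram neg."), ("archaebacteria", "gram pos.")))

# Assumption: organelle sequences are grouped with the prokaryotes.
PROKARYOTES = frozenset(
    ["organelles", "gram neg.", "archaebacteria", "gram pos."])


def leaves(node):
    """Return the set of leaf names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Return one leaf set per internal node, including the root."""
    if isinstance(node, str):
        return set()
    found = {leaves(node)}
    for child in node:
        found |= clades(child)
    return found


def primary_division(tree):
    """Return the leaf sets on either side of the root."""
    return [leaves(child) for child in tree]


if __name__ == "__main__":
    for name, tree in (("A", TREE_A), ("B", TREE_B)):
        left, right = primary_division(tree)
        print(name, sorted(left), "|", sorted(right))
        print("  prokaryotes form one clade:", PROKARYOTES in clades(tree))
```

In this encoding the prokaryotes form a single clade only in tree B, which is
the sense in which the primary division of tree B is eukaryotes/prokaryotes.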

Although I have shown a root for these two trees, the actual dendrograms
are usually unrooted. However, it is possible to root the trees using a
distantly related sequence such as that from beet yellow virus. Rensing
and Maier (1994) have done this and their data supports tree B. However,
the similarity between the beet yellow virus sequence and HSP70's is very
low and only covers the N-terminal half of the gene. Thus one might question
whether a valid root of the dendrogram can be obtained by using this gene
as an outgroup. 

There are many other distantly related genes that can be used to root the
HSP70 dendrogram: the mouse hsr.1 gene; the human stch gene; three different
yeast genes identified by sequencing yeast chromosomes (YKL073w, H8025.17,
L8039.4); and the sperm receptor group. The sperm receptor group consists of
the following homologous (orthologous?) genes: sea urchin sperm receptor,
yeast Msi3P (SSE1, SSE2), human hsp70H, and a gene from C. elegans. Some of
these genes also exhibit low levels of overall sequence similarity but the
regions of similarity cover almost the entire conserved length of HSP70. They
are much more similar to HSP70's than the bacterial MreB proteins. Using these
genes as outgroups generates tree B. I assume that Gupta and his colleagues
will argue against using these genes as outgroups to root the tree but I will
be very interested to see their rationale.

There is a better outgroup sequence. A second member of the HSP70 family in E.
coli has been cloned and sequenced by two groups (Seaton and Vickery, 1994;
Kawula and Lelivelt, 1994). These sequences have been available since March of
last year. The reported amino acid sequences (Accession nos. U05338, U01827)
differ at a few sites but these differences do not affect the phylogenetic
analysis. I will call the second E. coli gene hsc70 but it is also known as
hsc or hscA. Gupta and Singh (1994) claim that this second HSP70 gene is
closely related to the dnaK gene and that it probably evolved relatively
recently by a gene duplication. My analysis of the E. coli hsc70 sequence
suggests that it is, in fact, very distantly related to all known HSP70 genes. 

When E. coli hsc70 is used as an outgroup the resulting dendrogram shows that
all eukaryotic proteins form a single cluster that is distinct from the
prokaryotic sequences. The gram positive/archaebacterial proteins group with
the other prokaryotes. In other words, tree B and not tree A is correct. It is
difficult to reconcile this result with the relationship between gram negative
bacteria and eukaryotes proposed by Gupta and his colleagues.

There are other ways of determining the robustness of the two different
dendrograms that do not involve using a distantly related sequence. Tree A
predicts that all of the eukaryotic sequences will be more similar to
sequences from gram negative bacteria than to sequences from gram positive
bacteria and archaebacteria. Tree B predicts that all prokaryotic sequences
will be more similar to each other than to sequences from eukaryotes.

Visual inspection of the entire aligned sequence database demonstrates clearly
that all prokaryotic sequences are closely related and that they differ
significantly from the eukaryotic sequences. This data supports tree B.
Examples of selected regions (signature sequences) are shown in Gupta and
Golding (1993) and Gupta and Singh (1994) but there are many more eukaryotic
vs. prokaryotic signatures in addition to the ones that they show. A table of
identity scores is presented in Boorstein et al. (1994). Comparing prokaryotic
and eukaryotic sequences shows identity scores that cluster in the narrow
range of 45-55% while no two prokaryotic sequences are less than 56% identical.
(The vast majority of scores between prokaryotic sequences are greater than
60% even when comparing the gram neg.+organelle cluster with the gram
pos.+archae cluster.) Such scores are not consistent with Gupta's tree A but
they are consistent with tree B.
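
The kind of comparison described above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the method used by
Boorstein et al. (1994): identity is taken as the percentage of matching
residues over alignment columns where neither sequence has a gap, and the
function names and the toy sequences are hypothetical (they are not real
HSP70 data).

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def split_scores(seqs, domain):
    """Collect prokaryote/prokaryote scores and prokaryote/eukaryote scores."""
    within_prok, between = [], []
    for a, b in combinations(seqs, 2):
        score = percent_identity(seqs[a], seqs[b])
        kinds = {domain[a], domain[b]}
        if kinds == {"prok"}:
            within_prok.append(score)
        elif kinds == {"prok", "euk"}:
            between.append(score)
    return within_prok, between


# Made-up toy alignment, for illustration only.
toy_seqs = {
    "euk1": "AAAAAAAAAA",
    "euk2": "AAAAAAAAAC",
    "pro1": "AAAAACCCCC",
    "pro2": "AAAAACCCCD",
}
toy_domain = {"euk1": "euk", "euk2": "euk", "pro1": "prok", "pro2": "prok"}

if __name__ == "__main__":
    within_prok, between = split_scores(toy_seqs, toy_domain)
    print("lowest prokaryote/prokaryote score:", min(within_prok))
    print("highest prokaryote/eukaryote score:", max(between))
```

Tree B predicts the pattern reported above: the lowest score between two
prokaryotic sequences is still higher than the highest score between a
prokaryotic and a eukaryotic sequence.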

Comparisons between prokaryotic and eukaryotic HSP70's are complicated by
the presence of a region near the N-terminus where several deletions and
insertions have occurred. The genes from gram positive bacteria are missing
25 codons that are present in most other genes. Archaebacterial sequences
are missing 24 codons. Some other bacterial genes have 2 or 3 fewer codons
than the average eukaryotic gene. Some eukaryotic genes have 26 codons in 
this region.
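
Codon counts like these can be read off an alignment by counting the non-gap
characters that each sequence has in the variable region. The sketch below
assumes a protein alignment with "-" as the gap character; the function name,
the column coordinates and the toy strings are hypothetical and are not real
HSP70 sequences.

```python
def residues_in_region(aligned_seq, start, end, gap="-"):
    """Count non-gap characters in alignment columns start..end-1."""
    return sum(1 for ch in aligned_seq[start:end] if ch != gap)


# Made-up toy alignment: columns 5 to 8 play the role of the variable region.
toy_alignment = {
    "with_extra_codons":    "ACDEFGHIKLMN",
    "without_extra_codons": "ACDEF----LMN",
}

if __name__ == "__main__":
    for name, seq in toy_alignment.items():
        print(name, residues_in_region(seq, 5, 9))
```

The difference between two such counts is the size of the insertion or
deletion separating the two sequences in that region.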

Gupta and his colleagues believe that the ancestral gene had the fewest codons
and that the genes from gram negative bacteria and all eukaryotes have
acquired an insertion in this region. This is their main argument in support
of tree A. They tend to discount the overall sequence similarities and
signature sequences, even though their original tree resembled tree B because:

     "...the archaebacterial and eubacterial HSP70 sequences are much
      more closely related to each other than to the eukaryotic
      lineage." (Gupta and Singh, 1992) 

In a later publication (Gupta et al., 1994) they claim that:

     "...it is clear that all of the eukaryotic HSP70 homologs ... share
      a number of sequence features in common with the Gram-negative group
      of bacteria. The most prominent of these features is the presence
      of a relatively conserved insert (sic) of between 23 and 27 amino
      acids in the N-terminal quadrant following the sequence KRLIG, which
      is not found in any of the homologs from archaebacteria or Gram-
      positive bacteria. The presence of various unique, shared sequence
      features between these species and detailed phylogenetic analyses
      of the HSP70 data (the results of which are not affected by excluding
      this region) provide strong evidence that the eukaryotic HSP70
      homologs have evolved from a Gram-negative eubacterial ancestor."

I do not agree. My analyses show that all prokaryotic sequences cluster
together in a monophyletic group that is distinct from the eukaryotic
sequences (tree B). This result is obtained even when the region containing
the insert/deletion is included but it is much more apparent when this
region is excluded.

Tree B is also consistent with the sequences in the region containing the
deletion/insertion but in this case it is assumed that the ancestral gene
contained the extra codons and these have been deleted in the cluster
containing gram positive bacteria and archaebacteria. Recall that tree B is
also consistent with the overall sequence similarity but tree A is not. 

Gupta et al. have advanced two arguments in favor of their interpretation.
They claim that the HSP70 ancestral gene must have been missing the 25
codons because:

     1. The genes from gram positive bacteria and archaebacteria show
        evidence of an internal duplication in the first two quadrants.
        These sequences can only be aligned when the extra 25 amino acids
        are ignored. Thus, the ancestral gene was constructed by
        duplication of a primordial sequence followed by the addition
        of several inserts and the C-terminal half of the gene. Later on 
        25 codons were inserted into the first quadrant in the genes from 
        gram negative bacteria and eukaryotes.

     2. The MreB genes reveal "highly significant similarity" to the
        first half of the HSP70 sequences if the extra 25 amino acids
        are ignored. Thus, the MreB genes must have evolved from the
        HSP70 ancestor after the presumed duplication and insertion of
        several short sequences but before the addition of the C-terminal 
        half of HSP70. Since the MreB genes do not have the "insert" this 
        must represent the primitive sequence and the extra codons have 
        been acquired later during evolution in gram negative bacteria 
        and eukaryotes.

Both of these arguments can be contested. The evidence for an internal
duplication is weak and it should not be considered to be a proven fact. The
evidence that MreB genes are homologous is also not strong. These genes may
not share a common ancestor with HSP70's. Furthermore, all of those genes that
seem truly homologous to HSP70's (beet yellow virus, the mouse hsr.1 gene, the
human stch gene, three different yeast genes identified by sequencing yeast
chromosomes (YKL073w, H8025.17, L8039.4), the sperm receptor group, and
especially E. coli hsc70) contain the extra 25 amino acids, suggesting that the
ancestral gene was larger and the region was deleted in the gram
positive/archaebacterial cluster. In addition, there is no evidence of an
internal duplication in the MreB genes as predicted by Gupta's model.
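
The disagreement comes down to which outgroup is used to decide whether the
extra codons are ancestral or derived. A minimal sketch of that outgroup
comparison follows. The presence/absence values are taken from the statements
in this post rather than from the sequences themselves, and the function name
polarize and the dictionary layout are hypothetical.

```python
def polarize(ingroup, outgroup):
    """Infer the ancestral state from outgroups that all agree.

    Returns (ancestral_state, derived_taxa). If the outgroups disagree,
    or none are given, the result is (None, []).
    """
    states = set(outgroup.values())
    if len(states) != 1:
        return None, []
    ancestral = states.pop()
    derived = sorted(name for name, state in ingroup.items()
                     if state != ancestral)
    return ancestral, derived


# True means the extra ~25 codons are present (as stated in the text above).
ingroup = {
    "eukaryotes": True,
    "gram neg.": True,
    "gram pos.": False,
    "archaebacteria": False,
}
# Outgroups that are said here to contain the extra region.
distant_homologs = {"E. coli hsc70": True, "beet yellow virus": True}
# Outgroup used in the argument of Gupta et al.
mreb_only = {"MreB": False}

if __name__ == "__main__":
    print(polarize(ingroup, distant_homologs))
    print(polarize(ingroup, mreb_only))
```

With the distant HSP70 homologs as outgroups the region is inferred to be
ancestral and deleted in the gram positive/archaebacterial cluster. With MreB
as the only outgroup the same data imply an insertion in gram negative
bacteria and eukaryotes, which is why the homology of MreB matters so much.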

I conclude that the available evidence strongly supports tree B and
contradicts the proposal that eukaryotic HSP70 genes are more closely related
to those from gram negative bacteria. In other words there is no evidence from
analysis of HSP70 genes to support the idea that eukaryotes have arisen from
an ancient fusion of an archaebacterium and a gram negative bacterium.


Laurence A. Moran (Larry)


Boorstein, W.R., Ziegelhoffer, T. and Craig, E.A. (1994) Molecular
    evolution of the HSP70 multigene family. J. Mol. Evol. 38, 1-17.

Falah, M. and Gupta, R.S. (1994) Cloning of the hsp70 (dnaK) genes from
    Rhizobium meliloti and Pseudomonas cepacia: phylogenetic analysis of
    mitochondrial origin based on a highly conserved protein sequence.
    J. Bacteriol. 176, 7748-7753.

Gupta, R.S. and Singh, B. (1994) Phylogenetic analysis of 70kD heat shock
    protein sequences suggests a chimeric origin for the eukaryotic cell
    nucleus. Curr. Biol. 4, 1104-1114.

Gupta, R.S. and Singh, B. (1992) Cloning of the HSP70 gene from Halobacterium
    marismortui: relatedness of archaebacterial HSP70 to its eubacterial
    homologs and a model for the evolution of the HSP70 gene. 
    J. Bacteriol. 174, 4594-4605.

Gupta, R.S., Aitken, K., Falah, M. and Singh, B. (1994) Cloning of Giardia
    lamblia heat shock protein HSP70 homologs: implications regarding origin
    of eukaryotic cells and of endoplasmic reticulum. Proc. Natl. Acad. Sci.
    U.S.A. 91, 2895-2899.

Gupta, R.S. and Golding, G.B. (1993) Evolution of HSP70 gene and its
    implications regarding relationships between archaebacteria, eubacteria,
    and eukaryotes. J. Mol. Evol. 37, 573-582.

Kawula, T.H. and Lelivelt, M.J. (1994) Mutations in a gene encoding a new hsp70
    suppress rapid DNA inversion and bgl activation, but not proU
    derepression, in hns1 mutant Escherichia coli. J. Bacteriol. 176, 610-619.

Rensing, S.A. and Maier, U.-G. (1994) Phylogenetic analysis of the stress-70
    protein family. J. Mol. Evol. 39, 80-86.

Seaton, B.L. and Vickery, L.E. (1994) A gene encoding a new DnaK/hsp70 homolog
    in Escherichia coli. Proc. Natl. Acad. Sci. U.S.A. 91, 2066-2070. 





