For the sake of clarity, it should be explained that
trees A and B of Moran's posting differ only in the
placement of the root. Without a root these trees are
topologically identical. Thus the central issue in Moran's
posting could be phrased simply as: "where does the root of the
HSP70 tree fall?"
Unfortunately, I think that given the data that Moran and others have
offered, it is an issue which cannot be adequately addressed. In
order to justifiably root this tree with another sequence, one must have
reason to believe that the gene duplication giving rise to the pair
of genes preceded the divergence of all of the organisms in question.
Since all of the sequences described as possible outgroups to these
trees (MreB, the E. coli hsc70 and the others mentioned) are found in
only ONE of the three "urkingdoms" of life, we have no strong data upon
which to base this assumption. All of these potential paralogs of
hsp70 could owe their low sequence similarity to an increased rate
of non-synonymous substitution following gene duplication in that
urkingdom; in other words, distant sequences don't always imply that
ancient gene duplications gave rise to them.
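To make this confound concrete, here is a minimal sketch. The function
name, the rates and the times are all hypothetical (arbitrary units, not
estimates from hsp70 or MreB), and the model ignores multiple
substitutions at the same site. It only shows that an old duplication
with slow rates and a young duplication with one accelerated copy can
give the same expected divergence.

```python
def expected_divergence(rate_copy1, rate_copy2, time_since_duplication):
    """Expected substitutions per site separating two paralogs.

    Each copy accumulates changes independently after the duplication,
    so the expected distance is (rate_copy1 + rate_copy2) * time.
    Multiple hits at the same site are ignored in this toy model.
    """
    return (rate_copy1 + rate_copy2) * time_since_duplication


# Hypothetical values: an old duplication with slow, equal rates ...
ancient = expected_divergence(0.25, 0.25, 4.0)
# ... versus a young duplication where one copy evolves seven times faster.
recent = expected_divergence(0.25, 1.75, 1.0)

print(ancient, recent)
```

Both scenarios yield the same distance, so the distance alone cannot
tell us when the duplication happened.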
Rooting the trees by trying to find a midpoint (in effect simply looking
for the pair of urkingdoms that share the most similarity in hsp70
sequence) is
plagued by the problem of unequal rates of sequence evolution in
different lineages. A systematic increase in the rate of substitution
in the eukaryotic lineage, for example, could turn tree A into tree B
(see Moran's posting for these dendrograms) using the midpoint method.
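The effect can be sketched with a toy example. The branch lengths below
are hypothetical, not measured from hsp70, and the three-branch star
tree is my own simplification of the three urkingdoms. Midpoint rooting
places the root halfway along the longest tip-to-tip path, so
lengthening one branch is enough to move the root onto it.

```python
from itertools import combinations


def midpoint_root(branches):
    """Locate the midpoint root of a three-taxon star tree.

    `branches` maps each tip to the length of its branch to the central
    node. Returns (tip, distance): the branch the root falls on and how
    far from that tip it sits.
    """
    # The path between two tips is the sum of their two branches.
    tip1, tip2 = max(combinations(branches, 2),
                     key=lambda pair: branches[pair[0]] + branches[pair[1]])
    half = (branches[tip1] + branches[tip2]) / 2
    # The midpoint of that path always lies on the longer of its branches.
    longer = tip1 if branches[tip1] >= branches[tip2] else tip2
    return longer, half


# Hypothetical branch lengths (substitutions per site), not real data.
clocklike = {"archaebacteria": 0.2, "eubacteria": 0.6, "eukaryotes": 0.2}
fast_eukaryotes = {"archaebacteria": 0.2, "eubacteria": 0.6, "eukaryotes": 1.0}

root_clocklike = midpoint_root(clocklike)
root_fast = midpoint_root(fast_eukaryotes)
print(root_clocklike[0], root_fast[0])
```

With the first set of lengths the root lands on the eubacterial branch;
with the eukaryotic branch lengthened it lands on the eukaryotic branch,
although the unrooted topology is unchanged. Which placement corresponds
to tree A or tree B is not something this toy example settles.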
Even if the putative paralogs of hsp70 turn out to really be the result of
an ancient gene duplication, the problem may not be clearly solvable.
Rooting trees with very divergent sequences may yield a tree aberrantly
rooted along the longest branch of the tree; long-branch attraction
may be a serious problem.
I agree with Moran that the scenario that Gupta and Golding have
suggested regarding the region where insertions occur in the N-terminal
quadrant of hsp70 is unlikely. Given the similarity in tertiary structure
of the two halves of the N-terminal quadrant of hsp70, plus
limited sequence similarity, it does indeed appear that the two halves
are duplicates. The N-terminal part of hsp70 could be diagrammed as follows:
X and X' have identical secondary structures as do Y and Y'.
These regions are separated by regions where there is no similarity
in tertiary structure between the halves (indicated by dashes).
The 25 amino acid insertion occurs in the first dashed region.
Alignment of the two halves in this dashed region is ambiguous. Thus
it is unclear whether the second half has a region homologous to
the 25 amino acid insertion or not. Similarly, the alignment of
MreB to hsp70 in these regions is ambiguous; one cannot tell whether
the relative shortness of the MreB sequence is due to the lack of this
insertion or not.
Finally, the evolutionary scheme for the origin of eukaryotes from the
fusion of a gram -ve bacterium and an archaebacterium seems to be a
rather complex scenario to explain several curious gene phylogenies.
It still does not explain why the archaebacteria do NOT form a clade
as a sister group to the gram +ves. Some of the intrinsic appeal for
such a scheme may come from the success of the endosymbiont
theories for the origin of mitochondria and chloroplasts. Many see
the nucleus as a third membrane-bound organelle which could have a
similar bacterial past. This is a fallacy: it is not membrane-bound
at all. It is like a specialized form of ER which partially encloses
the chromatin. The two nuclear membranes are topologically a single
membrane which is continuous with the ER. There is no evidence that
these membranes have a genetic affinity to a gram -ve eubacterial
endosymbiont (as is the case for mitochondria and chloroplasts).
Dept. of Biochemistry
aroger at ac.dal.ca