Homology/similarity/identity: proper usage.

William R. Pearson wrp at biochsn.acc.Virginia.EDU
Fri Feb 1 12:06:01 EST 1991

In article <11223 at uhccux.uhcc.Hawaii.Edu> ronald at uhunix1.uhcc.Hawaii.Edu (Ronald A. Amundson) writes:
>In article <3824 at gazette.bcm.tmc.edu> steffen at mbir.bcm.tmc.edu (David Steffen) writes:
>>  I am again struggling with the proper use of the words "homology",
>>"similarity", and "identity" in comparing sequences.  ...
>problem here.  The term "homology" clearly is being used differently
>in molecular genetics from its usage in traditional evolutionary
>biology.  Steve Gould comments on the issue in his Natural History
>column for Feb. 1988, BTW, wishing that the molecular biologists would
>talk more like macro-biologists.  

	I do not believe that the use of the term is different.

>The problem with calling identical molecular sequences "homologies" is
>not _just_ that it implies a common source for the two sequences. 

	My understanding is that homology implies common
	ancestry (source) - nothing more or less.

>1)  Good macroscopic evolutionary inferences of homology are based on
>"shared derived" characteristics.  The nests of other sets of traits
>disallow certain similarities to count as homologies.  Mere similarity
>alone can never be used to judge two traits as homologous.

	I would argue that while "mere" similarity is
	insufficient, there are levels of similarity that
	allow one to infer homology and never be mistaken.
	Everyone accepts that two sequences that are 100%
	identical are homologous, and clearly one should
	not feel too uncomfortable with two sequences that
	are 90% identical (over their entire length).  The
	issue arises when two protein sequences share
	less than 20 - 25% identity.  But there are a
	series of tests, based on sequence similarity
	alone, that make it very unlikely that the
	inference of homology is incorrect.
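
As a rough illustration of the percentages above, here is a minimal Python sketch of percent identity for two sequences that have already been aligned. The function name percent_identity, the "-" gap character, and the choice to divide by the full alignment length (gap columns included) are assumptions made for this example; other denominators are in common use and give different numbers. It only measures identity and does not perform any of the statistical tests mentioned above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: both inputs are rows of one pairwise alignment, so they have
    equal length, and `gap` marks an insertion/deletion.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        return 0.0

    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        # A column counts as identical only when neither side is a gap.
        if x != gap and y != gap and x == y:
            identical += 1

    # Denominator: every alignment column, gaps included (one convention of several).
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKL"))  # 100.0
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))  # 90.0
```

A value near 100 or 90 from such a calculation corresponds to the easy cases; a value below 20 - 25 is where identity alone stops being informative and a significance test is needed.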

> (Unless
>I'm wrong) the "mere similarity" (i.e. molecular identity or
>similarity, in the absence of evidence provided by other hierarchies
>of traits) of molecular sequences is used as a sufficient criterion
>for the term "homology" in molecular genetics.

>2) It seems to me (insert disclaimer again) that when molecular
>biologists call sequences homologous, they mean that the two were
>copied from a similar ancestral _molecular sequence_.  But the
>processes of copying molecular sequences are not identical to the
>processes of reproducing organisms.  As I understand it, sequences can
>be copied within a genome, and with manipulation (and maybe some kinds
>of viral infection and other exotic stuff) between genomes.  So the
>genealogical tree connecting up similar sequences with their molecular
>ancestors will not be isomorphic with the genealogical tree connecting
>organisms with their ancestors.  

	Here, the terms "homologous" and "orthologous" are being confused.
	Two sequences are homologous if they share a common ancestor, no
	matter how complex and exotic the evolutionary path between that
	ancestor and the present.

>So it looks as if the molecular use of "homology" is a _different_ use
>from the normal evolutionary use of the same term.

	This is not true. We all agree on common ancestry.
	We may disagree on the amount of evidence required
	to support the assertion of common ancestry, but
	we all mean the same thing.

Bill Pearson

More information about the Mol-evol mailing list
