Homology/similarity/identity: proper usage.

David Steffen steffen at mbir.bcm.tmc.edu
Wed Jan 30 13:56:01 EST 1991


  I am again struggling with the proper use of the words "homology",
"similarity", and "identity" in comparing sequences.  Specifically, we
have cloned and sequenced (a bit of) the rat homologue of the _lck_
gene.  The sequences of the mouse and human _lck_ genes are known.  How
do we know what we have is the rat homologue?  Because when we compare our
sequence to the published sequence, most of the nucleotides can be made
to match up with minimal futzing.  So how do I say that?  At present,
we are saying:
 "In all four cases, the inserts were found to contain sequences
  homologous to human and mouse lck..."
but one of my grad students points out that the word homologous is
incorrect, since it represents an inference about evolution rather
than a statement of fact.  My objection to replacing the word
"homologous" with the word "similar" is that it gives the impression
that the sequences don't match all that well.  My objection to
replacing the word "homologous" with the word "identical" is that the
sequences are not identical.  My objection to replacing the word
"homologous" with the words "##% identical" is that I would need four
different numbers for the four different tumors, making the sentence
practically unreadable.
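
  To make the "##% identical" figure concrete, here is a minimal
sketch in Python of how such a number could be computed. It assumes
the two sequences have already been aligned to equal length, with "-"
marking gaps. The function name percent_identity and the choice to
skip gapped columns are assumptions made for illustration, not an
established convention, and the sequences in the usage note are toy
strings, not real lck data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns where the two bases match.

    Assumes both inputs are already aligned (equal length, gaps as `gap`).
    Columns with a gap in either sequence are skipped (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped column: not counted either way
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

For example, percent_identity("AC-GT", "ACTGA") compares four ungapped
columns, three of which match. A figure like this describes only the
observed match; whether the sequences are homologous remains a
separate inference.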

  I guess if "similar" is the only correct word in this context, I
could live with that.  However, since I believe that we are dealing
with homologous sequences, is the word "homologous" really incorrect?
(I understand that "##% homologous" is always wrong; sequences are
either homologous or they are not.)

  Email me if you wish, but I suspect others may wonder about this as
well and that a discussion might be a "good thing".

-- 
David Steffen
Department of Cell Biology, Baylor College of Medicine, Houston TX 77030
Telephone = (713) 798-6655, FAX = (713) 790-0545
Internet = steffen at mbir.bcm.tmc.edu


