Gap penalties, PAM matrices and so on
Davison at UH.EDU
Fri Jun 26 16:13:45 EST 1992
Mark Cohen said:
> >- "mutation matrices ... differ, depending on whether they were derived
> > from protein pairs that are distantly homologous or from protein pairs
> > that are closely homologous". What a discovery !!
> >- how can anyone align confidently protein sequences that are "distantly
> > homologous" and use the results to build a matrix ?
> We did not align "distantly homologous" proteins and build a matrix from the results.
> We aligned all the proteins in the data base with all the others.
OK, but the question still stands: how does one confidently align
those segments that are "distantly homologous" (your term)?
> Where the scores obtained (using Dayhoff's 1978 matrix) indicated
> that the alignments were significant (i.e. that the probability of the
> alignment was significantly higher than alignment of two random
> sequences) these alignments were used in the construction of the matrix.
So, then, "distant" means "significant"?
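For readers following along, here is a rough sketch of one common way to ask
whether an alignment score is "significantly higher than alignment of two
random sequences": score the real pair, then score one sequence against many
shuffled copies of the other and express the real score as a z-score. This is
not the procedure from the paper under discussion. The match/mismatch/gap
values, the shuffle test, and the names local_score and shuffle_z_score are
all assumptions made for illustration; a real analysis would use a full
substitution matrix such as Dayhoff's.

```python
import random
import statistics


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not a PAM matrix.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # cur[j - 1] is the cell to the left, prev[j] the cell above.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def shuffle_z_score(a, b, trials=200, seed=0):
    """Return (observed score, z-score against shuffled copies of b)."""
    rng = random.Random(seed)
    observed = local_score(a, b)
    letters = list(b)
    background = []
    for _ in range(trials):
        rng.shuffle(letters)  # keeps composition, destroys order
        background.append(local_score(a, "".join(letters)))
    mean = statistics.mean(background)
    sd = statistics.pstdev(background)
    if sd == 0:
        return observed, float("inf")
    return observed, (observed - mean) / sd
```

Under this kind of test, a "distantly homologous" pair in the sense quoted
below would be one whose z-score clears whatever significance cutoff is
chosen but stays far below the values seen for obviously related sequences.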
> >- what are "distantly homologous" proteins ? [...]
> [Tautology deleted] Proteins for which the alignment score is high
> enough above the score of aligned random sequences yet not so high
> as to be unambiguously related.
If only that had been stated in the paper...
> >- what is the influence of the enormous redundancy found in protein
> > databanks (hundreds of cytochromes, thousands of histones, zillions of
> > globulins, ...)
> We will in future publish the matrices calculated with, without and only for
> the immunoglobulins. The results do not change our opinion significantly.
Ditto. Just the single sentence would have helped.
> >- the explanation for the -3/2 power concerning the probability of
> > a gap is a joke ?
> The k^-3/2 term is an experimental result. The probability of the
> two ends of a chain being close in space is dependent on the length
> of the chain as described in the paper, or you can read Flory's book
> on polymers.
I can't say where I heard this (perhaps they will post) but others
have found this result also.
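To make the k^-3/2 statement concrete, here is a small sketch of what such a
power law implies for a gap penalty. If the probability of a gap of length k
falls off as k^-3/2 (the same exponent that appears for ring closure of an
ideal random-coil chain in polymer theory), then the log-probability penalty
grows with the logarithm of k rather than linearly. The opening cost of 10
units and the 10*log10 scaling are assumed values for illustration only, not
figures from the paper.

```python
import math


def relative_gap_probability(k, exponent=1.5):
    """Probability of a length-k gap relative to a length-1 gap,
    assuming P(k) is proportional to k ** -exponent."""
    if k < 1:
        raise ValueError("gap length must be at least 1")
    return k ** -exponent


def gap_penalty(k, opening=10.0, exponent=1.5):
    """Penalty in 10*log10 units for a gap of length k.

    'opening' is a hypothetical cost for a gap of length 1; the
    length-dependent part is -10*log10 of the relative probability.
    """
    return opening - 10.0 * math.log10(relative_gap_probability(k, exponent))


if __name__ == "__main__":
    for k in (1, 2, 4, 10, 100):
        print(k, round(relative_gap_probability(k), 4), round(gap_penalty(k), 2))
```

Note the contrast with an affine penalty: here each doubling of the gap
length adds the same fixed amount, so long gaps are penalized far less
steeply than under a per-residue extension cost.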
> > Well, I prefer to stop here. May I draw your attention to the paper
> >by Jones, Taylor and Thornton in the last CABIOS issue ? Their aim was also
> >to build an updated Dayhoff matrix. They did it, with the difference that
> >their procedure is crystal clear. And that, by necessity, their matrix was
> >not built with "distantly homologous proteins".
> Jones et al. found, like us, that the differences between the Dayhoff
> 1978 matrix and the recalculated matrix were largest for the least
> common amino acid pairs, e.g. W-Y or W-C etc. Their paper is somewhat
> longer than ours hence their more detailed explanation.
I don't think so; I think it was just a matter of presentation. Just a
few sentences in Dr. Cohen's post have cleared a number of
dr. dan davison/dept. of biochemical and biophysical sciences/univ. of
Houston/4800 Calhoun/Houston,TX 77204-5934/davison at uh.edu/DAVISON at UHOU
-----RIP Isaac Asimov 1920-1992 I'll miss him --------------------
Disclaimer: As always, I speak only for myself, and, usually, only to