Gonnet et al. (Nomenclature)
gonnet at inf.ethz.ch
Wed Jul 1 02:24:20 EST 1992
In article <920629220811.21a51350 at sds.sdsc.edu> gribskov at SDSC.EDU (Michael Gribskov) writes:
> Personally, I believe this nomenclature to be extremely unfortunate. The
> term "Dayhoff matrix" is nearly universally used to mean the MDM78
> (mutation data scoring matrix; log-odds matrix for 250 PAMs). To call the
> Gonnet et al. matrix a Dayhoff matrix is to imply that it was derived by
> Dayhoff's methodology. A less connotation-loaded term would be log-odds
> matrix. The term PAM-250 matrix has also been used virtually as a synonym
> for the MDM78 matrix.
There are three types of matrices (names as in Dayhoff et al.):
"Number of accepted point mutations", which is the original collection
of data points. (Nobody uses this much further.)
"Mutation probability matrix", where entry Mij describes the probability
of amino acid "j" mutating into amino acid "i" in 1 PAM-unit of evolution.
This is what we (and I believe everybody else) call a mutation matrix, and
it is really a transposed Markov matrix. When M is raised to some power,
say 250, we obtain the mutation matrix (same definition as above) for an
evolution of 250 PAM-units. This is statistics 101.
"Log-odds matrix", a matrix of logarithms of quotients of probabilities,
suitable for running the dynamic programming algorithm (or NW) to align
sequences. We (and lots of other people) use "Dayhoff matrix" as a
generic name for this type of matrix, in honor of Margaret Dayhoff, given
that she and her coauthors were the first to derive such a beast.
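The powering step for the mutation probability matrix can be sketched as follows. This is only an illustration: the 3-letter alphabet and the numbers in M1 are made up (they are not real PAM-1 values), and the helper names mat_mul and mat_pow are hypothetical. The only convention taken from the text is that M[i][j] is the probability of "j" mutating into "i", so each column sums to 1.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, k):
    """Raise square matrix m to the non-negative integer power k
    by repeated squaring."""
    n = len(m)
    # Start from the identity matrix.
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in m]
    while k > 0:
        if k & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        k >>= 1
    return result


# Toy "1 PAM-unit" matrix over a hypothetical 3-letter alphabet.
# M1[i][j] = probability of letter j mutating into letter i (columns sum to 1).
M1 = [
    [0.990, 0.004, 0.002],
    [0.006, 0.992, 0.003],
    [0.004, 0.004, 0.995],
]

# Mutation matrix for 250 units of evolution: M1 raised to the 250th power.
M250 = mat_pow(M1, 250)

# Each column of the result is still a probability distribution.
col_sums = [sum(M250[i][j] for i in range(3)) for j in range(3)]
```

The result has the same meaning as M1 (entry i,j is the probability of "j" having become "i"), only for a longer evolutionary distance.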
Nobody who understands what is going on uses mutation matrices with
dynamic programming; mutation matrices are used to compute log-odds
matrices, or "Dayhoff matrices".
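As a rough sketch of that last step, a log-odds matrix can be computed from a mutation matrix and the background frequencies of the letters. Everything concrete here is an assumption made for illustration: the 3-letter alphabet, the numbers in M and f, the function name log_odds, and the scaling 10*log10 (a common convention, not something stated above). M[i][j] is again the probability of "j" mutating into "i".

```python
import math


def log_odds(m, freqs, scale=10.0):
    """Return S with S[i][j] = scale * log10(m[i][j] / freqs[i]).

    m[i][j] is the probability of letter j mutating into letter i,
    and freqs[i] is the background frequency of letter i, so the
    quotient compares "i aligned with j by descent" against
    "i aligned with j by chance".
    """
    n = len(m)
    return [[scale * math.log10(m[i][j] / freqs[i]) for j in range(n)]
            for i in range(n)]


# Hypothetical background frequencies for a 3-letter alphabet.
f = [0.5, 0.3, 0.2]

# Hypothetical mutation matrix for some fixed evolutionary distance.
# Columns sum to 1, and it was chosen so that M[i][j] * f[j] == M[j][i] * f[i],
# which makes the resulting log-odds matrix symmetric.
M = [
    [0.80, 0.20, 0.20],
    [0.12, 0.70, 0.15],
    [0.08, 0.10, 0.65],
]

S = log_odds(M, f)

for row in S:
    print(" ".join(f"{x:7.2f}" for x in row))
```

With these toy numbers the diagonal entries come out positive and the off-diagonal ones negative, which is the shape of score table that a dynamic programming alignment would add up along an alignment.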
I believe we all agree; it is just a problem of using the
right names. And I find that the literature has been a bit careless.
Gaston H. Gonnet, Informatik, ETH, Zurich.