
measuring distances by amino acid composition

Athel Cornish-Bowden athel at ir2cbm.cnrs-mrs.fr
Thu Jan 13 08:07:54 EST 2000

Thorsten wrote:
>James McInerney wrote:
>> Thorsten,
>> One of the programs in the molphy package calculates amino acid distance.
>> There is a reference to it on Joe Felsenstein's webpage:
>> http://evolution.genetics.washington.edu/
>Thanks. I have MOLPHY. However, the program PROTST.EXE does not estimate
>distances but plain aa composition. I would like to estimate distances
>based on the aa compositions of the different proteins.
>> I can send you a macintosh version of my program GCUA that will do this,
>> although it needs fasta-formatted protein-coding DNA sequences as input
>> (it converts to proteins and then calculates a distance matrix).
>Unfortunately, I don't have a Mac. UNIX or DOS are always welcome.
Nice to see that anyone is still interested in doing this after all these
years. I long ago gave up trying to convince people that aa compositions
contained useful information. Unfortunately I don't have a program, but
writing one would be trivial (a few minutes' work) if you start with one
that can read sequences. What you need to calculate is 0.5*Sum(square(niA -
niB)), where niA is the number of residues of type i in sequence A, niB is
the same in sequence B, and the sum is over all types of residue. If the
two sequences have the same lengths the result is an estimate of the number
of differences between the aligned sequences. If the lengths are
appreciably different the formula is more complicated, but still easily
programmable (see J. theor. Biol. 76, 369-386 (1979)). That paper
contained an error in its analysis of the statistical properties of the
index, which was not corrected until a Note Added in Proof on p. 75
of vol. 91 of Methods Enzymol. (1983). The effect of the error is that
earlier references to 95% confidence actually meant 92.5% confidence.
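
For anyone who wants to try the equal-length case, here is a minimal
sketch in Python of the sum described above. The function name
composition_distance and the use of plain one-letter sequence strings
are assumptions made for illustration; they do not come from MOLPHY,
GCUA, or the papers cited. The more complicated formula for sequences
of appreciably different lengths is not reproduced here.

```python
from collections import Counter


def composition_distance(seq_a: str, seq_b: str) -> float:
    """Return 0.5 * Sum((niA - niB)^2) over all residue types i.

    niA and niB are the counts of residue type i in each sequence.
    This is only the equal-length form; it makes no correction for
    sequences of appreciably different lengths.
    """
    counts_a = Counter(seq_a.upper())
    counts_b = Counter(seq_b.upper())
    # A Counter returns 0 for a residue type it has not seen, so the
    # union of keys covers every type present in either sequence.
    residue_types = set(counts_a) | set(counts_b)
    return 0.5 * sum((counts_a[r] - counts_b[r]) ** 2 for r in residue_types)


if __name__ == "__main__":
    # One substitution (E -> F) gives an estimate of one difference.
    print(composition_distance("ACDE", "ACDF"))
```

Because the index depends only on composition, two sequences with the
same residues in a different order score zero, which is why the result
is an estimate of the number of differences and not a count of them.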

Athel Cornish-Bowden

Bioenergetique et Ingenierie des Proteines,
Centre National de la Recherche Scientifique,
31 chemin Joseph-Aiguier, B.P. 71,
13402 Marseille Cedex 20, France (CHANGED 1.1.2000)

athel at ibsm.cnrs-mrs.fr
Phone: + 33 491 16 41 38; fax: + 33 491 16 45 78 (CHANGED)


Now available: Basic Mathematics for Biochemists (2nd edn.)

