
phylogenetic autocorrelation

Alan R. Rogers rogers at ARSUN.UTAH.EDU
Fri May 25 18:22:59 EST 1990

I've been trying to figure out the Cheverud-Dow-Leutenegger (CDL) method of
phylogenetic autocorrelation, and am stuck.  I am posting my confusion in
the hope of generating a dialogue about (1) whether the problem I perceive is
real, and (2) how well the method works.

The aim of the method is to estimate the correlation, over several species,
of two characters.  This is not a trivial problem because the data points are
not independent: closely related species tend to be more similar than random
pairs of species.  CDL develop their method in two stages, the first of
which estimates the "phylogenetic constraint" involved in each character,
and the second of which attempts to estimate the correlation between what
are called the "specific components" of each character.  It all boils down
to a path analysis problem which, in simplified form, looks something like:

                 s1
        /-  S1 -------\
        |               X1
    /---+-  P1 -------/
    | rS|        q1
    |   |
    |   |        s2
  rP|   \-  S2 -------\
    |                   X2
    \-----  P2 -------/
                 q2

Here, Si and Pi represent the "specific" and "phylogenetic" components of Xi,
and si and qi are the corresponding path coefficients.  rS and rP are the
correlations between specific and phylogenetic components, respectively.

The equations implied by this path diagram are

         r = s1 s2 rS + q1 q2 rP
         1 = s1^2 + q1^2
         1 = s2^2 + q2^2

where r is the correlation between X1 and X2.  CDL's autocorrelation method
gives us the values of s1 and s2, and the 2nd and 3rd equations then tell us
q1 and q2.  This leaves 1 equation in two unknowns (rS and rP), which would
appear to have no unique solution.  Can anyone suggest where we might find
an additional equation that might give this system a sensible answer?
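
To make the indeterminacy concrete, here is a minimal sketch in Python.  The
values of r, s1 and s2 are made up purely for illustration (they are not taken
from CDL), and the helper names are hypothetical.  For any trial value of rS
the first equation can be solved for rP, so a whole line of (rS, rP) pairs
reproduces the same observed r; the only restriction is that both
correlations must lie in [-1, 1].

```python
import math

def q_from_s(s):
    """Path coefficient q implied by 1 = s^2 + q^2 (positive root assumed)."""
    return math.sqrt(1.0 - s * s)

def rP_given_rS(r, s1, s2, rS):
    """Solve r = s1*s2*rS + q1*q2*rP for rP, given a trial value of rS."""
    q1, q2 = q_from_s(s1), q_from_s(s2)
    return (r - s1 * s2 * rS) / (q1 * q2)

# Hypothetical inputs, chosen only for illustration.
r, s1, s2 = 0.5, 0.6, 0.8

for rS in (-0.5, 0.0, 0.5, 1.0):
    rP = rP_given_rS(r, s1, s2, rS)
    admissible = -1.0 <= rP <= 1.0  # a correlation must lie in [-1, 1]
    print(f"rS = {rS:+.2f}  ->  rP = {rP:+.3f}  admissible: {admissible}")
```

Every admissible pair printed by the loop satisfies the first equation
exactly, which is why the data alone cannot choose among them.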

CDL, by the way, study systems with more than two variables.  The problem
would appear to be even worse there, since each new variable adds more
parameters than it adds observable correlations.  For example, adding one
more variable to a system of K variables would add K observable correlations
(one with each of the original variables), but 2K unobservable parameters
that must be estimated (K correlations between specific values, and K
between phylogenetic values).
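
The counting argument can be checked with a short Python sketch.  It assumes,
as above, that the si come from the autocorrelation step and the qi follow
from 1 = si^2 + qi^2, so the only unknowns left are one rS and one rP per
pair of variables.  The function name is hypothetical.

```python
def count_parameters(K):
    """Return (observable correlations, unknown correlations) for K variables."""
    pairs = K * (K - 1) // 2   # number of distinct pairs of variables
    observed = pairs           # one observable correlation r per pair
    unknown = 2 * pairs        # one rS and one rP per pair
    return observed, unknown

for K in range(2, 6):
    observed, unknown = count_parameters(K)
    print(f"K = {K}: {observed} observed, {unknown} unknown, "
          f"shortfall = {unknown - observed}")
```

Under these assumptions the shortfall is K(K-1)/2, so it grows with every
variable added rather than shrinking.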


Alan Rogers
 INTERNET: rogers at arsun.utah.edu
 USMAIL  : Dept. of Anthropology, Univ. of Utah, S.L.C., UT 84112
 PHONE   : (801) 581-5529

More information about the Mol-evol mailing list
