Squares and Sticks
LYONSW at UCONNVM.UCONN.EDU
Fri Jun 17 11:50:11 EST 1994
* Charles T. Faulkner *
* Univ of Tennessee, Knoxville *
* (ctfaulkn at utkvx.utk.edu) *
>I don't believe a name change is necessary, but I wonder why a statistical
>puzzle has to take the form of match sticks, tooth picks or something other
>than biological entities. Is the field so diverse that we cannot ask "what
>is the probability that 3 (or more) organisms from these localities will
>share the same attribute (morphological, behavioral etc.)?"
You may ask any question you wish. That is NOT the question I posed.
Here's the question with relevant biological terms included:
I've got a real problem, and I hope to tap into the community brain power.
I'm a graduate student interested in furthering attempts in the field of
phylogenetics and systematics to derive a believable p-value to attach
to a phylogenetic hypothesis (a.k.a., tree).
Suppose I have 25 taxa in my study of 100 characters. If I consider
each character state which is identical in any given pair as potential
(or apparent) synapomorphy, I can build a matrix of scores of apparent
synapomorphy. If I consider each pair of taxa in all possible and
relevant (i.e., informative) three taxon statements, I can build
a matrix of RELATIVE APPARENT SYNAPOMORPHY. Any time a pair of taxa
shares a character state to the exclusion of a third taxon (any third
taxon), the RAS score for that pair for that character is increased by
one. If I sum all RAS scores for all characters, I have a total score, or
SIGMA RAS, which could be applied in a number of ways (one of which is
certainty estimation; another may be phylogeny estimation).
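
The scoring described above can be sketched in a few lines of Python. This
is only an illustration under stated assumptions, not a reference
implementation: the names ras_matrix and sigma_ras and the toy data are
made up for the example, missing data are ignored, and the score is assumed
to go up by one for EVERY third taxon that lacks the shared state (one
reading of "any third taxon").

```python
from itertools import combinations

def ras_matrix(data):
    """Return {(taxon_a, taxon_b): RAS score} summed over all characters.

    data maps a taxon name to its list of character states.
    Assumption: a pair scores +1 per third taxon lacking the shared state.
    """
    taxa = sorted(data)
    n_chars = len(data[taxa[0]])
    ras = {pair: 0 for pair in combinations(taxa, 2)}
    for c in range(n_chars):
        for a, b in combinations(taxa, 2):
            if data[a][c] != data[b][c]:
                continue  # no apparent synapomorphy for this pair
            # Each third taxon with a different state gives one
            # informative three-taxon statement (a,b),t.
            ras[(a, b)] += sum(
                1 for t in taxa
                if t not in (a, b) and data[t][c] != data[a][c]
            )
    return ras

def sigma_ras(ras):
    """Total of all pairwise RAS scores (SIGMA RAS)."""
    return sum(ras.values())

# Hypothetical toy matrix: 4 taxa, 3 characters.
toy = {
    "A": [0, 0, 1],
    "B": [0, 1, 1],
    "C": [0, 1, 0],
    "D": [1, 1, 0],
}
scores = ras_matrix(toy)
total = sigma_ras(scores)
```

For the real problem the same function would be called on a 25-taxon,
100-character matrix; the nested loops are cubic in the number of taxa per
character, which is trivial at that size.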
My question is: one can know the following:
(A,B),D and (A,C),D
for a given character. However, given this, one can infer
(B,C),D on the basis of that same character. One could make the
argument that there are, therefore, fewer degrees of freedom in the
RAS matrix than there are scores that can actually be calculated, and that
any manipulation is more suspect than it would be if we knew which RAS
scores for which taxa were independent and which were not.
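
The dependence among three-taxon statements can be checked by brute force.
The sketch below is illustrative only: the helper name supports and the
choice of three possible states are assumptions. It enumerates every state
assignment for a single character over four taxa and confirms that
whenever (A,B),D and (A,C),D both hold, (B,C),D holds too, so the third
statement carries no new information.

```python
from itertools import product

def supports(states, x, y, z):
    """True if x and y share a state that z lacks, i.e. (x,y),z."""
    return states[x] == states[y] and states[x] != states[z]

taxa = ("A", "B", "C", "D")
premise_cases = 0   # assignments where (A,B),D and (A,C),D both hold
violations = 0      # of those, assignments where (B,C),D fails

# Assumption: three possible states (0, 1, 2) for the character.
for combo in product(range(3), repeat=len(taxa)):
    states = dict(zip(taxa, combo))
    if supports(states, "A", "B", "D") and supports(states, "A", "C", "D"):
        premise_cases += 1
        if not supports(states, "B", "C", "D"):
            violations += 1
```

The implication is just transitivity of "has the same state as", which is
why it holds for any number of states and any number of taxa.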
Since I hope that I'm more of a biologist than a statistician, I
hope that there are those out there who recognize the conundrum
I have; the observation that two taxa share a character state is
sometimes less surprising than it should be, and sometimes more
surprising than it should be. Treating each observation of apparent
synapomorphy as independent could cause an underestimated error term.
But unless we know the actual history of lineage splitting, we cannot
tell when we should be surprised and when we should not be surprised!
caveat: by "we" I mean those with interest in the problem