James O. McInerney PhD
j.mcinerney at nhm.ac.uk
Wed Jul 10 07:02:43 EST 1996
I'm putting together a piece of software for LogDet distance measures.
In the 2nd edition of Molecular Systematics (Hillis et al., Eds.
1996), there is the suggestion that LogDet distances have a problem
when rate variation exists across sites. The authors recommend
removing some of the invariant sites (the exact number to be decided
by running a maximum likelihood program several times and choosing
the dataset that gives the highest likelihood). The theory behind this
approach is to have a dataset where all sites are evolving at
approximately the same rate.
My question: Say you remove 10 percent of all invariant sites at the
outset. Closely related taxa will still have an awful lot of invariant
sites in a pairwise comparison, whilst more distantly related taxa may
have very few. Should the approach be either (a) PAIRWISE removal of a
number of invariant sites or (b) removal of a proportion of invariant
sites at the outset?
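
To make the two options concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not the method from the book or from any existing program. It assumes one commonly used normalised form of the distance, d = -(1/4) ln[det F / sqrt(det Px det Py)], where F is the 4x4 matrix of joint base frequencies for a pair and Px, Py are the diagonal matrices of base frequencies. The function names and the evenly spaced choice of which invariant sites to drop are hypothetical; a real program would more likely pick sites at random or by base composition.

```python
import math

BASES = "ACGT"

def divergence_matrix(x, y):
    """4x4 matrix of joint base frequencies for two aligned sequences."""
    counts = [[0.0] * 4 for _ in range(4)]
    n = 0
    for a, b in zip(x, y):
        if a in BASES and b in BASES:  # skip gaps and ambiguity codes
            counts[BASES.index(a)][BASES.index(b)] += 1
            n += 1
    return [[c / n for c in row] for row in counts]

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    size, d = len(m), 1.0
    for i in range(size):
        p = max(range(i, size), key=lambda r: abs(m[r][i]))
        if m[p][i] == 0.0:
            return 0.0
        if p != i:
            m[i], m[p] = m[p], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, size):
            f = m[r][i] / m[i][i]
            for c in range(i, size):
                m[r][c] -= f * m[i][c]
    return d

def logdet(x, y):
    """Normalised LogDet distance (assumed form, see text above)."""
    F = divergence_matrix(x, y)
    dF = det(F)
    if dF <= 0.0:
        raise ValueError("det(F) <= 0: distance undefined for this pair")
    px = math.prod(sum(row) for row in F)        # det of diagonal Px
    py = math.prod(sum(col) for col in zip(*F))  # det of diagonal Py
    return -0.25 * (math.log(dF) - 0.5 * (math.log(px) + math.log(py)))

def constant_sites(seqs):
    """Indices of columns where every sequence has the same character."""
    return [i for i in range(len(seqs[0])) if len({s[i] for s in seqs}) == 1]

def pick(sites, fraction):
    """Evenly spaced subset of sites (an arbitrary, deterministic choice)."""
    k = int(fraction * len(sites))
    if k == 0:
        return []
    step = len(sites) / k
    return [sites[int(j * step)] for j in range(k)]

def drop_sites(seqs, drop):
    """Copy of the sequences with the given columns deleted."""
    drop = set(drop)
    return ["".join(ch for i, ch in enumerate(s) if i not in drop) for s in seqs]

def global_removal(seqs, fraction):
    """Option (b): drop a fraction of alignment-wide invariant sites once."""
    return drop_sites(seqs, pick(constant_sites(seqs), fraction))

def pairwise_logdet(x, y, fraction):
    """Option (a): drop a fraction of the sites identical in this pair."""
    x2, y2 = drop_sites([x, y], pick(constant_sites([x, y]), fraction))
    return logdet(x2, y2)

x = "AAACCCGGGTTTAC"
y = "AAACCCGGGTTTCA"
z = "AAGCCTGGATTCGT"
gx, gy, gz = global_removal([x, y, z], 0.5)
print("no removal:      ", logdet(x, y))
print("(b) at the outset:", logdet(gx, gy))
print("(a) pairwise:     ", pairwise_logdet(x, y, 0.5))
```

In this toy alignment x and y are the close pair. Removal at the outset only touches columns that are constant across all three taxa, so x and y keep more shared identical sites than they do under pairwise removal, and the two options give different distances for the same pair.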
In the original papers the datasets were modified at the outset to
include only parsimony-informative sites.
Any suggestions welcome,
James O. McInerney Phone/Voicemail: +44 171 938 9247
Senior Scientific Officer, email:j.mcinerney at nhm.ac.uk
The Natural History Museum,
London SW7 5BD