In article <322855FA.3394 at is.dal.ca>, aroger at is.dal.ca says...
>I figure that if he went back to the 57 proteins, attempted to remove
>invariable sites, then estimated the gamma distribution shape parameter
>for each dataset individually and then applied the gamma distance correction
>for each dataset that he might have ended up with a different answer. It
>seems clear to me that the value he would have got would be an underestimate
>because of the rate variation problem...does anyone else agree?
>
My colleagues Kent Holsinger, Chris Simon, Lorraine Olendzenski, Elena Hilario
and I did what Andrew suggested in his post. Depending on the set of aligned
sequences and on the algorithm used, the shape parameter we calculated ranged
from 0.57 to 1.2. A model assuming an invariant class of sites and
a gamma distribution of rates among remaining sites gave a significantly
better fit to the data and gave estimates of the shape parameter ranging from
2.42 to 3.29 with between 18% and 21% of the sites invariant. Correcting the
data Doolittle et al. present according to these estimates would date the
prokaryote/eukaryote split between 3.5 and 6 billion years ago. These
estimates rely on a constant substitution rate over time and thus probably
severely overestimate the divergence time. They do show, however, that if one
incorporates site-to-site rate variation into the distance estimation, one obtains
far earlier estimates for the pro-/eukaryote split and the last common
ancestor. Another problem with many of the gene families used by Russ
Doolittle is horizontal gene transfer. We wrote this up in an as yet
unpublished letter to Science; if anyone wants an e-mail copy, let me know.
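
For readers unfamiliar with the correction discussed above, here is a minimal
sketch of a gamma distance for protein sequences with an optional invariant
class of sites. It assumes the simple Poisson-gamma form
d = a * ((1 - p)^(-1/a) - 1), where p is the observed fraction of differing
sites and a is the shape parameter. Treating invariant sites by rescaling p to
the variable sites only is likewise an assumption made for illustration. The
function names and the example value p = 0.5 are hypothetical, and this is not
necessarily the procedure behind the estimates quoted above.

```python
import math


def poisson_distance(p):
    """Poisson-corrected distance, i.e. no rate variation among sites."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return -math.log(1.0 - p)


def gamma_distance(p, alpha, p_inv=0.0):
    """Gamma-corrected distance (substitutions per site, averaged over all sites).

    p     : observed fraction of differing sites
    alpha : gamma shape parameter (smaller means more rate variation)
    p_inv : assumed fraction of invariant sites (illustrative treatment)
    """
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if not 0.0 <= p_inv < 1.0:
        raise ValueError("p_inv must be in [0, 1)")
    # All observed differences must fall on the variable sites.
    p_var = p / (1.0 - p_inv)
    if not 0.0 <= p_var < 1.0:
        raise ValueError("p is too large for the given p_inv")
    d_var = alpha * ((1.0 - p_var) ** (-1.0 / alpha) - 1.0)
    # Average over all sites, counting invariant sites as zero change.
    return (1.0 - p_inv) * d_var


if __name__ == "__main__":
    p = 0.5  # hypothetical observed difference, not taken from any dataset
    print("Poisson            :", round(poisson_distance(p), 3))
    for a in (0.57, 1.2):
        print("gamma, a =", a, "    :", round(gamma_distance(p, a), 3))
    print("gamma + inv (a=3, 20%):", round(gamma_distance(p, 3.0, 0.2), 3))
```

For the same observed difference, a smaller shape parameter gives a larger
corrected distance, which is why allowing for rate variation pushes the
estimated divergence back in time under a constant-rate assumption.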
Peter Gogarten
********************************************
J. Peter Gogarten
University of Connecticut
Dept. Molecular and Cell Biology
75 North Eagleville Rd.
Storrs, CT 06269-3044
USA
Phone: USA 860 486 4061
FAX: USA 860 486 1784
E-Mail: gogarten at uconnvm.uconn.edu
**********************************************