jackknifing sites and estimating protein site-rates
aroger at is.dal.ca
Wed Oct 4 12:28:24 EST 2000
> > 1) does anyone have/know of a script or small program that allows one to
> > site-jackknife by randomly removing a user-defined number (or percentage)
> > of positions in one's alignment? Preferably something that can handle
> > PHYLIP sequential format.
> Does PHYLIP-Seqboot not do this?
Yes, Seqboot does this, but the last time I checked it did a mandatory 50%
site removal. What I want is to be able to vary the number of sites removed,
either as a percentage or as an absolute count. Trivial, but frustrating for
a non-programmer like me!
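
A minimal sketch of such a site-jackknife script, in Python. It assumes a
strict PHYLIP sequential layout (names in the first 10 columns, sequences
possibly wrapped over several lines); the function names are made up for
illustration and are not part of PHYLIP or Seqboot.

```python
import random


def read_phylip_sequential(text):
    """Parse strict PHYLIP sequential text into (names, sequences)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ntax, nchar = (int(x) for x in lines[0].split()[:2])
    names, seqs = [], []
    i = 1
    for _ in range(ntax):
        # Assumption: the taxon name occupies exactly the first 10 columns.
        names.append(lines[i][:10].strip())
        seq = "".join(lines[i][10:].split())
        i += 1
        # A sequence may continue on following lines until nchar is reached.
        while len(seq) < nchar:
            seq += "".join(lines[i].split())
            i += 1
        seqs.append(seq)
    return names, seqs


def jackknife_sites(seqs, remove, rng=None):
    """Delete randomly chosen columns from an alignment.

    remove: an int is a number of sites; a float in [0, 1) is a fraction
    of the alignment length (rounded to a whole number of sites).
    """
    rng = rng or random.Random()
    nchar = len(seqs[0])
    n_remove = round(remove * nchar) if isinstance(remove, float) else remove
    if not 0 <= n_remove < nchar:
        raise ValueError("number of sites to remove must be in [0, nchar)")
    dropped = set(rng.sample(range(nchar), n_remove))
    keep = [j for j in range(nchar) if j not in dropped]
    # The same columns are removed from every sequence.
    return ["".join(s[j] for j in keep) for s in seqs]


def write_phylip_sequential(names, seqs):
    """Format an alignment as PHYLIP sequential text, one line per taxon."""
    out = [f"{len(names)} {len(seqs[0])}"]
    for name, seq in zip(names, seqs):
        out.append(f"{name[:10]:<10}{seq}")
    return "\n".join(out) + "\n"
```

For example, jackknife_sites(seqs, 0.25) drops a quarter of the columns and
jackknife_sites(seqs, 30) drops exactly 30; passing random.Random(seed) as
rng makes a replicate reproducible.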
> > 2) does anyone know of a program that will do an ML estimate of
> > site-rates in an amino acid alignment, given a tree and a transition
> > probability matrix (e.g. similar to Olsen's DNArates, but for proteins)?
> Does puzzle do this? It will pump out numbers corresponding to sites
> and their categories according to a discrete-gamma distribution. I could
> be wrong, but I seem to remember this.
Not exactly. Actually, I am currently using this option of puzzle to assign
the site rates. However, what puzzle generates is a "most probable"
assignment: for each site, it asks which rate category contributes the most
to the likelihood at that site and reports that category. This is obviously
different from estimating the rate at a site as an actual parameter of the
model (and using that estimate to calculate the likelihood). I want the
latter. Gary Olsen implemented this in DNArates for DNA, but hasn't done it
for proteins yet. In most cases it will be useless for tree inference,
because the number of parameters estimated equals the number of sites. But I
think that if you are only after the site rates (under an assumed tree), it
should be more accurate than the puzzle "most probable rate category"
assignments.
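
To make "the rate at a site as an actual parameter" concrete, here is a
rough Python sketch under loudly stated assumptions. It is not DNArates'
algorithm. It swaps in the equal-rates (Poisson) amino acid model for a real
transition probability matrix, because that model has closed-form transition
probabilities. The tree is a hypothetical nested structure (content,
branch_length) with fixed branch lengths, and the rate is found by a coarse
grid search rather than a proper optimizer.

```python
import math

AA = "ACDEFGHIKLMNPQRSTVWY"


def cond_like(node, site, rate):
    """Conditional likelihoods of the data below 'node' for each state.

    node is (content, branch_length); content is a leaf name (str) or a
    list of child nodes. site maps leaf name -> residue.
    """
    content, _ = node
    if isinstance(content, str):
        aa = site[content]
        if aa in AA:
            return [1.0 if s == aa else 0.0 for s in AA]
        return [1.0] * 20  # gap or unknown residue: uninformative
    vec = [1.0] * 20
    for child in content:
        cl = cond_like(child, site, rate)
        # Poisson model, scaled to one expected substitution per unit
        # length: P(same) = 1/20 + 19/20 * e, P(diff) = (1 - e) / 20.
        e = math.exp(-20.0 * rate * child[1] / 19.0)
        p_diff = (1.0 - e) / 20.0
        total = sum(cl)
        for s in range(20):
            # sum_t P(s -> t) * cl[t], using P(same) - P(diff) = e
            vec[s] *= p_diff * total + e * cl[s]
    return vec


def site_log_like(tree, site, rate):
    """Log-likelihood of one column, with equal state frequencies (1/20).

    No underflow scaling is done, so this is for small trees only.
    """
    return math.log(sum(cond_like(tree, site, rate)) / 20.0)


def ml_site_rate(tree, site, lo=0.01, hi=100.0, n=201):
    """Rate in [lo, hi] that maximizes the site likelihood (grid search)."""
    rates = [lo * (hi / lo) ** (k / (n - 1)) for k in range(n)]
    return max(rates, key=lambda r: site_log_like(tree, site, r))
```

Calling ml_site_rate once per alignment column gives one estimated rate per
site. A constant column is pushed to the lower bound and a column with all
residues different to the upper bound, so the bounds lo and hi matter and
are themselves an arbitrary choice here.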
Sorry for the long-windedness...