Question Regarding Bootstrap
A Mitchell
am16 at gpu.srv.ualberta.ca
Wed Oct 20 16:31:22 EST 1999
James,
Yes I understand your point about variance, but there is a paper in the
literature (Bull et al.?) suggesting that 500 bootstrap replicates is
sufficient for most purposes. One million just seems way out there...
Andrew
On Wed, 20 Oct 1999, James McInerney wrote:
> Dr. Mitchell,
> The reason I suggested performing a million bootstraps is that the more
> bootstraps you perform, the closer you will get to the actual bootstrap
> value. If the support for an internal node is close to either zero or
> 100 percent, then the variance of this estimate is much lower than if it
> is close to 50% (in other words, the variance peaks around 50%).
> If you perform only 5 bootstrap replicates, then you will get only a
> very poor estimate of the bootstrap support for an internal node. If
> you perform 100 bootstraps, then your estimate is better. You could
> empirically determine the variance for each node by performing
> successive 100-replicate runs and looking at the difference in support
> values between runs.
> However, by performing one million replicates, you have a very _accurate_
> estimate of the bootstrap support for any given node. The variance will
> be very small.
>
> For my given case, you could conceivably perform 100 replicates on the full
> dataset and on the dataset that had only parsimony-informative sites and
> get the same values for each run (100% support). However, by performing
> 1,000,000 replicates, you are very unlikely to get this result unless,
> of course, there is NO difference between the two answers.
> Hope this makes sense.
>
> James
> A Mitchell wrote:
> > James McInerney <james.o.mcinerney at may.ie> wrote:
> >
> >
> > > you should do about one million replicates to get a reasonably accurate
> > > estimate of the b'strap value
> > James,
> >
> > Please explain this one!
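
For reference, the variance argument in this thread can be put in numbers. The sketch below assumes that each bootstrap replicate independently recovers a node with probability p, so the estimated support behaves as a binomial proportion with standard error sqrt(p(1-p)/n). The function name bootstrap_se is made up for illustration and is not part of any phylogenetics package.

```python
import math

def bootstrap_se(p, n):
    """Standard error of a support proportion p estimated from n replicates.

    Assumes replicates are independent Bernoulli trials (binomial model).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.sqrt(p * (1.0 - p) / n)

# Replicate counts mentioned in the thread.
for n in (5, 100, 500, 1_000_000):
    # Compare a node near 50% support with one near 95% support.
    print(f"n={n:>9,}  SE at p=0.50: {bootstrap_se(0.50, n):.4f}  "
          f"SE at p=0.95: {bootstrap_se(0.95, n):.4f}")
```

Under this assumption, the standard error at 50% support is 0.05 (5 percentage points) with 100 replicates, roughly 0.022 with 500, and 0.0005 with one million, and it is smaller at every replicate count for a node near 95% support than for one near 50%.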
