I have a dataset of 12 taxa and about 4,000 bases that I am trying to
analyze. Specifically, I am trying to test the monophyly of a sub-group of
taxa on the tree. ModelTest selects GTR+I+G as the best model for my data
under maximum likelihood. I want to do a likelihood ratio test on this
hypothesis (monophyly of a sub-group within the tree). To test the
significance I thought I would do parametric bootstrapping. Can anyone
please show me how I go about this, step by step?
I am thinking I would go about it as follows:
1. Generate simulated datasets using Seq-Gen. (How many are appropriate? Do
I generate two groups of datasets, one with the best tree and the other
with the best tree with the sub-group constrained as monophyletic?)
2. I guess I would then have to compute the likelihood scores for each of
the simulated datasets. Assuming I did 100, is there any automated way of
doing this? Or do I have to open each dataset in PAUP*, load the appropriate
tree, calculate likelihood scores and append them to a file?
3. What do I do next? Do I make a matrix of subtractions for every
combination of likelihood values from the best tree set with those from the
constrained tree set? Is there any program to do this? Is this what
generates the distribution of delta values to which I would compare the
delta obtained from the real dataset?
4. Am I on the right track?!
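
To make step 3 concrete, here is a minimal Python sketch of one way the comparison could work. It assumes a paired design: one delta per simulated dataset (the unconstrained score minus the constrained score for that same dataset), rather than every combination. I am not sure this is what Huelsenbeck et al. intend. It also assumes the scores are log-likelihoods (lnL) already exported from PAUP*. The function names and all numbers are hypothetical placeholders, not results.

```python
from typing import List, Sequence


def delta_values(lnl_unconstrained: Sequence[float],
                 lnl_constrained: Sequence[float]) -> List[float]:
    """Paired lnL differences, one per dataset (unconstrained minus constrained)."""
    if len(lnl_unconstrained) != len(lnl_constrained):
        raise ValueError("need one score of each kind per dataset")
    return [u - c for u, c in zip(lnl_unconstrained, lnl_constrained)]


def p_value(observed_delta: float, simulated_deltas: Sequence[float]) -> float:
    """Fraction of simulated deltas at least as large as the observed delta."""
    if not simulated_deltas:
        raise ValueError("no simulated deltas")
    n_extreme = sum(1 for d in simulated_deltas if d >= observed_delta)
    return n_extreme / len(simulated_deltas)


# Hypothetical lnL scores for five simulated datasets (placeholders only).
sim_unconstrained = [-9120.4, -9087.1, -9101.9, -9133.0, -9095.6]
sim_constrained = [-9121.0, -9089.6, -9102.2, -9139.5, -9096.0]

sim_deltas = delta_values(sim_unconstrained, sim_constrained)
observed = 4.2  # hypothetical delta from the real dataset
print(p_value(observed, sim_deltas))
```

With 100 simulated datasets the two input lists would each hold 100 scores, and the result would be the proportion of simulated deltas that equal or exceed the observed one.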
Has anyone out there done parametric bootstrapping to test monophyly by the
method of Huelsenbeck et al (1996, 1997)? Can you share the procedural
details of how you did this?
Thanks in anticipation, to all who reply.
Vijay Aswani, Ph.D.
Smithsonian Tropical Research Institute,
APO AA 34002-0948
Tel (in Panama): 507-212-8824 (work), 236-3243 (home)
Email: vaswani at sinfo.net or aswaniv at naos.si.edu