I'd like to put the following two propositions concerning NH (New Hampshire
format) treefiles to the public for discussion:
a) Sometimes you need to draw a tree with edge lengths *and* corresponding bootstrap
values for each internal branch. For example, consider this tree with edge lengths:
(LungfishAu:0.1246,(LungfishSA:0.1338,LungfishAf:0.1324):0.0769,(
(((Platypus:0.1441,Opossum:0.1302):0.0380,((Mouse:0.0566,Rat:0.0717)
:0.1017,(((Cow:0.0803,Whale:0.0935):0.0334,Seal:0.0848):0.0267,Human:0.1514)
:0.0355):0.0495):0.0865,((((Crocodile:0.1948,Bird:0.1546):0.0560,Sphenodon:0.2029)
:0.0239,Lizard:0.2302):0.0362,Turtle:0.1508):0.0581):0.0842,Frog:0.1664)
:0.0619);
and the same tree with bootstrap values for its internal branches:
(LungfishAu,(LungfishSA,LungfishAf):100,((((Platypus,Opossum):100,
((Mouse,Rat):100,(((Cow,Whale):100,Seal):100,Human):98):99):100,
((((Crocodile,Bird):100,Sphenodon):92,Lizard):72,Turtle):100):100,
Frog):99);
At the moment, it is not possible to have *both* kinds of information in one treefile,
and it is not possible to print, say, a phenogram with bootstrap values.
I would therefore like to propose that the current NH convention be extended
to allow for more than one : extension. The tree drawing programs (DRAWTREE, DRAWGRAM,
TreeView) could then let the user specify the meaning of each entry, for example that the
first entry is the edge length and the second is the bootstrap value of each (internal) branch.
Another possibility would be for the tree drawing programs to accept two treefiles
at the same time and generate one picture of the tree from them. (This is probably an even
better solution than allowing for more than one : extension.)
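
To make proposition a) concrete, here is a minimal Python sketch that combines a
tree with edge lengths and a tree with bootstrap values into one string using the
proposed double : extension. It is only an illustration, not an existing feature of
any of the programs named above. The function name is hypothetical, and the sketch
assumes that both trees list the taxa in the same order with the same topology,
that values are plain decimals (no exponent notation), and that taxon names
contain no colons.

```python
import re

# Matches the value attached to an internal branch, i.e. ")" followed by ":number".
INTERNAL_VALUE = re.compile(r"\):([0-9.]+)")
# Matches any ":number" entry, on a leaf or on an internal branch.
ANY_VALUE = re.compile(r":[0-9.]+")


def merge_lengths_and_support(length_tree, support_tree):
    """Return length_tree with ':bootstrap' appended to each internal branch."""
    # Treefiles are often wrapped over several lines; drop all whitespace.
    lengths = "".join(length_tree.split())
    supports = "".join(support_tree.split())

    # Both trees must reduce to the same bare topology string.
    if ANY_VALUE.sub("", lengths) != ANY_VALUE.sub("", supports):
        raise ValueError("trees differ in topology or taxon order")

    values = INTERNAL_VALUE.findall(supports)
    if len(values) != len(INTERNAL_VALUE.findall(lengths)):
        raise ValueError("internal branches do not match one to one")

    # Internal branches appear in the same order in both strings, so the
    # bootstrap values can be consumed left to right.
    remaining = iter(values)
    return INTERNAL_VALUE.sub(lambda m: m.group(0) + ":" + next(remaining), lengths)


if __name__ == "__main__":
    with_lengths = "(A:0.1,(B:0.2,C:0.3):0.4,D:0.5);"
    with_support = "(A,(B,C):100,D);"
    print(merge_lengths_and_support(with_lengths, with_support))
    # prints (A:0.1,(B:0.2,C:0.3):0.4:100,D:0.5);
```

The same pairing step would also serve the second possibility: a drawing program
given two treefiles could match up the internal branches in exactly this way
before drawing, without any change to the file format.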
b) As another extension of the NH scheme I'd like to propose that treefiles should *always*
look like this:
1
(LungfishAu:0.1246,(LungfishSA:0.1338,LungfishAf:0.1324):0.0769,(
(((Platypus:0.1441,Opossum:0.1302):0.0380,((Mouse:0.0566,Rat:0.0717)
:0.1017,(((Cow:0.0803,Whale:0.0935):0.0334,Seal:0.0848):0.0267,Human:0.1514)
:0.0355):0.0495):0.0865,((((Crocodile:0.1948,Bird:0.1546):0.0560,Sphenodon:0.2029)
:0.0239,Lizard:0.2302):0.0362,Turtle:0.1508):0.0581):0.0842,Frog:0.1664)
:0.0619);
This means that the first line should always contain a number giving the number
of trees that follow. This would have the advantage that input trees and output trees have
the same format (DNAML 4.0!), so you could just rename the corresponding files. It would
also make PHYLIP tree files compatible with MOLPHY topology files.
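
As an illustration of proposition b), the following Python sketch reads and writes
a treefile in the layout shown above: an integer on the first line, followed by
that many trees, each terminated by a semicolon. The function names are
hypothetical, and the sketch does not claim to reproduce exactly what DNAML or
MOLPHY accept; it only assumes the layout described here.

```python
from typing import List


def read_counted_treefile(text: str) -> List[str]:
    """Parse a treefile whose first line gives the number of trees that follow."""
    lines = text.strip().splitlines()
    count = int(lines[0].strip())

    # Trees may be wrapped over several lines; join them before splitting on ";".
    body = "".join(line.strip() for line in lines[1:])
    trees = [part + ";" for part in body.split(";") if part]

    if len(trees) != count:
        raise ValueError(
            "header announces %d tree(s) but %d found" % (count, len(trees))
        )
    return trees


def write_counted_treefile(trees: List[str]) -> str:
    """Write the trees, one per line, preceded by a line holding their count."""
    return "%d\n" % len(trees) + "\n".join(trees) + "\n"


if __name__ == "__main__":
    text = "1\n(A:0.1,(B:0.2,\nC:0.3):0.4,D:0.5);\n"
    trees = read_counted_treefile(text)
    print(trees)
    print(write_counted_treefile(trees), end="")
```

Because the count is checked against the number of trees actually found, a
truncated or concatenated file is reported instead of being read silently.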
What do you think about these two propositions?
Korbinian Strimmer