I am involved with the GeneMark program, which finds genes using
Markov statistical models based on previously generated matrices.
Unfortunately, the models become progressively less accurate the
further one gets from the organism for which they were generated.
I am going to try to generate some data on accuracy versus evolutionary
distance, and could use some pointers to the pre-graph data used for
various trees.
I need to cover as many different organisms as possible.