Chocolate genes are good for you!
A recent plant gene-set comparison puts the cacao tree genes of the Mars/USDA project,
built with EvidentialGene methods, at the top for gene-set completeness.
The summary of gene-set completeness for plant orthologs for cacao suggests
that the Arabidopsis gene set (here the TAIR 10 version) can be improved.
A forthcoming killifish gene set using these Evigene methods is at the top of
10 fish gene sets for completeness, above gene sets from NCBI and Ensembl
gene modelling methods.
Methods matter: there is another, independent cacao gene set that isn't as complete,
likewise with two sweet orange gene sets. These differences have to do with how much gene
evidence is used, and how it is turned into complete gene models.
The EvidentialGene project finds that gene-omes constructed from mRNA-seq assembly overtake
genome gene predictions. An existing dogma in genome projects, that the quality of a gene set
depends on the quality of the genome assembly, is no longer accurate. mRNA-seq
assembly now does as well as or better than genome-gene modelling. Both together, with methods
that emphasize mRNA-seq assembly and address genome-assembly and prediction errors, do the best.
Learn more about this at http://arthropods.eugenes.org/EvidentialGene/
with a summary of methods and results in this poster.
The Evigene mRNA-assembly pipeline software is working well enough in
other people's hands now, and they are finding the same sorts of results.
-- Don Gilbert