Arabidopsis Whole Genome Annotation Release
Town, Christopher D.
cdtown at tigr.org
Mon Sep 10 20:21:47 EST 2001
The informatics group at TIGR, led by Owen White, is pleased to announce the
completion of the first phase of their NSF-funded effort to provide a
uniform, high-quality annotation of the entire Arabidopsis genome. This
information is now accessible through the genomes division of GenBank.
In this release, the individual BACs, YACs and other sequences that make up
the tiling paths for each chromosome have been merged into reconstructed
pseudomolecules in which the sequence alignments in every overlap have been
examined. Within each overlapping region, gene models were scrutinized and
wherever possible discrepancies were resolved to produce only one model for
any region of the genome. Coordinates of all gene models were transferred
from the annotated BACs onto the pseudomolecules and, in collaboration with
MIPS, genes were assigned unique chromosome-based identifiers that they will
retain in perpetuity, along with their original BAC-based names. Overlap
regions that contain sequence discrepancies have been flagged and are being
re-examined. Wherever possible these discrepancies will be resolved either
in silico or by experimentation, and any affected gene models repaired.
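
As an illustration of the coordinate transfer described above, here is a minimal sketch of how a gene model's BAC-based coordinates might be mapped onto a pseudomolecule. It is not TIGR's actual pipeline: the names BacPlacement, chrom_start and bac_to_chrom are hypothetical, coordinates are assumed to be 1-based and inclusive, and each BAC is assumed to be placed without gaps or indels relative to the pseudomolecule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BacPlacement:
    """Hypothetical record of where one BAC sits on a pseudomolecule."""
    bac_name: str
    chrom_start: int      # pseudomolecule position of the leftmost BAC base
    bac_length: int       # length of the BAC sequence in bases
    reverse: bool = False  # True if the BAC is placed in reverse orientation


def bac_to_chrom(placement, bac_start, bac_end):
    """Map a 1-based inclusive BAC interval to pseudomolecule coordinates.

    Assumes a gap-free placement; overlaps with sequence discrepancies
    would need an alignment-aware mapping instead.
    """
    if not (1 <= bac_start <= bac_end <= placement.bac_length):
        raise ValueError("interval lies outside the BAC")
    if not placement.reverse:
        shift = placement.chrom_start - 1
        return bac_start + shift, bac_end + shift
    # Reverse orientation: BAC base 1 maps to the rightmost covered position.
    last = placement.chrom_start + placement.bac_length - 1
    return last - bac_end + 1, last - bac_start + 1


# Made-up example values, for illustration only.
forward = BacPlacement("BAC_A", chrom_start=1001, bac_length=500)
flipped = BacPlacement("BAC_B", chrom_start=1001, bac_length=500, reverse=True)
forward_coords = bac_to_chrom(forward, 10, 20)
flipped_coords = bac_to_chrom(flipped, 10, 20)
```

In this sketch a gene keeps its BAC-based coordinates alongside the mapped ones, in the same spirit as genes retaining their original BAC-based names.
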
Ongoing efforts include refinement of existing gene models throughout the
genome by a number of methods, including use of full-length cDNAs and
comparisons with paralogous family members. In parallel we will be applying
a uniform set of names to all the genes and wherever possible making Gene
Ontology (GO) assignments. This effort will continue for the next year in a
collaboration between TIGR, TAIR, NCBI and MIPS to provide a continually
improving series of public releases of the annotated Arabidopsis genome.