
Status of Arabidopsis Genome Annotation

Sue Rhee rhee at acoma.Stanford.EDU
Thu Jan 25 20:37:32 EST 2001

Dear Arabidopsis researchers,

Since the publication of the Arabidopsis Genome Sequencing paper, there
have been a lot of questions sent to us with respect to the status of the
annotation and its accessibility. The following questions summarize the
types of issues that have been voiced to us. If you have any other
questions or concerns that are not addressed below, please do not hesitate
to contact us at: curator at arabidopsis.org

1. Does TAIR have all of the recently completed A. thaliana sequence data?

The recently completed sequence and annotation data are available from
TIGR's FTP directory at:


The same data files are also available in TAIR's FTP directory:


Please note that TIGR has recently assigned the standard ORF names
(chromosomal loci) based on non-redundant overlapping of the BAC clone
sequences. The new assignments have not been tested rigorously and we
are currently in the process of doing that.

The data as they appear in the genome release published in Nature
(Arabidopsis Genome Initiative, 2000, 408:796-815) are available from
MIPS at:


TAIR is currently loading the TIGR data into its database to allow
searching. We expect the data to be searchable by the end of
February. We are also working on putting the datasets on the BLAST
server. This will be in place in the next couple of weeks.

2. What are TAIR's policies for maintaining concurrency (both sequence data
and annotations) with the originating sites of genomic data?

Sequence and annotation updates will be made by parsing data from
GenBank and TIGR. TIGR is funded to reannotate the entire genome in
2001. The reannotated data from TIGR will be available from TAIR and
TIGR. Both
sources and dates of the different annotations will be available from

The data files in TAIR's FTP directory will be updated whenever there
is an update in TIGR's FTP directory. TAIR will maintain one or two
previous versions of the files. The TAIR database will be updated in
bulk on a regular basis, with more frequent updates for small
changes. Currently TAIR and TIGR are collaborating closely
to make sure that data synchronization will happen consistently and

3. How does TAIR plan on keeping current as originating sites add to
or update the data and annotations?

If the originating sites make the updates to GenBank's Nucleotide
database, these will be parsed into TAIR and added to the
annotations. All previous annotations will also be available.

4. How will TAIR denote changes to the data or annotations?

The source and date of the data/annotations will always be provided.
Notification of updates to the annotations will be sent via email once
a week, as is currently done for GenBank updates. In addition, we will
have a time-dependent parameter on our search interfaces.
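
As a rough illustration of what a time-dependent search over
annotations that carry a source and a date could look like, here is a
minimal Python sketch. It is not TAIR's implementation: the record
fields, the function name annotation_as_of, the locus name LOCUS_A and
the sample entries are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class Annotation:
    locus: str        # hypothetical locus name, not a real ORF name
    source: str       # originating site, e.g. "TIGR" or "GenBank"
    recorded: date    # date the annotation was provided
    description: str  # free-text annotation


def annotation_as_of(history: Iterable[Annotation],
                     locus: str,
                     as_of: date) -> Optional[Annotation]:
    """Return the most recent annotation for `locus` dated on or before
    `as_of`, or None if no annotation existed yet at that date."""
    candidates = [a for a in history
                  if a.locus == locus and a.recorded <= as_of]
    return max(candidates, key=lambda a: a.recorded, default=None)


# Made-up sample history: previous annotations are kept, not overwritten.
history = [
    Annotation("LOCUS_A", "GenBank", date(2000, 12, 14), "hypothetical protein"),
    Annotation("LOCUS_A", "TIGR", date(2001, 6, 1), "putative kinase"),
]

current = annotation_as_of(history, "LOCUS_A", date(2001, 12, 31))
earlier = annotation_as_of(history, "LOCUS_A", date(2001, 1, 25))
```

Because every record keeps its source and date, the same query answers
both "what is the current annotation" and "what did the annotation say
on a given day" simply by changing the as_of date.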

5. When will this data become available through NCBI?

The reannotated data will be available through NCBI's genome database
(NOT Nucleotide database) as soon as the standard ORF names
(chromosomal loci) designated by TIGR stabilize and TAIR has all the
current and previous annotations available from its database. We
expect this to be done by the end of February.

6. As data at TAIR are updated, will they be sent to NCBI?

Yes. Currently the mechanism is being discussed between TAIR,

Sue Rhee
The Arabidopsis Information Resource (TAIR)

Owen White
Director of Bioinformatics
The Institute for Genomic Research (TIGR)

Sue Rhee                         	rhee at acoma.stanford.edu
The Arabidopsis Information Resource	URL: www.arabidopsis.org
Carnegie Institution of Washington	FAX: +1-650-325-6857
Department of Plant Biology		Tel: +1-650-325-1521 ext. 251
260 Panama St.
Stanford, CA 94305


