Three Arabidopsis FASTA files with annotation incorporating MPSS
expression data are available at our site:
The files are:
x-ath-ncbi-mpss-unspl.fasta contains unspliced genes with 500
nucleotides preceding the start codon and 500 nucleotides following the
stop codon. Exons are shown in capital letters.
x-ath-ncbi-mpss-prot.fasta contains the spliced exons translated to
protein.
x-ath-ncbi-mpss-cds.fasta contains all extracted CDS, from start codon to
stop codon, without introns.
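
Because exons are capitalized in the unspliced file, the exon portions of a
record can be pulled out by case alone. The following is only a minimal
sketch: it assumes that introns and the 500-nucleotide flanks are written in
lower case (the description above states only that exons are in capitals),
the function names are made up for illustration, and the toy sequence is
invented rather than taken from the files.

```python
import re


def exon_segments(unspliced_seq):
    """Return the upper-case runs of a mixed-case sequence, in order.

    Assumption: everything that is not an exon (flanks, introns) is
    lower case, so each upper-case run corresponds to one exon.
    """
    return re.findall(r"[A-Z]+", unspliced_seq)


def spliced_sequence(unspliced_seq):
    """Concatenate the upper-case runs into one spliced sequence."""
    return "".join(exon_segments(unspliced_seq))


# Invented toy record: flank, exon, intron, exon, flank.
toy = "acgtATGGCTgtaagtttcagGATTAAtttgc"
print(exon_segments(toy))     # ['ATGGCT', 'GATTAA']
print(spliced_sequence(toy))  # ATGGCTGATTAA
```

Whether the concatenated exons of a record match the corresponding entry in
x-ath-ncbi-mpss-cds.fasta exactly has not been checked here and should be
verified against the files themselves.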
Arabidopsis MPSS data for five tissues (callus, flower, leaf, root and
silique) were incorporated into the annotation of Arabidopsis genes. The
data for unique detected signatures from the public Arabidopsis MPSS
project (University of Delaware, Blake Meyers) were appended to the
annotation. An example of the resulting annotation is pasted here:
>At1g01010 No apical meristem (NAM) protein family [ T25K16.1
NP_171609.1 15223276 CDS forward ] [MPSS: callus 195 tpm, flower 63
tpm, leaves 18 tpm, root 33 tpm, silique 2 tpm (CFLRS)]
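
For readers who want to use the expression values programmatically, here is
a small sketch of how a header like the one above could be parsed. It is
based only on this single example, so the regular expressions and the
function name parse_header are assumptions, not a description of how PyMood
reads the files. The trailing "(CFLRS)" code is ignored because its meaning
is not defined in this message.

```python
import re

# Patterns inferred from the single example header; real files may vary.
MPSS_BLOCK = re.compile(r"\[MPSS:(.*?)\]")
TISSUE_TPM = re.compile(r"([A-Za-z]+)\s+(\d+)\s+tpm")


def parse_header(header):
    """Return (gene_id, {tissue: tpm}) from an annotated FASTA header."""
    # Rejoin a header that was wrapped across lines (as in this e-mail).
    header = " ".join(header.split())
    gene_id = header.lstrip(">").split(" ", 1)[0]
    tpm = {}
    block = MPSS_BLOCK.search(header)
    if block:
        for tissue, value in TISSUE_TPM.findall(block.group(1)):
            tpm[tissue] = int(value)
    return gene_id, tpm


example = """>At1g01010 No apical meristem (NAM) protein family [ T25K16.1
NP_171609.1 15223276 CDS forward ] [MPSS: callus 195 tpm, flower 63
tpm, leaves 18 tpm, root 33 tpm, silique 2 tpm (CFLRS)]"""

gene_id, tpm = parse_header(example)
print(gene_id)  # At1g01010
print(tpm)      # {'callus': 195, 'flower': 63, 'leaves': 18, 'root': 33, 'silique': 2}
```

Headers without an MPSS block simply yield an empty dictionary, which makes
it easy to filter genes by whether any signature was detected.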
Please refer to the scheme explaining the annotation structure on our
site.
We produced these files primarily so that they can be used easily in our
genomics data analysis program, PyMood ( http://allometra.com ), for
sophisticated querying of BLAST output data.
Please note that these files contain TIGR Arabidopsis annotation version
4 (not the latest one) and include only one splice variant (the longest)
for each gene.
If you have any questions or comments, send them to me at
marta at allometra.com
Davis, CA 95616
marta at allometra.com
http://allometra.com