Database of KNOWN Arab CDS

Curtis J Palm cpalm at
Thu Apr 20 14:30:02 EST 2000

P.G. Korning, S.M. Hebsgaard, P. Rouze and S. Brunak made a database
called Araclean for use in training their gene predictor.  They made sure
all the sequences in the database are correct and "real".

Cleaning the GenBank Arabidopsis thaliana data set,
  P.G. Korning, S.M. Hebsgaard, P. Rouze and S. Brunak,
  Nucl. Acids Res., 24, 316-320, 1996.
It is on the web at

but it is not very current.
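
For the ESTs already pulled from GenBank, one rough first pass is to drop
the records that contain Ns.  A minimal Python sketch follows, assuming the
ESTs are saved in FASTA format.  The function names, the max_n_fraction
parameter and the example records are all hypothetical, and the filter only
removes ambiguous bases; it does not show that a sequence is known coding
sequence.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_ambiguous(records, max_n_fraction=0.0):
    """Keep records whose fraction of N bases is at most max_n_fraction."""
    kept = []
    for header, seq in records:
        if not seq:
            continue  # skip empty records rather than divide by zero
        n_fraction = seq.upper().count("N") / len(seq)
        if n_fraction <= max_n_fraction:
            kept.append((header, seq))
    return kept


# Hypothetical records, for illustration only.
example = """>est1 hypothetical record
ATGGCNTTAGC
>est2 hypothetical record
ATGGCATTAGC
""".splitlines()

clean = drop_ambiguous(read_fasta(example))
print([header for header, _ in clean])
```

Raising max_n_fraction above zero keeps reads with a few ambiguous bases
instead of discarding them outright.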

Curt Palm

On 19 Apr 2000, Paul Shinn wrote:

>       I'm trying to assemble a dataset of KNOWN coding sequence for Arab
>  proteins.  I need it for a class project.  I've pulled out all the ESTs
>  from GenBank for Arab but I am unsure of the quality of the sequence and
>  much of it has Ns in it.  Much of the sequence in the nrdb is for
>  putative/similar/hypothetical proteins.  Is there somewhere I can download
>  this kind of dataset?
>  						Thanks, Paul
>  ---
>  Paul Shinn
>  Sequencing Coordinator                                    ,___o
>  pshinn at                            _-\_<,
>  Arabidopsis thaliana Genome Center                      (*)/'(*)
>  (215) 573-7256
