CONTRIBUTE YOUR UNPUBLISHED cDNA DATA FOR ASSEMBLY, WITHOUT COMPROMISING
We are getting ready for a new assembly of EST data. This time, we will
make use of all the available cDNA data, including ESTs from other
projects/strains, GenBank entries, etc. This will be made possible by
the availability of the draft genome sequence, which we will use to
drive the assembly and correct mistakes/polymorphisms.
Some of you may have cDNA data that could be invaluable to our project,
but have not yet been deposited in GenBank for one reason or another. If
you send them to us, they may help us produce a more accurate and more
useful representation of your favorite genes, both in the EST database
and on the genome browser.
So please do not hesitate to send them as a text file to Chuck Hauser
(chauser at duke.edu). They can be raw sequencing reads, or sequences
assembled from multiple reads, as long as they are cDNA data. They do
not need to be high quality.
The requested format for each sequence is FASTA. The header line should
be < 80 characters:
>identifier_short description
The identifier must have at least 8 alphanumerical characters, to avoid
confusion; the short description is optional. The header line is
followed by a return and your sequence.
By default, these sequences will be considered as reading 5'>>3' with
respect to the gene. If the sequence reads 3'>>5', please append ".x1"
at the end of the header line.
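If you would like to check your header lines before sending the file, the rules above can be sketched as a small Python function. This is only an illustration of one reading of the format, not the script used for the assembly: the function name check_header is hypothetical, the 80-character limit is assumed to count the leading ">", and the ".x1" suffix is assumed to come last on the line.

```python
import re

# Assumed reading of the header rule: ">" + identifier of 8 or more
# alphanumerical characters + optional "_description" + optional ".x1".
HEADER_RE = re.compile(r"^>([A-Za-z0-9]{8,})(?:_(.*?))?(\.x1)?$")


def check_header(line):
    """Return (identifier, description, orientation) or raise ValueError."""
    line = line.rstrip("\r\n")
    # Assumption: the "< 80 characters" limit includes the leading ">".
    if len(line) >= 80:
        raise ValueError("header line must be shorter than 80 characters")
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError("header does not match the requested format")
    identifier, description, suffix = match.groups()
    # A trailing ".x1" marks a sequence that reads 3'>>5'; 5'>>3' is the default.
    orientation = "3'>>5'" if suffix else "5'>>3'"
    return identifier, description or "", orientation
```

For example, check_header(">CHLRE0001_putative kinase.x1") is accepted and reported as reading 3'>>5', while a header such as ">short_1" is rejected because its identifier has fewer than 8 alphanumerical characters. The identifier and description used here are made up for the example.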
These sequences will be incorporated into our assembly process, but will
NOT be made publicly available. The identification line may or may not
include your name, depending on what level of confidentiality you wish
to keep. In the list of sequences used to generate a contig, only the
information line will appear, not the sequence itself. Users who are
interested in learning about the origin of a sequence should e-mail
us, and we will simply forward their request to you.
We are planning to start assembly at the end of September, so please
send your sequences before September 28, 2003.