Hi Mark,
Regarding EST datasets, transcriptome projects based on conventional
Sanger-based EST sequencing aren't required to be in the Genome Projects
database. I thought all SRA-based projects were required to be in Genome
Projects, but upon further checking it appears that is also optional.
Sorry for the confusion.
I'm not sure who maintains the http://www.intlgenome.org/ page, but it
is likely to be quite out of date even for the big centers since the
last species was added in Sept-2008. Maybe there's a suitable wiki page
that could be used to maintain a user-editable table of ongoing
projects?
For those of you interested in which arthropod transcriptome datasets
are currently in the NCBI SRA database, you can use this NCBI Entrez query:
arthropoda[orgn] AND "biomol transcript"[Properties]
http://tinyurl.com/yhkc3dw
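If you would rather script the search than use the web page, the same query string can probably be passed to the NCBI E-utilities `esearch` endpoint. The following is only a minimal sketch: it builds the request URL and does not contact NCBI. The function name `build_esearch_url` and the `retmax` default are my own illustrative choices, and I am assuming the query behaves the same through E-utilities as it does in the Entrez web interface.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities esearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The Entrez query quoted in this message.
QUERY = 'arthropoda[orgn] AND "biomol transcript"[Properties]'


def build_esearch_url(term=QUERY, db="sra", retmax=100):
    """Return an esearch URL for `term` against Entrez database `db`.

    Hypothetical helper for illustration; `retmax` caps the number of
    record IDs returned and 100 is an arbitrary default.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    # urlencode percent-escapes the brackets, quotes and spaces in the query.
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url())
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should return an XML list of matching SRA record IDs, assuming the service accepts the query unchanged.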
10 taxa currently have transcript data in SRA (name, rank, group):
Zygaena filipendulae, species, moths
Heliconius melpomene malleti, subspecies, butterflies
Heliconius melpomene cythera, subspecies, butterflies
Microctonus aethiopoides, species, wasps &c.
Melitaea cinxia (Glanville fritillary), species, butterflies
Anopheles stephensi (Asian malaria mosquito), species, flies
Glossina morsitans, species, flies
Drosophila melanogaster (fruit fly), species, flies
Manduca sexta (tobacco hornworm), species, moths
Locusta migratoria (migratory locust), species, grasshoppers
-Terence
-----Original Message-----
From: Mark Blaxter [mailto:mark.blaxter from ed.ac.uk]
Sent: Wednesday, December 09, 2009 4:29 PM
To: Murphy, Terence (NIH/NLM/NCBI) [C]
Cc: arthropod from magpie.bio.indiana.edu
Subject: Re: [Arthropod] Arthropod genomes in progress?
Hi Terence,
Thanks for the info.
A couple of comments and questions:
- I have submitted many EST datasets in the past (standard Sanger
ones) and haven't been 'required' by NCBI dbEST to 'register' the
'genome project' they derive from. Is this a new rule, or is it an
aspiration?
- We have Illumina, Roche and AB instruments, and our user base is
requesting 'whole' transcriptome and genome data generation ever more
frequently, so it will be good to get a registry going. However,
waiting till the data are submitted to GenBank/EMBL/DDBJ may not be
what the community needs - I think I'd like to have a registry of
genomes-in-progress and genomes-in-aspiration, so we can collaborate
in data generation (someone might be doing a genome for the
transcriptome I am generating, etc), data analysis (someone might be
interested in the clade my genome is from) and thus copublication/web
presentation. Is that what you meant, Don?
- A question... the 'big' (or maybe that's 'vast' now!) genome centres
are part of a global collaborative, and thus have their
genomes-in-progress and genomes-in-waiting fast-tracked to NCBI and other
www sites. I've tried to contact them via http://www.intlgenome.org/ but
had no reply. Now that we are able to generate >10 Gbase/week of raw
data in our centre alone, it would be good to open the club a little,
no?
Mark