[Fwd: USDA animal bioinformatics workshop]

Ed Coe coee at missouri.edu
Fri Nov 8 16:40:57 EST 2002


MaizeNetters:
Dave Matthews sent the following message to Graingenes.  Recommended
surfing!
Ed

-------- Original Message --------
Subject: USDA animal bioinformatics workshop
Date: Wed, 6 Nov 2002 11:57:23 -0500 (EST)
From: Dave Matthews <matthews at greengenes.cit.cornell.edu>
To: Grains mailgroup <grains at greengenes.cit.cornell.edu>

The USDA is holding an online public workshop to plan the future of 
animal genome databases.  Many of the issues being discussed are 
equally relevant to plant bioinformatics, so you might be interested 
in looking at it.  A sample posting from this morning is below. 
A panel discussion is currently under way; on Nov 11 it will 
be opened up for public comment.  The URL is 
http://genome.cvm.umn.edu/bioinfo/

- Dave

====================================================

   Date: Wed, 06 Nov 2002 08:00:30 -0600
   From: David Adelson <david.adelson at tamu.edu>
   To: <bioinfo-panelists at mail.ahc.umn.edu>
   Subject: [Bioinfo-panelists] Database, what kind?

Dear All,

We have been asked to discuss animal genome databases, so let's get
started. I will go out on a limb and make a few assertions, some of 
which are probably obvious, some of which are certainly wrong. Most 
of these points are directed at where we need to go, not necessarily 
at where we are or should be now.

1. A relational database is a given; because of fiscal constraints, 
it will be either MySQL or PostgreSQL.
2. Since the highest resolution genome map is a genome sequence, 
whatever database format is chosen for current needs has to be able 
to support a sequence database.
3. There are two good models for us to emulate but not duplicate: 
NCBI and UCSC. Neither of them is perfect for comparative genomics. 
The primary key (sequence accessions) will still be NCBI-generated 
for all species.
4. For a sequence database, comprehensive annotation is the ultimate
goal, where every base pair is a potential annotation site.
5. Map data will ultimately be converted to sequence annotation.
6. Expression data will become annotation.
7. Functional data is obvious annotation; more on that later.
8. Other kinds of annotation include promoters, splice sites,
polymorphisms, and potential stem-loop structures.
9. There is already a good first attempt at this type of 
architecture, called GMOD (Generic Model Organism Database, 
http://www.gmod.org/ ). This could serve as an off-the-shelf core 
that could be customized and improved for any species, thus ensuring 
interoperability.
10. Different genome sequence databases will be linked by sequence 
similarity, but more significantly by equivalent structures and 
functions. Right now the focus is mapping and synteny.
11. Back to functional annotation. An ontology is a given, such as 
the one used by the Gene Ontology Consortium. We need to make sure 
that the ontology is based on accepted terms and is evidence-based. 
Perhaps most critical, any ontology has to have a dictionary so that 
users can actually use it to formulate queries.
12. How we view the data, or how we can choose to view the data, is 
important.  The UCSC track-based viewer is great, but not if you want 
to look at a whole chromosome or a big chunk of a chromosome. There 
are a number of different ways genome coordinate information, 
functional annotation and expression data can be visualized, both for 
a single genome and to compare across genomes. This is a truism, but 
for a database to be useful it has to be user-friendly.
13. Curation is a requirement, as is distributed annotation. How to 
reconcile the two is the problem.
14. I have not even begun to think about how population data for 
quantitative traits can be incorporated as annotation for a sequence 
database, but that is clearly a topic that needs to be addressed.
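
To make points 1 and 4 through 8 concrete, here is a minimal sketch of 
how map, expression and functional data could all be stored as 
annotation on sequence coordinates in a relational database. It is 
only an illustration under stated assumptions: it uses Python's 
built-in sqlite3 module as a stand-in for MySQL or PostgreSQL, and the 
table names, column names, the accession "DEMO000001" and every 
annotation row are hypothetical, not taken from GMOD, NCBI or UCSC.

```python
import sqlite3

# Hypothetical two-table schema: every kind of data (map marker,
# expression, function, promoter, polymorphism) is one row in
# "annotation", anchored to a coordinate range on a sequence.
SCHEMA = """
CREATE TABLE sequence (
    accession TEXT PRIMARY KEY,   -- stands in for an NCBI accession
    species   TEXT NOT NULL,
    length    INTEGER NOT NULL
);
CREATE TABLE annotation (
    id        INTEGER PRIMARY KEY,
    accession TEXT NOT NULL REFERENCES sequence(accession),
    start_pos INTEGER NOT NULL,   -- 1-based, inclusive
    end_pos   INTEGER NOT NULL,   -- 1-based, inclusive
    kind      TEXT NOT NULL,
    label     TEXT NOT NULL,
    CHECK (start_pos >= 1 AND end_pos >= start_pos)
);
CREATE INDEX annotation_by_position
    ON annotation (accession, start_pos, end_pos);
"""

# Made-up example rows, for illustration only.
DEMO_ANNOTATIONS = [
    ("DEMO000001", 1000, 1500, "promoter", "hypothetical promoter"),
    ("DEMO000001", 1200, 1200, "polymorphism", "hypothetical SNP"),
    ("DEMO000001", 1501, 4000, "expression", "hypothetical EST cluster"),
    ("DEMO000001", 1501, 4000, "function", "hypothetical ontology term"),
    ("DEMO000001", 5000, 5200, "map_marker", "hypothetical marker"),
]


def build_demo_db():
    """Create an in-memory database holding the demo sequence and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO sequence VALUES (?, ?, ?)",
        ("DEMO000001", "Bos taurus", 10000),
    )
    conn.executemany(
        "INSERT INTO annotation (accession, start_pos, end_pos, kind, label)"
        " VALUES (?, ?, ?, ?, ?)",
        DEMO_ANNOTATIONS,
    )
    conn.commit()
    return conn


def annotations_overlapping(conn, accession, start, end):
    """Return (kind, label, start_pos, end_pos) rows overlapping [start, end]."""
    return conn.execute(
        "SELECT kind, label, start_pos, end_pos FROM annotation"
        " WHERE accession = ? AND start_pos <= ? AND end_pos >= ?"
        " ORDER BY start_pos, kind",
        (accession, end, start),
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for row in annotations_overlapping(db, "DEMO000001", 1100, 1600):
        print(row)
```

Because every data type shares one coordinate system, a single range 
query returns the promoter, the polymorphism, the expression data and 
the functional term together. A production schema would need much 
more (strand, evidence codes, versioning of the underlying assembly).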

   Enough from me for now.

   Cheers,

   Dave

   --
   David L. Adelson, Ph.D.
   Associate Professor - Animal Genomics
   Dept. of Animal Science
   Dept. of Veterinary Anatomy and Public Health
   Faculty of Genetics

   Postal address:
   Animal Science Dept.
   Texas A&M University
   TAMU - 2471
   College Station, TX 77843-2471
   U.S.A.

   http://animalscience.tamu.edu/ansc/breeding.html



