Thu Apr 18 18:48:00 EST 1991

RE: Database for Arabidopsis.

We have a relational database for the E. coli Genetic Stock Center that
describes Strains, Genes, Mutations, Pedigrees, Map, Gene Products,
Chromosome Segments, etc., etc., with References and Persons of course
associated with objects of virtually any type, and we are examining
the possibility of adapting the design to plants, in conjunction with
the USDA Plant Genome Project, and somewhat more specifically the
Maize DB Project.  Our database runs under the Sybase relational database
management system, and we have used Sybase's AptForms system and its version
of SQL to develop a form for each major object type; each form serves to
enter, query, and update information in the database.  We query for
combinations of alleles or any combination of traits or descriptors of
the strain or other object, or by position (e.g., deletions or insertions
or genes or whatever within a given region), or by ancestry (e.g., isogenic
lines or all strains descended from X).  We have made heavy use of, and
expanded on, the concept of an extensible controlled vocabulary for
describing such things as mutation type (deletion, inversion, base-pair
substitution, insertion, etc.), mutation property (temperature sensitive,
amber, ochre, etc.), and SiteType (gene, operon, chromosomal fragment,
sequenced region, etc.), and there would be a real advantage to standardizing
this vocabulary across species.  So I'm very interested in talking to
others with databases in mind.  The Plant Genome Office has given funds to NSF
for Arabidopsis database work, and it would certainly be good to coordinate
all the efforts at some level.  I'll be glad to provide you or others
with a description of our fields, tables, forms (and advanced query capability
and analytic tools, if you're interested), but not on this wire, since there
are over a hundred tables with 2-12 fields per table.
Mary Berlyn
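
For readers who want something concrete, the kinds of queries described
above (by ancestry, by position, and against an extensible controlled
vocabulary) can be sketched roughly as follows.  This is only an
illustration, not the Stock Center schema: it uses Python's sqlite3 module
as a stand-in for Sybase, and every table name, column name, strain name,
allele name, and map position in it is a made-up assumption.  The recursive
query is a modern SQL feature used here for brevity.

```python
import sqlite3

# Hypothetical miniature schema; all names here are illustrative assumptions.
SCHEMA = """
CREATE TABLE vocab (              -- extensible controlled vocabulary
    category TEXT NOT NULL,       -- e.g. 'mutation type'
    term     TEXT NOT NULL,       -- e.g. 'deletion'
    PRIMARY KEY (category, term)
);
CREATE TABLE strain (name TEXT PRIMARY KEY);
CREATE TABLE pedigree (           -- one row per parent-to-child derivation
    parent TEXT NOT NULL REFERENCES strain(name),
    child  TEXT NOT NULL REFERENCES strain(name),
    PRIMARY KEY (parent, child)
);
CREATE TABLE mutation (
    allele        TEXT PRIMARY KEY,
    strain        TEXT NOT NULL REFERENCES strain(name),
    category      TEXT NOT NULL DEFAULT 'mutation type'
                  CHECK (category = 'mutation type'),
    mutation_type TEXT NOT NULL,
    left_pos      REAL NOT NULL,  -- map position, arbitrary units
    right_pos     REAL NOT NULL,
    FOREIGN KEY (category, mutation_type) REFERENCES vocab(category, term)
);
"""


def add_mutation(conn, allele, strain, mutation_type, left_pos, right_pos):
    """Insert a mutation; the type must already be in the vocabulary."""
    conn.execute(
        "INSERT INTO mutation (allele, strain, mutation_type, left_pos, right_pos)"
        " VALUES (?, ?, ?, ?, ?)",
        (allele, strain, mutation_type, left_pos, right_pos),
    )


def build_demo_db():
    """Create an in-memory database holding a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO vocab VALUES ('mutation type', ?)",
                     [("deletion",), ("inversion",), ("insertion",)])
    conn.executemany("INSERT INTO strain VALUES (?)",
                     [("X",), ("A",), ("B",), ("C",), ("D",)])
    conn.executemany("INSERT INTO pedigree VALUES (?, ?)",
                     [("X", "A"), ("A", "B"), ("X", "C")])
    add_mutation(conn, "demo-1", "A", "deletion", 10.0, 12.0)
    add_mutation(conn, "demo-2", "B", "insertion", 11.0, 11.5)
    add_mutation(conn, "demo-3", "C", "deletion", 40.0, 41.0)
    return conn


def descendants(conn, ancestor):
    """Query by ancestry: all strains descended from `ancestor`."""
    rows = conn.execute(
        """
        WITH RECURSIVE line(name) AS (
            SELECT child FROM pedigree WHERE parent = ?
            UNION
            SELECT p.child FROM pedigree AS p JOIN line AS l ON p.parent = l.name
        )
        SELECT name FROM line ORDER BY name
        """,
        (ancestor,),
    ).fetchall()
    return [r[0] for r in rows]


def mutations_in_region(conn, left, right, mutation_type=None):
    """Query by position: alleles lying wholly within [left, right]."""
    sql = "SELECT allele FROM mutation WHERE left_pos >= ? AND right_pos <= ?"
    params = [left, right]
    if mutation_type is not None:
        sql += " AND mutation_type = ?"
        params.append(mutation_type)
    sql += " ORDER BY allele"
    return [r[0] for r in conn.execute(sql, params)]
```

With the invented rows above, descendants(conn, "X") lists strains A, B,
and C, and mutations_in_region(conn, 9.0, 13.0, "deletion") finds only
demo-1.  The point of keeping the vocabulary in a table is that adding a
new mutation type is an INSERT into vocab rather than a schema change, and
a mutation whose type is not yet in the vocabulary is rejected.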

