Little databases

Andrew John Millar ajm2m at FARADAY.CLAS.VIRGINIA.EDU
Mon May 17 09:06:16 EST 1993

AIMS and Aatdb will never hold more than a tiny proportion of the
data produced by the labs working on Arabidopsis. Much of this
data will never leave the lab or even the individual who
generated it, yet almost every project depends on such
information: the unpublished replicates, pedigrees and lab
records. The sheer volume of this data should place it on the
agenda of the upcoming database workshop.

The labs engaged in large-scale mapping or sequence acquisition,
for example, are already running "focused" databases to manage
their primary data. Judging by mailings from 1991, some of
these labs had coordinated their efforts to facilitate data
exchange. Most projects will not require such high-volume data
handling, so the databases currently in use probably vary widely
among labs, each of which has either developed its own or
adapted an existing program (cf. Danny Chamovitz's note: more
requests for programs than programs on offer). In many cases,
however, the structure of our primary data is independent of
its context. Any mutant isolation
project, for example, will need to organize phenotypic data for
dozens of candidate lines over many generations, including
back-crosses and complementation tests. Most phenotypes could
likely be represented by similar numbers (and types) of data
fields, so a standard database with a modicum of flexibility
could accommodate the needs of many labs.
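
As a rough illustration of what "similar numbers (and types) of data
fields" might mean in practice, here is a minimal sketch in Python.
Every name in it (CandidateLine, Observation, Cross, the trait and
generation labels) is hypothetical and chosen only for illustration;
it is not a proposed standard, just one way a per-line record with
flexible trait fields could be laid out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

# Values are loosely typed so each lab can score its own traits
# (a measurement, a yes/no score, a free-text note). Assumption only.
FieldValue = Union[str, int, float, bool]


@dataclass
class Observation:
    """One phenotypic score for a line in a given generation."""
    generation: str   # free-form label, e.g. "M2" (illustrative)
    trait: str        # lab-defined trait name (hypothetical)
    value: FieldValue


@dataclass
class Cross:
    """A back-cross or complementation test involving the line."""
    kind: str                      # "backcross" or "complementation"
    partner: str                   # identifier of the other parent
    outcome: Optional[str] = None  # filled in once progeny are scored


@dataclass
class CandidateLine:
    """All primary data collected for one candidate mutant line."""
    line_id: str
    observations: List[Observation] = field(default_factory=list)
    crosses: List[Cross] = field(default_factory=list)

    def record(self, generation: str, trait: str, value: FieldValue) -> None:
        """Append one phenotypic observation."""
        self.observations.append(Observation(generation, trait, value))

    def add_cross(self, kind: str, partner: str,
                  outcome: Optional[str] = None) -> None:
        """Append a cross, rejecting kinds this sketch does not model."""
        if kind not in ("backcross", "complementation"):
            raise ValueError("unknown cross kind: " + kind)
        self.crosses.append(Cross(kind, partner, outcome))

    def by_generation(self) -> Dict[str, Dict[str, FieldValue]]:
        """Group observations as {generation: {trait: value}}.

        A later score for the same trait and generation overwrites
        an earlier one.
        """
        table: Dict[str, Dict[str, FieldValue]] = {}
        for obs in self.observations:
            table.setdefault(obs.generation, {})[obs.trait] = obs.value
        return table
```

A lab would create one CandidateLine per candidate, call record() as
each generation is scored, and call add_cross() for each back-cross or
complementation test. The trait names are left as free text here so
that different projects could configure their own.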

I suggest that the development of a standard database for primary
data will be worthwhile for the community. The database would be
used within a lab, or even within a project, to organize data as
it is generated (possibly with configurations for different data
types). This will save individual labs the time required to write
or modify a program of their own, though this is far from being
the only advantage. Central development should foster greater
capabilities in a database than time and expertise allow for
home-grown systems: still video, for example, might be used to
store images of plants in the database. Coordination with
Aatdb/AIMS organizers should streamline both submission of lab
data to the archives and also downloading of archived data for
use in the lab.

The focused database(s) would complement the central, archival
databases but have entirely different goals. Does anyone else on
the network feel that a coordinated effort in this direction
could yield a tool (or tools) of general utility? If so, for what types
of data?

Andrew Millar
