Archive-name: acedb-faq
Last-modified: 11/5/93
Version: 1.5
----------------------------------------------------------------------
Common Questions and Answers about ACEDB.
This document will be posted monthly to the BIOSCI newsgroup
bionet.software.acedb and to the USENET conference news.answers.
It is intended to be used as an index to ACEDB databases and to
information about the database software.
The latest version of the ACEDB FAQ should be available via
anonymous ftp at net.bio.net as /pub/BIOSCI/ACEDB/ACEDB.FAQ
and at rtfm.mit.edu as /pub/usenet/news.answers/
bionet.software.acedb/bionet.software.acedb.FAQ and via
electronic mail from mail-server at rtfm.mit.edu [untried --bks].
Curators of ACEDB databases should take note of Question 4 and
keep me apprised of changes.
Errors of commission or omission are unintentional. If I have
forgotten to give you credit please let me know. Please
send comments and corrections to: acedbfaq at s27w007.pswfs.gov
--Bradley K. Sherman
----------------------------------------------------------------------
List of questions in the ACEDB FAQ:
Q0: What is ACEDB?
Q1: What is the current version of ACEDB?
Q2: What {hardware | software} do I need to run ACEDB?
Q3: Where can I get ACEDB?
Q4: What ACEDB databases exist?
Q5: What written documentation exists for ACEDB?
Q6: Where can I find further information about ACEDB?
Q7: How should ACEDB be cited?
Q8: Is ACEDB object-oriented?
Q9: What's all this about Gopher/WAIS/Anonymous ftp/WWW ...
Q10: How can I get on the ACEDB announcements mailing list?
Q411: Who contributed to this document?
----------------------------------------------------------------------
Q0: What is ACEDB?
A0: ACEDB is an acronym for A Caenorhabditis elegans Database. It can
refer to a database and data concerning the nematode C. elegans,
or to the database software alone. This document is concerned
primarily with the latter meaning. ACEDB is being adapted by many
groups to organize molecular biology data about the genomes of
diverse species [see Q4].
ACEDB allows for automatic cross-referencing of items during
loading and allows for hypertextual navigation of the links
using a graphical user interface and mouse. Certain special
purpose graphical displays have been integrated into the
software. These reflect the needs of molecular biologists
in constructing genetic and physical maps of genomes.
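The cross-referencing described above can be pictured with a small
sketch. It is only an illustration in Python, not ACEDB code: the
RECIPROCAL table, the class and tag names, and the tuple layout are all
invented for the example. The idea shown is that loading one link also
records the link in the opposite direction, so either object can be
reached from the other.

```python
from collections import defaultdict

# Hypothetical table: (class, tag) -> tag used for the back link on the
# target object. These names are illustrative and are not taken from
# any real ACEDB models file.
RECIPROCAL = {
    ("Paper", "Author"): "Paper",
    ("Gene", "Clone"): "Gene",
}


def load(records):
    """Load (class, name, tag, target_class, target_name) tuples.

    Each forward link is stored, and if the (class, tag) pair has a
    reciprocal tag, the matching back link is stored as well.
    """
    db = defaultdict(lambda: defaultdict(set))
    for cls, name, tag, target_cls, target in records:
        db[(cls, name)][tag].add((target_cls, target))
        back_tag = RECIPROCAL.get((cls, tag))
        if back_tag is not None:
            db[(target_cls, target)][back_tag].add((cls, name))
    return db


def neighbours(db, cls, name):
    """Return every object one link away from (cls, name), sorted."""
    tags = db.get((cls, name), {})
    return sorted({obj for targets in tags.values() for obj in targets})
```

For example, after load([("Paper", "P1", "Author", "Author", "Smith")]),
asking for the neighbours of Author "Smith" returns the paper P1 even
though only the paper-to-author link was supplied.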
ACEDB was written and developed by Richard Durbin (MRC LMB
Cambridge, England) and Jean Thierry-Mieg (CNRS, Montpellier,
France), beginning circa 1990. It is written in the C programming
language and uses the X11 windowing system to provide a
platform-independent graphical user interface. The source code is
publicly available [see Q3]. Durbin & Thierry-Mieg continue to develop
the system, with contributions from other groups including
Lawrence Berkeley Laboratory and the European integrated Genome
Project.
A description by Durbin & Thierry-Mieg:
ACEDB does not use an underlying relational database
schema, but a system we wrote ourselves in which data
are stored in objects that belong in classes. This is
nevertheless a general database management system using
caches, session control, and a powerful query language.
Typical objects are clones, genes, alleles, papers,
sequences, etc. Each object is stored as a tree,
following a hierarchical structure for the class (called
the "model"). Maps are derived from data stored in tree
objects, but precomputed and stored as tables for
efficiency. The system of models allows flexibility
and efficiency of storage: missing data are not stored.
A major advantage is that the models can be extended
and refined without invalidating an existing database.
Comments can be added to any node of an object.
Current display modes are:
TREE for text type objects: papers, authors, genes
etc.
GMAP genetic map
PMAP physical map (Sulston contig style)
SEQ DNA sequence - symbolic, features, sequence
and translation
GRID hybridisation patterns for a probe to a clone
grid
BIBLIO bibliography attached to any object
Display modules under development:
CMAP whole chromosome physical map plot
GEL agarose gel simulation derived from sequence
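The model-and-tree idea in the description above can be sketched as
follows. This is a rough Python illustration under stated assumptions,
not the ACEDB implementation: the nested-dict model, the tag names, and
the "_values" key are all invented here. It shows three of the points
made in the description: an object only holds the branches for which it
has data, the model limits which branches are allowed, and a model can
be extended without invalidating objects stored earlier.

```python
# A "model" here is a nested dict of allowed tags; leaves are None.
# The tags are invented for illustration and are not copied from a
# real models.wrm file.
GENE_MODEL = {
    "Map": {"Chromosome": None, "Position": None},
    "Reference": {"Paper": None},
}


def allowed(model, path):
    """True if the tag path exists in the model."""
    node = model
    for tag in path:
        if not isinstance(node, dict) or tag not in node:
            return False
        node = node[tag]
    return True


def add_value(obj, model, path, value):
    """Attach a value under a tag path, creating only the branches needed."""
    if not allowed(model, path):
        raise KeyError("path not in model: " + "/".join(path))
    node = obj
    for tag in path:
        node = node.setdefault(tag, {})
    node.setdefault("_values", []).append(value)


def count_nodes(obj):
    """Count stored tag nodes; branches with no data are simply absent."""
    return sum(1 + count_nodes(child)
               for tag, child in obj.items() if tag != "_values")


def conforms(obj, model):
    """True if every stored branch is permitted by the model."""
    for tag, child in obj.items():
        if tag == "_values":
            continue
        if not isinstance(model, dict) or tag not in model:
            return False
        if not conforms(child, model[tag]):
            return False
    return True
```

A gene with only a chromosome assignment stores two nodes (Map and
Chromosome) and nothing for Position or Reference. Adding a new branch
to the model, for example dict(GENE_MODEL, Phenotype={"Description":
None}), leaves that stored object valid under conforms().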
----------------------------------------------------------------------
Q1: What is the current version of ACEDB?
A1: 1-10. It was released Summer 1993. The next release will be 2.0.
This refers to the version of the software, not the data. To
be kept informed of new releases see Q10.
----------------------------------------------------------------------
Q2: What {hardware | software} do I need to run ACEDB?
A2: ACEDB currently runs on the following Unix systems, under X11:
Unix:
Any machine running SunOS 4.x
e.g. Sun SPARCstation 1, 1+, 2, IPC, IPX.
SPARCstation 10 under Solaris [Probably all Solaris, then --bks]
DEC DECstation 3100, 5100 etc.
DEC Alpha/OSF-1
Silicon Graphics Iris series
PC 386/486 with Linux (a freely available Unix-like system)
There exist, or have existed, ports to Alliant, Hewlett-Packard,
IBM RS/6000, NeXT, and Convex machines. You may have to contact the
developer responsible for a given port to obtain a working version.
MSDOS/Windows/NT:
A port to NT is rumored to be in the works.
Macintosh:
A port to the Macintosh may become available by the end of 1993.
For cost savings, a combination of a high-end Intel platform
with Linux appears very attractive.
----------------------------------------------------------------------
Q3: Where can I get ACEDB?
A3: All the files are available in the following public access
accounts (anonymous ftp sites) accessible via Internet:
lirmm.lirmm.fr (193.49.104.10) genome/acedb
cele.mrc-lmb.cam.ac.uk (131.11.84.1) pub/acedb
ncbi.nlm.nih.gov (130.14.20.1) repository/acedb
A typical session would be:
ftp ncbi.nlm.nih.gov
login: anonymous
password: your email address
cd repository/acedb
binary
ls
get README
get NOTES
get INSTALL
get bin.sparc.1_4.tar.Z
quit
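The same session can be scripted. The sketch below uses Python's
standard ftplib module; the host, directory, and file names are the ones
from the session above, while the function name fetch, the dest_dir
argument, and the ftp_factory argument (which exists only so the
function can be exercised without a network connection) are assumptions
made for this example.

```python
import ftplib
import os


def fetch(host, directory, filenames, email, dest_dir=".",
          ftp_factory=ftplib.FTP):
    """Log in anonymously, change directory, and download each file.

    Files are transferred in binary mode, as in the manual session.
    Returns the list of local paths that were written.
    """
    ftp = ftp_factory(host)
    try:
        # By convention the password for anonymous ftp is an email address.
        ftp.login("anonymous", email)
        ftp.cwd(directory)
        for name in filenames:
            with open(os.path.join(dest_dir, name), "wb") as out:
                ftp.retrbinary("RETR " + name, out.write)
    finally:
        ftp.quit()
    return [os.path.join(dest_dir, name) for name in filenames]
```

A call matching the session above would look like
fetch("ncbi.nlm.nih.gov", "repository/acedb", ["README", "NOTES",
"INSTALL", "bin.sparc.1_4.tar.Z"], "your email address").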
----------------------------------------------------------------------
Q4: What ACEDB databases exist?
A4: [In alphabetic order by Database name --bks]
Database : AAnDB
Species : Aspergillus nidulans
PI : Leland Ellis
Last_update : Sept. 1993
Database : AAtDB
Species : Arabidopsis thaliana
Availability :
Curator : John Morris
Current version: 1-5
Contact : curator at frodo.mgh.harvard.edu
Last_update : Sept. 1993
Database : ACeDB
Species : Caenorhabditis elegans
Availability :
Current version: 1-21
Curator : Jean Thierry-Mieg
Curator : Richard Durbin
Contact : rd at cele.mrc-lmb.cam.ac.uk
Contact : mieg at kaa.cnrs-mop.fr
Last_update : Sept. 1993
Database : ChlamyDB
Species : Chlamydomonas
PI : Elizabeth Harris
Contact : chlamy at acpub.duke.edu
Availability : Still under construction
Last_update : 30 Sept. 1993
Database : EcoDB
Species : E. coli
PI : Staffan Bergh
Contact : staffan at biochem.kth.se
Availability : Still under construction
Last_update : 11 Oct. 1993
Database : Flydb
Species : Drosophila melanogaster
Availability : by request only, via ftp
Curator : Suzanna E. Lewis
Contact : SELewis at lbl.gov
Focus : STS content mapping project summary
PI : Gerald Rubin
PI : Mike Palazzolo
PI : Dan Hartl
PI : Alan Spradling
Last_update : Sept. 1993
Database : GrainGenes
Species : Wheat, barley, oats, relatives
Availability : Gopher greengenes.cit.cornell.edu port 70
Availability : ACEDB version by ftp, on request from the curators
Curator : David E. Matthews
PI : Olin D. Anderson
Contact : matthews at greengenes.cit.cornell.edu
Contact : oandersn at wheat.usda.gov
Last_update : Sept. 1993
Database : Mace
Species : Zea mays L. ssp. mays
Focus : Maize genome
Comment : Mace is the front end for maizedb, a relational
(SYBASE) database. It is updated from maizedb by
software written by Stan Letovsky. Maizedb is
updated daily and will soon be accessible by
public login.
Curator : Ed Coe
Curator : Pat Byrne
Curator : Georgia Davis
Curator : Mary Polacco
Off-Site Curator : Marty Sachs
Off-Site Curator : Christiane Fauron
Off-Site Curator : Carolyn Wetzel
Off-Site Curator : Steve Rodermel
Off-Site Curator/Designer : Stan Letovsky
Off-Site Curator/Designer : Mary Berlyn
Systems Manager : Denis Hancock
PI : Ed Coe
Contact : maizedb at teosinte.agron.missouri.edu
Last_update : 5 October 1993
Database : MycDB
Species : Mycobacterium
PI : Staffan Bergh
PI : Thierry Garnier
Contact : staffan at pasteur.fr
Last_update : Sept. 1993
Database : RiceGenes
Species : Rice (O. sativa)
Availability : under development, login at own risk
Curator : Edie Paul
Contact : epaul at nightshade.cit.cornell.edu
Last_update : Sept. 1993
Database : SolGenes
Coverage : Solanaceae - tomato, potato, pepper (eventually)
Availability : Beta ACEDB via login or tar file
Curator : Edie Paul
Contact : epaul at nightshade.cit.cornell.edu
Last_update : Sept. 1993
Database : SoyBase
Species : Soybeans
Curator : Lisa Lorenzen
PI : Randy Shoemaker
Contact : lorenzen at mendel.agron.iastate.edu
Last_update : Sept. 1993
Database : TreeGenes
Species : Forest trees, Pinus taeda
Availability : contact curator
Curator : Bradley K. Sherman
PI : David B. Neale
Contact : Dendrome at s27w007.pswfs.gov
Contact : bks at s27w007.pswfs.gov
Contact : dbn at s27w007.pswfs.gov
Last_update : Sept. 1993
Database : 21Bdb
Species : Homo sapiens
Availability : by request, via ftp, gopher
Curator : Donn F. Davy
Contact : DFDavy at lbl.gov
Contact : aggarwal at genome.lbl.gov
Focus : STS content mapping & sequencing of Human Chromosome 21
PI : Jasper Rine
PI : Michael Palazzolo
PI : Chris Martin
PI : Jan-Fang Cheng
Last_update : Sept. 1993
Database : VoxPop
Species : Populus spp.
Availability : contact curator
Curator : Carl G. Riches
PI : Reinhard F. Stettler
Contact : cgr at poplar1.cfr.washington.edu
Contact : STETTLER at coyote.cfr.washington.edu
Last_update : Sept. 1993
Database : ?
Species : Bovine
PI : Leland Ellis
Last_update : Sept. 1993
Database : ?
Species : Sorghum
PI : Leland Ellis
Last_update : Sept. 1993
Database : ?
PI : Scott Chasalow
Species : Potato
Contact : Scottish Crop Institute, Dundee
Last_update : Sept. 1993
Database : ?
PI : George Murphy
PI : David Flanders
Species : Arabidopsis thaliana
Contact : John Innes Center, Norwich, England
Last_update : Sept. 1993
Database : ?
Species : Homo sapiens
Focus : Physical mapping of human chromosomes 22 and X
Curator : Ian Dunham
Contact : idunham at crc.ac.uk
Contact : id1 at sanger.ac.uk
PI : Ian Dunham
PI : David Bentley
Last_update : 28 Sep 1993
[Curators: Please submit an entire paragraph in
this format for inclusion or update. --bks]
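Because the entries above follow a regular "Key : value" layout, they
are easy to read by program. The following Python sketch is one
possible way to do it; the function name parse_records and the choice
to keep every value in a list (since keys such as Curator and PI
repeat) are assumptions of this example, not part of any ACEDB tool. A
line with no colon is treated as the wrapped continuation of the
previous value, as in the Mace Comment field.

```python
def parse_records(text):
    """Split 'Key : value' lines into records.

    A 'Database' key starts a new record. Each record maps a key to the
    list of its values, in the order they appear.
    """
    records = []
    last = None  # (key, index) of the most recent value
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            # No colon: a wrapped continuation of the previous value.
            if last is not None:
                prev_key, index = last
                records[-1][prev_key][index] += " " + line
            continue
        key, value = key.strip(), value.strip()
        if key == "Database" or not records:
            records.append({})
        values = records[-1].setdefault(key, [])
        values.append(value)
        last = (key, len(values) - 1)
    return records
```

Given the Mace entry, for instance, the record's "Curator" key would
hold the four curator names as a list, and an empty field such as
"Availability :" would hold a single empty string.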
----------------------------------------------------------------------
Q5: What written documentation exists for ACEDB?
A5: The primary documents are included in the Software
distribution in the wdoc subdirectory:
acedb -- A C. elegans Database: I. Users' Guide.
acedb -- A C. elegans Database: II. Installation Guide.
acedb -- A C. elegans Database: III. Configuration Guide.
Syntactic Definitions for the ACEDB Data Base Manager
--Jean Thierry-Mieg and Richard Durbin (1991-)
You will find other interesting documents in the wdoc subdirectory.
By anonymous ftp from ncbi.nlm.nih.gov (130.14.20.1)
in repository/acedb:
doc.1_9.tar.Z
Cherry, J.M., Cartinhour, S.W., and Goodman, H.M. (1992) AAtDB,
An Arabidopsis thaliana Database. Plant Molecular Biology Reporter
10 (4): 308-309,409-410
Tutorial manual for AAtDB:
Cartinhour, S., Cherry, J.M., and Goodman, H.M. (1992) An
Introduction to ACeDB: For AAtDB, An Arabidopsis thaliana
Database. Massachusetts General Hospital. (Available on
request in printed form from the AAtDB curator).
A description of ACEDB:
Cherry, J.M. and Cartinhour, S.W. (1993) ACEDB, A tool for
biological information. in Automated DNA Sequencing and
Analysis, edited by M. Adams, C. Fields, and C. Venter.
Academic Press (in press). [text is available through
ftp or gopher from weeds.mgh.harvard.edu]
Another description of ACEDB for physical mapping projects:
Dunham, I., Durbin, R., Mieg, J-T & Bentley, D.R. (1993)
Physical mapping projects and ACEDB, in Guide to Human
Genome Computing. Ed. Bishop, M.J. (Academic Press)
(review, in press). [text is available through ftp or
gopher from weeds.mgh.harvard.edu]
----------------------------------------------------------------------
Q6: Where can I find further information about ACEDB?
A6: There is a Usenet/Biosci conference titled bionet.software.acedb.
If you do not have access to the Biosci conferences via a
newsreader (e.g. rn, trn) you can participate in the conference
by electronic mail. To subscribe to the e-mail version of the
conference send email to biosci-server at net.bio.net with no
subject line and only the message
subscribe ACEDB-SOFT
in the body. To unsubscribe send the message
unsubscribe ACEDB-SOFT
to the same address. This is an automated service. Your
e-mail address will be taken from the header of the message
that you send. If you then send mail to acedb at net.bio.net
the mail will be distributed to all subscribers and to
the electronic conference.
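As an illustration of the message format described above, the Python
sketch below builds (but does not send) a subscribe or unsubscribe
request using the standard email package. The function name
subscription_message and the sender address in the usage note are
placeholders invented for this example; the server address is the one
given above with "at" written as "@".

```python
from email.message import EmailMessage

# Address from the text above, with "at" written as "@".
SERVER = "biosci-server@net.bio.net"


def subscription_message(sender, subscribe=True, list_name="ACEDB-SOFT"):
    """Build a request with no Subject header and a one-line body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER
    # No Subject header is set, matching "with no subject line".
    verb = "subscribe" if subscribe else "unsubscribe"
    msg.set_content(verb + " " + list_name)
    return msg
```

Calling subscription_message("user@example.org") gives a message whose
body is the single line "subscribe ACEDB-SOFT"; passing subscribe=False
gives the matching unsubscribe request. Handing the message to a mail
program is left out on purpose.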
Mike Cherry has set up an ACEDB Developer's archive. For
anonymous ftp use the hostname weeds.mgh.harvard.edu and look in
the acedb_dev directory. If you wish to contribute you can put
files in the incoming directory. Send a message to Mike
(cherry at genome.stanford.edu) saying that you have put something in
that directory; Mike will then move it out for general access.
For gopher you can connect to weeds.mgh.harvard.edu
(132.183.190.21) and ...
--> N. FTP Archives for Molecular Biology/
then
--> M. ACEDB Developer's archive/
[N and M are integers which are subject to change.]
The bionet.software.acedb conference is archived and can be
searched using WAIS. Here is a Gopher-style link to the WAIS
archive. (This is also courtesy of Mike Cherry.):
#
Type=7
Name=ACEDB BioSci Electronic Conference
Path=7/.index/acedb-biosci
Host=genome-gopher.stanford.edu
Port=70
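The link above is a set of Key=value lines, so it can be read with a
few lines of Python. This is only a sketch: the function name
parse_gopher_link and the small TYPE_NAMES table are inventions of this
example. Item type 7 denotes an index-search server in the Gopher
protocol, and 70 is the usual Gopher port.

```python
# A few Gopher item types, for display purposes only.
TYPE_NAMES = {"0": "file", "1": "directory", "7": "index search"}


def parse_gopher_link(text):
    """Parse 'Key=value' lines into a dict; '#' lines and blanks are skipped."""
    link = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        link[key] = value
    # Store the port as an integer, defaulting to the usual Gopher port.
    link["Port"] = int(link.get("Port", 70))
    return link


def describe(link):
    """One-line summary of a parsed link."""
    kind = TYPE_NAMES.get(link.get("Type"), "unknown type")
    return "%s (%s) on %s port %d" % (
        link.get("Name", "?"), kind, link.get("Host", "?"), link["Port"])
```

Applied to the link above, describe() reports an index search named
"ACEDB BioSci Electronic Conference" on genome-gopher.stanford.edu
port 70.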
The AAtDB, Soybase, GrainGenes, Mace, and TreeGenes (see Q4)
databases regularly submit data to the Plant Genome Database
at the National Agricultural Library (NAL). NAL makes these
data available using a WWW server (really http) with the
Universal Resource Locator (URL) http://locus.nalusda.gov.
You will also find a selection of models.wrm files (schemata)
for the various databases here. You will want to get a
"mosaic client" to examine this. [This section will be
expanded in the next version. We hope to make this
document available as hypertext on the NAL server --bks]
Other URLs that readers with mosaic clients might want to
examine are:
http://moulon.inra.fr/acedb/acedb.html for C. elegans data
http://moulon.inra.fr/acedb/mycdb.html for Mycobacterium data
For information on how these were created see
http://moulon.inra.fr/acedb_conf_eng.html
http://moulon.inra.fr/acedb_conf.html (in French)
The Genome Computing Group, Lawrence Berkeley Laboratory
has an anonymous ftp service at machine genome.lbl.gov
(131.243.224.80) which contains:
flydb - LBL's Drosophila Acedb-style database
21bdb - LBL's Human Chromosome 21 Acedb-style database
querdb - LBL's query-language extensions to Acedb
metadata - LBL's compendium of Acedb database schema variants
macace-aatdb-demo.hqx - pre-release Acedb Macintosh version
There is also a repository of contributed software for
data conversions and the like.
Computer staff for the UC Berkeley Drosophila physical mapping
project, the LBL Human Chromosome 21 project, and the LBL plant
genome projects meet regularly to coordinate their ACEDB
extension and development efforts, along with Frank Eeckman,
who is working on the Macintosh version of ACEDB (for further
information, contact jlmccarthy at lbl.gov). They also keep in
close touch (via email, personal visits, etc.) with their
counterparts in Cambridge (Richard Durbin et al), Montpellier
(Jean Thierry-Mieg et al), and the Integrated Genome Database
project in Heidelberg (Otto Ritter, Detlef Wolf et al).
----------------------------------------------------------------------
Q7: How should ACEDB be cited?
A7: From the distribution:
We realize that we have not yet published any "real" paper on
ACEDB. We consider however that anonymous ftp servers are a
form of publication. We would appreciate if users of ACEDB
could quote:
Richard Durbin and Jean Thierry-Mieg (1991-). A C. elegans
Database. Documentation, code and data available from
anonymous FTP servers at lirmm.lirmm.fr,
cele.mrc-lmb.cam.ac.uk and ncbi.nlm.nih.gov.
Papers involved in database development could quote more
precisely:
I. Users' Guide. Included as part of the ACEDB distribution
kit,
II. Installation Guide. Included as part of the ACEDB
distribution
III. Configuration Guide. Included as part of the ACEDB
distribution
and the preprintkit, available by Anonymous FTP from ...
Jean Thierry-Mieg and Richard Durbin (1992). Syntactic
Definitions for the ACEDB Data Base Manager. Included as
part of the ACEDB distribution.
--Jean and Richard.
----------------------------------------------------------------------
Q8: Is ACEDB object-oriented?
A8: From the ACEDB Users' Guide:
A major current vogue in computer languages and database design
is for ``object-oriented'' systems. It's also a source of lots
of argument. We are just trying to build a good system, and
don't want to get caught in the crossfire, but we do talk about
organising our data into objects and classes. We have undoubtedly
been influenced by many of the ideas going around, but it isn't
likely our system would be regarded as kosher by the object-
oriented community. In particular there is no class hierarchy, nor
inheritance, and it is written in a modular but non-ideological way
in straight C. However display and disk storage methods are class
dependent.
In some ways the class hierarchy is replaced by our system of
models and trees, which seems to be rather unusual. We think it
is very natural for the representation of biological information,
where for some members of a class a lot might be known about some
aspect, but for most only a little is known.
The advantages of our system over a relational database, such as
Oracle or Sybase, are our ability to refine our descriptions without
rebuilding the database and the possibility of organising the
storage of data on disk according to their class, i.e. we store in
a very different way the tree-objects and the long stretches of
DNA sequence.
----------------------------------------------------------------------
Q9: What's all this about Gopher/WAIS/Anonymous ftp/WWW ...
A9: These terms all refer to Internet protocols and services for
finding and retrieving information.
An excellent introduction to the Internet is:
_The Whole Internet User's Guide & Catalog_,
by Ed Krol, O'Reilly & Associates, 1992.
Or ask your system administrator to provide you with
a gopher client or mosaic client and begin navigating
on your own.
----------------------------------------------------------------------
Q10: How can I get on the ACEDB announcements mailing list?
A10: To get on or off the mailing list send mail to
rd at mrc-lmda.cam.ac.uk or mieg at kaa.cnrs-mop.fr.
New releases of the software are announced to
this list.
----------------------------------------------------------------------
Q411: Who contributed to this document?
[Note to international readers: 411 is the phone number for
information in the USA. --bks]
A411: Major contributions in getting this FAQ off the ground
were made by John McCarthy and Mike Cherry. Other
contributors include:
Lisa Lorenzen
David Matthews
Edie Paul
Donn Davy
Eric De Mund
Sam Cartinhour
To add or modify information in this document, please
send mail to: acedbfaq at s27w007.pswfs.gov
Bradley K. Sherman
Dendrome Project
Institute of Forest Genetics
P.O. Box 245, Berkeley, CA, 94701
Phone: 510-559-6437 Fax: 510-559-6440
The Dendrome Project and TreeGenes are funded by the
USDA-ARS Plant Genome Database Project.
---------------------End of file acedb-faq----------------------------