Update on the AAtDB Research Companion Gopher Service

Mike Cherry CHERRY at FRODO.MGH.HARVARD.EDU
Tue Sep 14 19:23:56 EST 1993


Update on the AAtDB Research Companion, an electronic resource for the
worldwide Arabidopsis community.

Information contained within the AAtDB database and several other
collections of Arabidopsis information is available using a computer
connected to the worldwide Internet network. This information is
available using the Internet Gopher software or via public access
accounts.

The Internet Gopher software requires a computer that is directly
connected to the Internet (a simple modem connection is not
sufficient). The Gopher software that can be installed on your
computer is called the "client". The client interacts via the Internet
with any of several hundred different Gopher "servers" that provide
information to the client. Gopher clients also work with other
software on the local computer to present non-text information, for
example graphic images and sound recordings. Gopher clients are
available for most major models of computer. More information about
installing a Gopher client on your computer is given below.

Much of the information that Gopher provides is made accessible using
the WAIS (Wide Area Information Server) software developed by Thinking
Machines, Inc. The software creates an index containing the location
of every word in the original files. This allows the user to search
for any and all words, rather than just keywords that someone else
thought were sufficient to describe the document. WAIS indexed
collections allow the user to go directly to all parts of a document
or database that they are interested in exploring. A WAIS index is
recognized by a question mark ('?') icon on the Gopher menu. In
text-only Gopher clients, "<?>" is appended to the name of the WAIS
resource.
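
The full-text indexing idea described above can be illustrated with a
minimal sketch. This is not the WAIS implementation; the function name
build_index and the two sample documents are hypothetical and chosen
only to show how recording the location of every word makes any word
searchable.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each lowercased word to the (document name, word position)
    pairs where it occurs."""
    index = defaultdict(list)
    for name, text in documents.items():
        words = re.findall(r"[a-z0-9_]+", text.lower())
        for position, word in enumerate(words):
            index[word].append((name, position))
    return index

# Hypothetical sample documents, for illustration only.
docs = {
    "doc1": "Homeobox gene expression in Arabidopsis",
    "doc2": "A protein kinase gene",
}
index = build_index(docs)

# Every word is searchable, not just a hand-picked set of keywords.
print(index["gene"])
print(index["kinase"])
```

Because every word is an entry in the index, a lookup is a single
dictionary access rather than a scan of the original files.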

Gopher allows a variety of information to be presented in menus,
including but not limited to plain text files and Macintosh BinHex
encoded documents. Gopher also links to other types of resources. The
WAIS search engine mentioned above is one of the most powerful of
these. In addition, Gopher provides access to FTP servers.  FTP is a
utility used on the Internet to provide and transfer files between
computers.  With Gopher creating a menu of the files provided by an
FTP server, the task of finding and retrieving a file is greatly
simplified. Gopher provides a very low-cost method for the
distribution of large amounts of information to anyone connected to the
Internet. The AAtDB Research Companion Gopher server receives more
than 100 connections per day from users scattered around the world.
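
To make the menu idea concrete, here is a minimal sketch of how a
client might interpret the menu lines a Gopher server returns. The
line layout (a one-character item type, then tab-separated title,
selector, host and port, with a lone "." ending the menu) follows the
Gopher protocol as specified in RFC 1436, not anything stated in this
update. The host gopher.example.org, the selectors and the menu titles
in the sample are hypothetical.

```python
from typing import NamedTuple

# A few item types from RFC 1436; anything else is reported as "other".
ITEM_TYPES = {
    "0": "text file",
    "1": "menu",
    "4": "BinHex file",
    "7": "index search",
    "g": "GIF image",
    "I": "image",
}

class MenuItem(NamedTuple):
    kind: str
    title: str
    selector: str
    host: str
    port: int

def parse_menu(raw):
    """Turn a raw Gopher menu response into a list of MenuItem records."""
    items = []
    for line in raw.split("\r\n"):
        if line == ".":          # a lone period marks the end of the menu
            break
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4:      # skip malformed lines
            continue
        title, selector, host, port = fields[:4]
        kind = ITEM_TYPES.get(line[0], "other")
        items.append(MenuItem(kind, title, selector, host, int(port)))
    return items

def display(item):
    """Label search items the way a text-only client does, with '<?>'."""
    return item.title + (" <?>" if item.kind == "index search" else "")

# Hypothetical server response, for illustration only.
sample = (
    "1AAtDB Images\t/images\tgopher.example.org\t70\r\n"
    "7Search AAtDB\t/search\tgopher.example.org\t70\r\n"
    ".\r\n"
)
items = parse_menu(sample)
for item in items:
    print(display(item))
```

A real client would obtain the raw text by opening a TCP connection to
the server, sending the selector followed by a carriage return and line
feed, and reading until the server closes the connection.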

The AAtDB Research Companion for Arabidopsis Information is provided
by the Department of Molecular Biology at Massachusetts General
Hospital with the assistance of Digital Equipment Corporation and the
United States Department of Agriculture Plant Genome Research Program
through the National Agricultural Library. The AAtDB Research
Companion provides the complete information contained within the
workstation version of the AAtDB database that was described
elsewhere. In addition, the Companion offers the following:


AAtDB FTP archive

The AAtDB FTP archive is one of the menu items that connects to the
archive of files normally available via anonymous FTP from
weeds.mgh.harvard.edu. The Research Companion makes it possible to
retrieve the Unix AAtDB files to a Unix computer without using FTP at
the command line.  Included in the AAtDB FTP archive are installation
instructions for AAtDB, Macintosh and Unix Gopher Clients, as well as
a text version of the 1992 NSF Report on Arabidopsis Research.


Images

All image files distributed as part of the AAtDB database are provided
in the folder titled 'AAtDB Images'. These images include
autoradiograms of diagnostic Southern blots of the Goodman lab's RFLP
markers; ethidium bromide-stained agarose gels of restriction digests
of the Goodman cosmids and Meyerowitz lambda clones distributed by the
ABRC in Columbus, Ohio; and photographs taken by George Redei, Mary
Anderson, or the staff of the ABRC showing phenotypes of particular
Arabidopsis mutations or ecotypes. All of the images are currently
available in the GIF (Graphics Interchange Format) or JPEG (Joint
Photographic Experts Group) format. The GIF format provides some
compression, so the image file size is reduced yet no quality or
resolution is lost. The JPEG format is designed for compressing images
of natural scenes and optimizes the compression to lose information
that is not perceived by the human eye. The GIF and JPEG image formats
can readily be viewed with current graphics viewing software on most
workstations and personal computers.


Geographic Distribution of Arabidopsis

Jonathan Clarke produced this world map representing the distribution
of Arabidopsis thaliana (L.) Heynh. The image is provided in PICT, GIF
and Encapsulated PostScript formats.


Genetic Maps and Tables

Several Arabidopsis genetic maps are provided in tabular form.  Tables
of gene symbols and known genes (created by David Meinke) are also
included.


BioSci Arabidopsis Genome Electronic Conference

The BioSci service, funded by the NSF through Intelligenetics,
provides electronic mailing lists and Usenet newsgroups to the public.
Since July 1990 the BioSci Arabidopsis Genome group has been a forum
for
scientists around the world to share questions, announcements,
protocols, job postings, and data. All messages posted to the
Arabidopsis BioSci group have been archived and are provided as a WAIS
query database. As of August 1993, over 1300 messages had been posted
to this group. The WAIS index of the archive is updated daily.


Arabidopsis: Compleat Guide

The Compleat Guide, a collection of protocols on how to work with
Arabidopsis, was compiled by Caroline Dean and David Flanders. The
guide is available as a Macintosh self-extracting archive (SEA)
containing documents in WriteNow 3.0 format with images in GIF and
PICT formats.  The documents are also provided separately as text
files, the figures as the original GIF and PICT files, and a WAIS
query index is available as well.


Nottingham Arabidopsis Stock Centre Catalog

The Nottingham Arabidopsis Stock Centre (NASC) at the University of
Nottingham in England provides Arabidopsis seed to Europe, Africa and
Asia. The catalog for the Stock Centre is provided as Macintosh Word
documents, as text files, and as a WAIS query index. This information is
updated by Mary Anderson and contains the most recent information
available from the Nottingham Centre. Users may wish to query the WAIS
index using the term 'About*' in order to see discussions on a range
of topics, from how to grow Arabidopsis to the origin of the
Landsberg and Columbia lines.


Arabidopsis Information Service

The Arabidopsis Information Service (AIS), edited and published by A.
R. Kranz, has recently been made available by the AAtDB Project. All
volumes of AIS have been converted into an electronic form. The text
and figure legends have been transcribed into text documents and the
tables and figures have been scanned into GIF formatted image files.
The entire collection, covering all 25 years (27 volumes), is
available as a WAIS query index.


Arabidopsis cDNA Sequences in dbEST

The National Center for Biotechnology Information (NCBI) provides a
database of cDNA sequences. As part of this service the NCBI produces
reports on each sequence containing information provided by the
submitting laboratory. Each sequence is also searched against motif
pattern, nucleotide, and peptide sequence databases, and the results
are included in the report. Arabidopsis ESTs are associated with known
proteins or activities in this manner. This information is easy to
explore since the Research Companion offers a WAIS query index of the
dbEST reports. Thus you can search for kinase, homeo* or whatever you
are interested in exploring. Note that many of the clones used to
obtain the cDNA sequences are available from the ABRC at Ohio State.
The reports are periodically updated. The latest update was received
on September 14, 1993 and contains 2320 sequences.


Using the AAtDB Research Companion

The AAtDB Research Companion provides all of the information in the
Unix workstation version of AAtDB in a WAIS indexed form. However, the
graphics displays built into the ACEDB software are not available via
the Gopher clients. These include the genetic and physical map
displays, filter grid displays and sequence feature displays. The
Gopher client also does not provide the sequence analysis and complex
query features provided by the ACEDB software. Nonetheless, a very
large amount of information can be searched for and retrieved via the
Gopher client.

The best strategy is to start with a word or two that describes the
specific topic you are interested in exploring. For example,
homeobox, kinase, tt4, and Boston are individual search words, each of
which will return all objects in the database that contain it. The
'tt4' query will find objects that are members of the locus, paper,
chromosome, sequence, strain, 2_point_data, population, clone,
gene_class, and probe classes of the AAtDB database. Each object can
be independently viewed. If the query results in zero or too few hits,
you should try a different word or use the asterisk ('*') wildcard
character. The wildcard can only be added to the end of a word (this
is a current limitation of the WAIS software). Thus in the examples
above it might be better to search for 'homeo*' instead of homeobox.
With 'homeo*' you will match objects containing any of the following
words: homeo, homeobox, homeotic, homeodomain, and any other word
beginning with 'homeo'.
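
The trailing-wildcard rule described above amounts to a prefix match.
The following is a minimal sketch of that behavior, not the WAIS code
itself; the function name matches and the word list are hypothetical.

```python
def matches(term, word):
    """Return True if word satisfies the search term.

    A term ending in '*' matches any word that begins with the text
    before the asterisk; any other term must match the whole word.
    Matching ignores case.
    """
    term, word = term.lower(), word.lower()
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term

# Hypothetical words drawn from indexed objects.
words = ["homeo", "homeobox", "homeotic", "homeodomain", "kinase"]

print([w for w in words if matches("homeo*", w)])
print([w for w in words if matches("homeobox", w)])
```

The sketch only interprets an asterisk at the end of a term, which
mirrors the limitation noted above.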

If too many objects are found matching the query you can use the
search modifiers 'and' and 'not'. The WAIS software assumes there is an
'or' between all words in a query unless you explicitly state an 'and'
or 'not'. Thus if you query for 'gene expression' you will receive a
list of objects that contain either the word expression or the word
gene. This is probably not what was intended by this search. To limit
the result to objects that contain both words, query for 'gene and
expression'. If you really want to search for the literal phrase 'gene
expression' then you should surround the query with double quotes as
in '"gene expression"'.
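
The effect of the implied 'or', the explicit 'and' and 'not'
modifiers, and a quoted phrase can be sketched as set operations on
the matching objects. This is an illustration under simplifying
assumptions (strict left-to-right evaluation, no wildcards), not a
description of how the WAIS software parses queries. The function
names and the four sample objects are hypothetical.

```python
import re

def words_of(text):
    """Split text into lowercased words."""
    return re.findall(r"[a-z0-9_]+", text.lower())

def term_matches(term, text):
    """A bare term must occur as a word; a quoted term must occur as a
    run of consecutive words."""
    words, target = words_of(text), words_of(term)
    if not target:
        return False
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def search(query, documents):
    """Evaluate a query left to right, assuming 'or' between terms
    unless 'and' or 'not' is given explicitly."""
    tokens = re.findall(r'"[^"]+"|\S+', query)
    result, operator = set(), "or"
    for token in tokens:
        if token.lower() in ("and", "not", "or"):
            operator = token.lower()
            continue
        hits = {name for name, text in documents.items()
                if term_matches(token, text)}
        if operator == "and":
            result &= hits
        elif operator == "not":
            result -= hits
        else:
            result |= hits
        operator = "or"   # the default applies again after each term
    return result

# Hypothetical objects, for illustration only.
docs = {
    "a": "gene expression in roots",
    "b": "expression of a kinase gene",
    "c": "gene mapping",
    "d": "kinase expression",
}

print(sorted(search("gene expression", docs)))      # either word
print(sorted(search("gene and expression", docs)))  # both words
print(sorted(search('"gene expression"', docs)))    # literal phrase
```

With these sample objects the first query matches all four, the second
only the two that contain both words, and the third only the one in
which the two words are adjacent.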

Some queries are not very useful. For example, since all information
in the database is about Arabidopsis, we have removed 'Arabidopsis' as
a keyword. Other examples are 'clone' or 'paper' by themselves since
there are over 14,000 objects in the Clone class of AAtDB and over
3,000 objects in the Paper class. The WAIS software will only return a
default maximum of 256 matching objects. If you want to see a larger
number of matches, you must add a greater-than symbol ('>') and the
number of hits you wish to see to the end of the query. For example,
'gene >500' will return up to 500 hits instead of the default 256.
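
The '>' suffix can be thought of as a separate part of the query that
is split off before the search terms are interpreted. The sketch below
shows one way to do that; the function name split_limit is
hypothetical, and the server's actual handling may differ.

```python
import re

DEFAULT_LIMIT = 256  # default maximum number of hits described above

def split_limit(query):
    """Separate a trailing '>N' from a query.

    Returns the search terms and the maximum number of hits to return.
    """
    match = re.search(r"\s*>\s*(\d+)\s*$", query)
    if match:
        return query[:match.start()], int(match.group(1))
    return query.strip(), DEFAULT_LIMIT

print(split_limit("gene >500"))
print(split_limit("kinase"))
```

A caller would then run the search on the returned terms and keep at
most the returned number of hits.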


Obtaining the Internet Gopher client software

A


More information about the Arab-gen mailing list