Sequence Database Management

Stephen R. Lasky srlasky at
Sun Nov 1 18:43:38 EST 1998

Jason Barbour wrote:

> Does anyone know of a good sequence database management
> system?
> We have looked at Biolims (ABI) but think that it is more
> than we need.
> We use DNASTAR, ClustalW and the Phylip suite for analysis
> purposes.
> Jason Barbour
> Gladstone Institute of Virology and Immunology

We are beta testing a new database designed specifically as a chromat
server for organizing small to large sequencing projects.  The group
putting it together is called Geospiza, and you can see their site at

First the usual disclaimers:  I am not associated with Geospiza other
than as a client and beta tester.  I don't get anything for spreading
the word that they are around.

What Geospiza is trying to do is set up a turnkey operation for
storing, viewing, analyzing, and assessing (QC/QA) chromats from the
PE/ABD sequencers.  Turnkey in that they supply the hardware and
software for your system, and you turn it on and run it through a web
interface (any recent browser will work).  It runs on inexpensive Intel
boxes with Linux as the OS.  It uses phred and phrap to call and align

One of the things I like about it (I run a sequencing group that does
not only genomic sequencing projects, but also EST and any other
sequencing that the people in the Hood lab need) is that the system has
several levels of organization for the files, from cost centers down to
clones in a library.

It also gives output organized by library, including the percentages of
E. coli, vector, and short-insert reads (failure rates).  Individual
phred Q values are graphed for each sequencing lane, so you can see how
good the data looks.  The system also generates an image of each chromat
that can be viewed in a browser window, which is nice.
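
To make those numbers concrete, here is a minimal Python sketch.  It is
not Geospiza's code.  The function names and the category labels
("ok", "ecoli", "vector", "short_insert") are made up for illustration.
The only firm fact used is the standard phred definition, where a
quality value Q corresponds to an estimated error probability of
10^(-Q/10).

```python
from collections import Counter


def phred_error_probability(q):
    """Convert a phred quality value Q to an estimated error probability.

    Uses the standard phred definition: P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


def library_failure_rates(read_labels):
    """Return the percentage of reads in each category for one library.

    read_labels is a list with one label per read.  The labels are
    hypothetical, e.g. "ok", "ecoli", "vector", "short_insert".
    """
    total = len(read_labels)
    if total == 0:
        return {}
    counts = Counter(read_labels)
    return {label: 100.0 * n / total for label, n in counts.items()}
```

For example, a base called at Q20 has an estimated 1 in 100 chance of
being wrong, and Q30 is 1 in 1000.  Feeding library_failure_rates() the
per-read labels for one library gives the kind of percentage breakdown
described above.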

All the data is stored in an RDBMS in tar.gzip format, so you save a
lot of space (given that each chromat is now a file of about 190K).
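
I don't know how Geospiza packs the files internally, so the following
is only a sketch of the general idea, using Python's standard tarfile
module.  The function names are made up, and the resulting bytes are
assumed to end up in a binary column of whatever database is used.

```python
import io
import tarfile


def pack_chromats(files):
    """Pack {filename: bytes} into one gzip-compressed tar archive.

    Returns the archive as bytes, suitable for storing as a binary
    field (an assumption; the real storage layout is not known).
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def unpack_chromats(blob):
    """Reverse pack_chromats(): return {filename: bytes}."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }
```

How much space this saves depends on the data.  Real chromat files will
not compress as well as a file of repeated bytes, so treat any ratio
from a toy test as an upper bound.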

There are a bunch of other features that are being implemented while we
are beta testing it, such as sample sheet generators that can be
customized, different report formats, and different export formats.
They are also interfacing the chromat server with a BLAST server so that
contigs can be searched rapidly and automatically (in batch mode)
against different databases to facilitate gene or motif identification.

Like I said, I am beta testing their chromat server.  I am doing my best
to break it by overwhelming the database with chromat files and users.
So far I am pretty happy with the system.  I am very happy with the
responsiveness of the Geospiza people (one of whom has a Ph.D. in
medicinal pharm and was a sequencer himself at one time) in fixing
problems and expanding the feature list.  If you want to know anything
more about this system, you can get in touch with Geospiza through their
web site.


Stephen R. Lasky, Ph.D.
University of Washington
Dept. of Molecular Biotechnology
srlasky at

More information about the Autoseq mailing list