Fri Oct 18 10:02:07 EST 1991

Recently Tom Schneider posted a heated message in which he severely
criticized the performance of the GenBank contractor and strongly
implied that for this reason the contract was being "taken away" from
them.  The first part is clearly an opinion that Tom is entitled to
express; the second is factually incorrect.

With respect to the performance of the GenBank contractor, I think it
is important to look at how far we have come with this essential
database.  Five years ago the database was fourteen months behind the
literature and the gap was growing.  As a stopgap measure, the GenBank
contractor entered unannotated sequences.  Although this division was
meant to be a temporary parking place, the division continued to grow.
In addition, access to GenBank data was difficult at best.

Now the situation has been reversed.  Thanks to the efforts of the
GenBank staff, 85% of all sequences are submitted in advance of
publication.  In those cases where the author has not asked that the
sequence data be held until publication, they are generally available
when the journal hits the streets.  The rest are available within a
very few weeks.  When we wrote the original statement of work for the
GenBank contract, we felt that if the contractor could come within
three months of the literature, we would be in seventh heaven.

These advances would have little impact were it not for improved
distribution.  The database is now updated daily and the sequences are
available by e-mail server.  In addition, the on-line service now
features IRX, which makes it possible to find entries when you are not
really sure what you are looking for.  E-mail servers are now
available for FASTA and BLAST searches against the entire database.  I
am not sure whether the impact of the latter has really reached the
scientific community, but one incident will illustrate the importance
of this service.  I was at a site visit for a grant application
dealing with structural biology, and the PI indicated that his group
needed a new microvax because the current one was tied up by
researchers doing searches against GenBank.  He indicated that these
searches took overnight.  I told him this was stupid because he could
use the e-mail server and get the results in a few minutes from
GenBank.  Thus GenBank could give him faster service and he did not
have to tie up his own machine's CPU.  In fact the e-mail servers
work so fast that if you have good e-mail service at your institution,
they are virtually interactive.  This means you do not have to tie up
disk space storing GenBank.

In addition, other developments, such as an RDBMS for storing the data
and the author entry software, AUTHORIN, have been significant
achievements.

As part of the GenBank contract, the GenBank advisors critically
evaluate the performance of the GenBank contractor on an annual basis.
I think it is fair to say that the last couple of meetings have seen
them bubbling with enthusiasm over the performance of the contractor.

Whether GenBank should be concentrating its effort on presenting
merged views of the data is an arguable point.  I think most would
agree that the focus should not be on this task if it comes at the
expense of currency and completeness.  How much data from different
laboratories should be merged, and how much bias this introduces into
the database, are questions on which many reasonable scientists would
disagree.

On the subject of the future of GenBank, it is very important to set
the record straight.  GenBank is currently funded by a contract
awarded by the National Institute of General Medical Sciences (NIGMS)
to Intelligenetics, Inc. with a subcontract to Los Alamos National
Laboratory (LANL).  In fact NIGMS owns the GenBank trademark.  NIGMS
is in the business of funding research grants, not contracts for
resources.
We got into this business because in the early eighties no one else
would.  Thus if our grantees were to be served, we had to do it
ourselves.  Elke Jordan, who did a lot of the groundwork, and Christie
Carrico, the first Project Officer at NIGMS, deserve a lot of credit
for beginning this effort.  The original contractor, BBN, and LANL
under the direction of Walter Goad, similarly deserve a lot of credit
for working under extremely difficult conditions.

When Don Lindberg became director, the National Library of Medicine
moved rapidly into the computer age and took a direct interest in
biotechnology databases.  This move was fostered by a workshop
sponsored by the National Academy of Sciences that led directly to
the Pepper Bill that created the National Center for Biotechnology
Information (NCBI).  In fact I was a member of the Academy committee
that set up the workshop and remember several late nights pounding out
a draft of parts of that report on my Commodore 64.

Subsequently, we at NIGMS formed a close working relationship with the
Lister Hill Center, which is part of NLM.  I remember vividly meeting
Dennis Bensen, then at Lister Hill, at a course.  He told me about
software they were developing to link databases and I told him about
GenBank.  Within two weeks he had GenBank running under IRX.  This
turned out to be a fruitful collaboration for a number of reasons.
First, we had access to computer expertise, a subject lacking in my
background.  Second, they provided software, such as IRX and BLAST,
which has been extremely important to the biological community.  Third,
they provided additional funds which have made it possible for GenBank
to expand into other areas, such as the enhanced on-line service.

When NCBI was formed and David Lipman became its director, this
collaboration and friendship continued.  Thus when we began to
discuss the next generation of GenBank, it seemed natural to me to
turn over responsibility to a group who were recognized
experts in the field.  NCBI has made a conscious decision to take
advantage of the resources of the library to do much of the work
in-house.  However, they will also take advantage of the unique
capabilities of LANL with respect to author-entered data.  The
builders of the next generation of GenBank will greatly benefit from
the experiences of the current generation and from having time to
work free of the pressure of putting out releases every quarter.
In any case I think the folks at NCBI will have their hands
full matching the excellent record for customer service rendered by
the current contractor, Intelligenetics.

I hope these comments have served to set the record straight.

Jim Cassatt
GenBank Project Officer
