GenBank Typography

David Kristofferson kristoff at
Fri Oct 25 21:32:40 EST 1991

> 3.  If a G1 who is experienced with neither UNIX nor C can
>     generate a list of typos in about 1/2 a day using
>     a 1 page C program and standard UNIX utilities, why can't the
>     databases do this on a regular basis?

I will be out of the office for the next week but, before I leave, I
want to make a general comment about the above and notes with similar
negative connotations.

The technical issue raised is in LANL's domain and I will leave
technical comments to them.  However, I want to make a personal
observation so that maybe some of you might get a flavor for what this
work can be like before you pop off any more.

In undergraduate school I was the most precise student in the entire
analytical chemistry lab class at UC San Diego (we are talking about a
few hundred students here) with every single one of my determinations
coming in to within less than a percent of the target (this is not
exaggeration).  Those who have worked with me in the lab later in life
can also attest to the care that I have put into my work.  Running a
database and a timesharing service puts a whole different light on
things, however.  You have people calling you all the time with
problems, many of them things that happened years ago which you had no
responsibility for, but you have to try to satisfy them.  You have
scores of electronic mail messages coming in from around the world.
You have deadlines to process massive amounts of data without making
mistakes because, if you make *any*, someone is bound to call you on
it at some time and rejoice in the kill.  You work overtime and try
your best to satisfy people's demands and try hard to grin and bear
the usual slings and arrows without complaint.  All during this you
wonder what ever happened to the perfectionist in you during the rush
to keep up with the work.  This description applies not only to
myself.  Many aspects are shared by a lot of the other GenBank staff.

To illustrate the thanks that one receives, allow me to relate a story
about another GenBank staff member.  While conversing with a Famous
Scientist about some *important scientific matter* and having the
audacity to express an opinion, this person was told "You are not a
biologist; you are a servant of biology!"  I was incredulous when I
heard about this incident!  Fortunately for the Famous Scientist the
person in his "presence" had more manners than I would have.

I can guarantee all of you that there are concerned and committed
scientists at both Los Alamos and IntelliGenetics who work hard and
*try* to do a good job (but unfortunately we do not always succeed).
YOU SHOULD ALL REMEMBER that this project was significantly
underfunded and the amount of work underestimated during its first
five years (i.e., during the first GenBank contract).  This is not to
criticize anyone then; hindsight is always 20-20.  Consequently, an
awful lot of problems developed during the data onslaught that took
tremendous work to resolve later on.  Many problems are still
unresolved.  I agree with all of you that it is not difficult to find
problems in the database, and I am sure that these problems can all be
presented in public with scorn and comments about "sloppy
workmanship," "why aren't they doing this obviously simple thing," "no
wonder they lost the contract," etc.  As Paul Gilna mentioned earlier,
the staff is trying to deal with difficulties on several fronts and
has made *considerable* progress on many of these.

NCBI was formed, in my personal opinion, in response to the
problems that developed earlier on.  They will be taking over this
work in the not-too-distant future (about a year from now).  A very
determined effort has been made to get high quality people involved in
the effort there.  They have studied the past and will hopefully learn
from the errors made.  They also have the chance to begin at least
partially anew and to take advantage of some new ideas.

However, when the full spotlight is on them, it is inevitable that
some mistakes will be made and that some new ideas will not pan out.
The projected increase in the data flow makes each mistake more
difficult to recover from.  

Will the scientific community have grown any more understanding in the
interim?  I hope so, but fear that it may be "deja vu all over again."

Perhaps we need more "finishing school" graduates and fewer Ph.D.'s 8-).

I'm outta here .... see you all back on the 5th for more tea and cookies 8-)!!


			Dave (I blush to admit to a Ph.D.) Kristofferson
			GenBank Manager

			kristoff at
