GenBank Curator Program

Paul Gilna pgil%histone at LANL.GOV
Thu Aug 9 18:09:15 EST 1990

Over the past two years, the primary thrust of the GenBank project has
been to improve the timeliness and completeness of the database.
Endeavours such as the interaction with journals, sequence submission
policies, and new submission software tools have brought us to the
point where we now receive 80% of our data in electronic form
directly from the scientific community and where our average turnaround
is now measured in weeks rather than months.  This progress in
soliciting direct and automated data submission, and in the RDBMS
conversion, now frees us to deal in greater detail with one of the most
important components of the database, the biology represented within
the annotation. In addition to our work to enrich the quality of the
annotation using our own annotation resources, we now wish to seek the
direct involvement of the members of the scientific community.

The following announcement represents the beginning of a program to
help us enhance the quality and integrity of the data represented in the
GenBank database.

This announcement will be distributed only via e-mail for the pilot phase;
however, recipients are free to redistribute this notice. This notice is
being posted to both the GENBANK-BB and BIONEWS bulletin boards and we
apologize in advance for any redundancy across the two newsgroups.

Paul Gilna
GenBank Biology Domain Leader
Los Alamos National Laboratory
Los Alamos, NM 87545

pgil%histone at

Tel: (505) 665-2177
Fax: (505) 665-3493


	GenBank announces the pilot phase of the GenBank Curator
	Program. We are seeking suggestions for work to be done on the
	database in the form of informal proposals.  Authors of
	successful proposals will travel to Los Alamos and work with
	the annotation or computation staff to carry out their proposed

	Although GenBank has had some curators in the past, the advent
	of the GenBank RDBMS restructuring and its attendant interface,
	the Annotator's Workbench, allows us to implement an expanded
	program using a unified, intuitive annotation tool that
	provides the capability of remote use.

	The current program seeks to identify domains within the
	database that are in need of overhaul either at the sequence or
	at the annotation level. In addition, as part of ongoing
	development of the Sequence Validation Suite (SVS), a suite of
	software programs that will be used to check the validity of
	submitted sequence and annotation data, we have expanded the
	program to include software development associated with the

	We are looking to the readership of the molecular
	biology-oriented Bulletin Boards for proposals for curation on
	GenBank; if you are familiar with a domain or family of
	sequences represented within the database and with the existing
	annotation, and have some ideas on how the annotation could be
	improved (for example to reflect similarities in features
	across entries, to improve existing nomenclature, or to point
	out sequence merges), or on software that could be developed to
	aid data integrity and validation, then we would like to hear
	from you.

	In this pilot study, about six proposals will be selected to be
	implemented before the end of September, 1990. Based on the
	results of the study, we hope to take on about 30 more
	projects over the course of the next two years. The capability
	exists for continued interaction with the data bank staff on a
	consultant basis, using remote access facilities to the
	annotation software. The work will be carried out on site at
	Los Alamos. Travel (within the US for the pilot study), hotel
	costs, and subsistence will be covered. Project proposals will
	be reviewed by GenBank and NIH staff. Proposals should be
	submitted to Dr. Paul Gilna via e-mail (pgil%histone at
	and should cover the following topics:

	o       Detailed description of work proposed, citing examples from
		the database, where relevant, and of the scope of the
		proposed work

	o       Justification of work in terms of benefit to community
		and data bank

	o       Estimation of time needed to conduct work at LANL

	o       Abbreviated CV including representative publications

