BCM Annotation server

Reinhard Doelz doelz
Thu Sep 15 01:31:39 EST 1994


[ Article crossposted from bionet.molbio.genbank ]
[ Author was Brian Foley ]
[ Posted on Wed, 14 Sep 1994 20:58:22 GMT ]

	I have added a few annotations to entries for elongation factor
2 genes/cDNAs, and I have gone back to view these entries after annotation.
I have also called up these entries with NCBI NENTREZ.  The annotations
appear when I call up the entries from within BCM, but not when they
are called up directly from NCBI.

	1) Will the annotation "post-it notes" become a part of the
NCBI ENTREZ system?  Or will they just be tacked on top of ENTREZ
entries stored at BCM?  If they are added to ENTREZ at NCBI, what will 
the lag time be between adding annotations, and their entry into the
ENTREZ system?

	2) I really like the idea of being able to add annotations to
GenBank entries, but I very much dislike using a WWW browser to do so.
I used to work at GenBank as a sequence annotator, and even back in the
days before the "Annotator's Workbench" was developed we had a few very
important tools at hand to help us do accurate and consistent annotation.
The WWW server just gives me a window to type some text in.  Not even a 
spell checker is provided.  I can see the ENTREZ entry below if I scroll
down there, but if I position my cursor over the A of an ATG start codon
I want to annotate, it does not tell me the base number; I have to count
it out by hand, which is error-prone.
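
The position lookup described above could be sketched roughly as follows. This is only an illustration, assuming the sequence is available as plain text containing bases and whitespace (no line-number columns); the function name `find_codon_positions` is hypothetical and not part of the BCM server or ENTREZ.

```python
def find_codon_positions(sequence, codon="ATG"):
    """Return the 1-based base numbers at which `codon` starts.

    Whitespace in `sequence` is ignored, so a sequence pasted in
    blocks of ten bases can be passed in directly.
    """
    seq = "".join(sequence.split()).upper()
    codon = codon.upper()
    positions = []
    start = seq.find(codon)
    while start != -1:
        # str.find is 0-based; sequence coordinates are 1-based.
        positions.append(start + 1)
        # Advance by one base so overlapping matches are not missed.
        start = seq.find(codon, start + 1)
    return positions


if __name__ == "__main__":
    print(find_codon_positions("ccatgg atgaaatg"))
```

A tool like this reports every ATG, so the annotator still has to decide which one is the real start codon.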
	There is a window to type in "keywords" but no way for me to check
a list of keywords to keep things consistent (I might use "EF-2" and another
person might use "EF2" or "ef2" or "elongation factor").
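
A keyword check of the kind asked for here could look something like the sketch below. The vocabulary table and the names `CONTROLLED_KEYWORDS` and `normalize_keyword` are assumptions made up for illustration; they do not come from GenBank or the BCM server.

```python
import re

# Hypothetical controlled vocabulary: canonical keyword -> accepted variants.
CONTROLLED_KEYWORDS = {
    "EF-2": ["EF-2", "EF2", "elongation factor 2"],
}


def _key(text):
    """Lowercase the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


# Map each simplified variant to its canonical keyword.
_LOOKUP = {
    _key(variant): canonical
    for canonical, variants in CONTROLLED_KEYWORDS.items()
    for variant in variants
}


def normalize_keyword(raw):
    """Return the canonical keyword, or None if `raw` is not in the vocabulary."""
    return _LOOKUP.get(_key(raw))


if __name__ == "__main__":
    for entry in ["EF-2", "EF2", "ef2", "elongation factor"]:
        print(repr(entry), "->", normalize_keyword(entry))
```

In this sketch "elongation factor" is deliberately left unmatched, since it does not say which factor is meant; a form could flag such entries for the submitter to correct.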

	The idea of an "annotation server" is great.  But I would be a lot
better off if I had the ability to use the "Annotator's Workbench" or AUTHORIN
or even Microsoft Word (at least it has a spell-checker) to spend some time
making up a really nice entry and checking the results before it gets set
in stone in a world-accessible database.
	
	3) Perhaps the "Links to WWW URLs" is the best feature of the
BCM service.  I find free-form comments to be of very limited
usefulness.  However, there are sites such as:

On-line Directory of
<a href="http://www.icgeb.trieste.it/p450/">P450-containing Systems</a>,
developed at
<a href="http://www.icgeb.trieste.it">International Centre for Genetic Engineering and Biotechnology</a>,
<a href="http://ale2ts.ts.infn.it:6163/TS/foto.html">Trieste</a>
 
	that provide much more than simple "annotation" of a single
GenBank entry.  Such sites can supply multiple sequence alignments and
links to other databases.

	This is great, but if we read Tom Schneider's "Philosophy"
papers and look in the ASN.1 documentation, we'll see that we should
really be working toward standardization so that if the folks in
Trieste do a multiple sequence alignment of P450 proteins, and I do a 
multiple sequence alignment of EF-2 proteins, the same software can
be used to analyze both sets of data.
	My interest may be in looking at a certain conserved region
of amino acids involved in catalysis, but some evolutionary biologist
may want to use both the P450 and EF-2 data to generate an evolutionary
tree.  

	It is frustrating to wait for the perfect system to come
along, when we want to build something TODAY.  But we should at least
be working toward building the best system we can.  The "Annotator's
Workbench", AUTHORIN, and other great tools are already built, yet
they are not included in the BCM server.  We have ASN.1 definitions
for all sorts of DNA sequence features, but few tools to help us
incorporate them into WWW sites so that our WWW server feeds out
standard objects, instead of free-form text.
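
To make the contrast with free-form text concrete, here is a rough sketch of an annotation held as a structured record. It is not ASN.1 and not any NCBI data model, only a stand-in; the class `FeatureAnnotation`, its fields, and the accession "X00000" are hypothetical placeholders.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FeatureAnnotation:
    """One annotated feature on a sequence entry (illustrative only)."""

    accession: str    # identifier of the annotated entry
    feature_key: str  # e.g. "CDS"
    start: int        # 1-based first base of the feature
    end: int          # 1-based last base of the feature
    note: str = ""    # optional free-text remark

    def __post_init__(self):
        # Reject coordinates that a free-form text box would accept silently.
        if self.start < 1 or self.end < self.start:
            raise ValueError("coordinates must satisfy 1 <= start <= end")

    def location(self):
        """Return the span in the 'start..end' style of feature tables."""
        return f"{self.start}..{self.end}"


if __name__ == "__main__":
    feature = FeatureAnnotation("X00000", "CDS", 3, 2579, note="EF-2")
    print(feature.location())
    print(asdict(feature))
```

Because every record has the same fields, the same program can read annotations for P450 entries and EF-2 entries alike, which is the point of standardizing.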


--
********************************************************************
*  Brian Foley               *     If we knew what we were doing   *
*  Molecular Genetics Dept.  *     it wouldn't be called research  *
*  University of Vermont     *                                     *
********************************************************************
-- 
 R.Doelz         Klingelbergstr.70| Tel. x41 61 267 2247  Fax x41 61 267 2078|
 Biocomputing        CH 4056 Basel| electronic Mail    doelz at ubaclu.unibas.ch|
 Biozentrum der Universitaet Basel|-------------- Switzerland ---------------|
<a href=http://beta.embnet.unibas.ch/>EMBnet Switzerland:info at ch.embnet.org</a> 



