From owner-bio-srs@hgmp.mrc.ac.uk  Thu Apr 19 10:11:27 2001
Return-Path: <owner-bio-srs@hgmp.mrc.ac.uk>
Received: from localhost (localhost [127.0.0.1])
	by mercury.hgmp.mrc.ac.uk (Postfix) with ESMTP id 7AC3C17AEB
	for <bio-srs-outgoing>; Thu, 19 Apr 2001 10:11:26 +0100 (BST)
Received: from localhost (localhost [127.0.0.1])
	by mercury.hgmp.mrc.ac.uk (Postfix) with ESMTP id E06E617B5A
	for <bio-srs-list@hgmp.mrc.ac.uk>; Thu, 19 Apr 2001 10:11:17 +0100 (BST)
Received: by mercury.hgmp.mrc.ac.uk (Postfix, from userid 6015)
	id 36B4A17AEB; Thu, 19 Apr 2001 10:07:59 +0100 (BST)
Received: from localhost (localhost [127.0.0.1])
	by mercury.hgmp.mrc.ac.uk (Postfix) with ESMTP id 7C41617AC6
	for <bionet-software-srs@net.bio.net>; Mon, 16 Apr 2001 12:14:50 +0100 (BST)
Received: from niobium.hgmp.mrc.ac.uk (niobium [193.62.192.41])
	by mercury.hgmp.mrc.ac.uk (Postfix) with ESMTP id 4C4FA17A8B
	for <bionet-software-srs@net.bio.net>; Mon, 16 Apr 2001 12:14:47 +0100 (BST)
Received: (from news@localhost)
	by niobium.hgmp.mrc.ac.uk (8.9.3+Sun/8.8.8) id MAA02248
	for bionet-software-srs@net.bio.net; Mon, 16 Apr 2001 12:14:46 +0100 (BST)
To: bionet-software-srs@net.bio.net
From: Heikki Lehvaslaiho <heikki@ebi.ac.uk>
Newsgroups: bionet.software.srs
Subject: dbfetch - a CGI script for db entry retrieve
Organization: EMBL - EBI
Message-ID: <3ADAD425.68B652B8@ebi.ac.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Trace: niobium.hgmp.mrc.ac.uk 987419686 2241 193.62.199.73 (16 Apr 2001 11:14:46 GMT)
X-Complaints-To: news@net.bio.net
NNTP-Posting-Date: Mon, 16 Apr 2001 11:14:46 +0000 (UTC)
X-Mailer: Mozilla 4.72C-SGI [en] (X11; I; IRIX64 6.5 IP28)
X-Accept-Language: en
Date: Thu, 19 Apr 2001 10:07:59 +0100 (BST)
Sender: owner-bio-srs@hgmp.mrc.ac.uk
Precedence: bulk


			     ANNOUNCEMENT
   ================================================================
			       dbfetch
   ================================================================
    a generic CGI program to retrieve biological database entries
	      in various formats and styles (using SRS)
   ================================================================
     by Heikki Lehvaslaiho <heikki@ebi.ac.uk>

ABSTRACT

The program dbfetch is a Perl CGI.pm-based script to serve database
entries from up-to-date servers. It is an extension of an interface
used by the EBI script emblfetch. Serving raw sequence entries over
HTTP makes it easy to create application programs that access any
sequence by its ID alone.


BACKGROUND

I started to write this program with these specifications in mind:

1. Retrieve biological database entries over the Web based on unique
   entry IDs.

2. Offer a consistent interface that is independent of platform and
   database engine.

3. Easy-to-write URL syntax where the ID is simply appended to the
   end.

4. Serve entries not only in HTML but also in a raw, easy-to-parse
   text-only format.

5. Modular, expandable structure.

Although the underlying database engine used in this script is SRS,
the program can easily be modified to access other indexing systems
(e.g. through the EMBOSS entret program).

This script does NOT (at the moment) offer free text or keyword
searches.

Most importantly, this approach is not dependent on some heavy,
hard-to-maintain technology (such as CORBA). All it needs is an HTTP
connection and a parser for a database ASCII format. These parsers are
now available in various open source projects (bioperl, biopython,
biojava).
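
As a rough illustration of how little a client needs, the sketch below
builds a request URL and pulls the identifiers out of EMBL-style flat
text. The parameter names (db, id, style), the helper names and the
sample entry are assumptions made up for this example; they are not
taken from the dbfetch source, so check the server's own help page for
the real syntax.

```perl
use strict;
use warnings;

# Base URL of the EBI server; the query parameters below are assumed.
my $BASE = 'http://www.ebi.ac.uk/cgi-bin/dbfetch';

# Build a request URL from a database name, an entry ID and a style.
sub dbfetch_url {
    my ($db, $id, $style) = @_;
    $style ||= 'raw';
    # Percent-encode anything outside the URL-safe character set.
    for ($db, $id, $style) {
        s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord $1)/ge;
    }
    return "$BASE?db=$db&id=$id&style=$style";
}

# Extract the entry name and primary accession from EMBL-style text.
sub parse_embl_ids {
    my ($text) = @_;
    my ($id)  = $text =~ /^ID\s+(\S+)/m;
    my ($acc) = $text =~ /^AC\s+([^;\s]+)/m;
    return { id => $id, acc => $acc };
}

# Invented sample entry, used only to exercise the parser offline.
my $sample = join "\n",
    'ID   EXAMPLE1   standard; DNA; UNC; 12 BP.',
    'AC   X00001;',
    'SQ   Sequence 12 BP;',
    '     acgtacgtac gt',
    '//',
    '';

my $ids = parse_embl_ids($sample);
print dbfetch_url('embl', 'J02231'), "\n";
print "id=$ids->{id} acc=$ids->{acc}\n";
```

In a real client the text would come from an HTTP GET on the URL and
would be handed to one of the existing parsers rather than to a pair of
regular expressions.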

USAGE

Casual users need simple ways to access and browse database entries on
the web. The HTML form-based interface caters for these users. 

Increasingly, users of bioinformatics services write small programs to
analyze sequence and other database entries. However, it is difficult
to keep local databases up to date and, in a larger environment, to
make those databases visible to all users. dbfetch makes it easy to
access database entries from anywhere.

As a first step, the BioPerl module Bio::DB::EMBL uses dbfetch to
retrieve data into Bio::Seq objects. The whole process can be written
in three lines of BioPerl code:

  use Bio::DB::EMBL;
  my $embl = Bio::DB::EMBL->new;
  my $seq  = $embl->get_Seq_by_acc('J02231');
  # do whatever is needed with the entry
  print "seqid is ", $seq->id, "\n";

Currently Bio::DB::SwissProt accesses SwissProt entries from the
ExPASy server, and users can point their requests to its mirrors. The
ExPASy script has limitations (it does not serve TrEMBL entries) and
it is not available for other databases. dbfetch tries to overcome
these problems.


EXTENSIBILITY

dbfetch uses local SRS calls to retrieve entries. Each style (html or
raw) is defined in its own subroutine. The details about each database
(name, update database name, ID field names, format) are kept in a
global hash. Another hash stores a regular expression to retrieve a
unique identifier from an entry. These two hashes and the subroutine
building the web page are the only places that need to be touched when
a new database is added. After modification it is advisable to run
dbfetch from the command line to trigger a subroutine that checks the
two hashes for consistency.
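
The layout described above might look roughly like the following
sketch. The hash names, keys, database names and patterns are all
hypothetical stand-ins chosen for illustration and are not copied from
the dbfetch source; only the overall shape (two hashes, one subroutine
per style, and a consistency check) follows the description.

```perl
use strict;
use warnings;

# Hypothetical per-database details: display name, update database,
# ID field names and native format.
my %DBS = (
    embl      => { name => 'EMBL',      update => 'emblnew',
                   fields => [ 'id', 'acc' ], format => 'embl' },
    swissprot => { name => 'SwissProt', update => 'swissnew',
                   fields => [ 'id', 'acc' ], format => 'swiss' },
);

# Hypothetical patterns capturing the unique identifier of an entry.
my %IDS = (
    embl      => qr/^ID\s+(\S+)/m,
    swissprot => qr/^ID\s+(\S+)/m,
);

# One subroutine per output style.
my %STYLES = (
    raw  => sub { my ($entry) = @_; return $entry },
    html => sub {
        my ($entry) = @_;
        # Escape HTML metacharacters, then wrap the entry in <pre>.
        (my $safe = $entry) =~ s/&/&amp;/g;
        $safe =~ s/</&lt;/g;
        $safe =~ s/>/&gt;/g;
        return "<pre>$safe</pre>";
    },
);

# Return a list of problems; an empty list means the hashes agree.
sub check_consistency {
    my ($dbs, $ids) = @_;
    my @problems;
    for my $db (sort keys %$dbs) {
        push @problems, "no ID pattern for database: $db"
            unless exists $ids->{$db};
        for my $key (qw(name fields format)) {
            push @problems, "database $db lacks field $key"
                unless defined $dbs->{$db}{$key};
        }
    }
    for my $db (sort keys %$ids) {
        push @problems, "ID pattern for unknown database: $db"
            unless exists $dbs->{$db};
    }
    return @problems;
}

my @problems = check_consistency(\%DBS, \%IDS);
print @problems ? join("\n", @problems) . "\n"
                : "hashes are consistent\n";
```

Adding a database then means adding one entry to each hash and
rerunning the check before the script goes back behind the web server.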


WHAT YOU COULD DO

Please install dbfetch on your local server and let me know that it is
available, so that the server can be included in the bioperl modules.

Bioperl and related open source projects (e.g. biojava and biopython)
have so far focused on sequence analysis. dbfetch makes it easier than
ever to work with other data types. If you are willing to create, or
already have, suitable code for parsing and creating objects for other
formats, please join in. To start with, the BioPerl project would
welcome classes to store and manipulate literature reference (Medline)
and protein structure (PDB) data, both of which are database entries
served by the current EBI dbfetch script.


AVAILABILITY

The dbfetch script is running at:

	http://www.ebi.ac.uk/cgi-bin/dbfetch

It is available under the Perl Artistic License from the BioPerl
(http://bioperl.org) CVS repository or directly from:

	http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/scripts/DB/dbfetch?cvsroot=bioperl

It will also be included, in due course, in the next release (0.8) of
BioPerl.


From owner-bio-srs@hgmp.mrc.ac.uk  Mon Apr 23 10:21:08 2001
Return-Path: <owner-bio-srs@hgmp.mrc.ac.uk>
Received: from localhost (localhost [127.0.0.1])
	by mercury.hgmp.mrc.ac.uk (Postfix) with ESMTP id 213DE17AC6
	for <bio-srs-outgoing>; Mon, 23 Apr 2001 10:21:05 +0100 (BST)
Received: from localhost (localhost [127.0.0.1])
	by mercury.hgmp.mrc.ac.uk (Postfix) with ESMTP id CA5F917ACE
	for <bio-srs-list@hgmp.mrc.ac.uk>; Mon, 23 Apr 2001 10:21:02 +0100 (BST)
Received: by mercury.hgmp.mrc.ac.uk (Postfix, from userid 6015)
	id 381FD17AC6; Mon, 23 Apr 2001 10:20:54 +0100 (BST)
Received: from localhost (localhost [127.0.0.1])
	by mercury.hgmp.mrc.ac.uk (Postfix) with ESMTP id DA98117A70
	for <bionet-software-srs@net.bio.net>; Fri, 20 Apr 2001 16:25:19 +0100 (BST)
Received: from sprettur.isnet.is (sprettur.isnet.is [193.4.58.19])
	by mercury.hgmp.mrc.ac.uk (Postfix) with ESMTP id 9DF6D17A63
	for <bionet-software-srs@net.bio.net>; Fri, 20 Apr 2001 16:25:13 +0100 (BST)
Received: from guppy.vub.ac.be (guppy.vub.ac.be [134.184.129.2])
	by sprettur.isnet.is (8.11.1/8.11.0/isnet) with ESMTP id f3KFP9T57994
	for <bionet-software-srs@moderators.isc.org>; Fri, 20 Apr 2001 15:25:11 GMT
	(envelope-from news@vub.ac.be)
Received: from snic.vub.ac.be (snic.vub.ac.be [134.184.129.20]) by guppy.vub.ac.be (8.9.1b+Sun/3.17.1.ap (guppy))
        id RAA18653; Fri, 20 Apr 2001 17:24:28 +0200 (MET DST) for <bionet-software-srs@moderators.isc.org>
Received: (news@localhost) by snic.vub.ac.be (8.9.3/%I%.0.ap (snic.test))
        id RAA26440; Fri, 20 Apr 2001 17:25:02 +0200 (MET DST) for bionet-software-srs@moderators.isc.org
To: bionet-software-srs@moderators.isc.org
From: Guy Bottu <gbottu@bigben.vub.ac.be>
Newsgroups: bionet.software.srs
Subject: what about transfac ?
Organization: Belgian EMBnet Node
Message-ID: <9bpkce$n20$1@snic.vub.ac.be>
X-Trace: snic.vub.ac.be 987780302 23616 134.184.15.23 (20 Apr 2001 15:25:02 GMT)
X-Complaints-To: usenet@snic.vub.ac.be
NNTP-Posting-Date: 20 Apr 2001 15:25:02 GMT
User-Agent: tin/1.4.3-20000502 ("Marian") (UNIX) (SunOS/5.7 (sun4u))
Date: Mon, 23 Apr 2001 10:20:54 +0100 (BST)
Sender: owner-bio-srs@hgmp.mrc.ac.uk
Precedence: bulk

	Dear colleagues,

At the moment a number of public SRS servers carry version 4.0 of the
TRANSFAC database. As you might have noticed, there is now a version 5,
but the copyright of TRANSFAC is owned by Biobase, and it seems that not
only must you pay to have a local copy, it is also forbidden to make it
available to third parties. This will have consequences for our SRS
servers. Does anyone have a comment?

	Dr. Guy Bottu


