access to genbank
Reinhard Doelz
doelz
Mon Dec 5 03:55:12 EST 1994
[ Article crossposted from bionet.molbio.genbank ]
[ Posted on Mon, 5 Dec 1994 08:53:30 GMT ]
. (sreflex at usis.com) wrote:
: how do i access genbank or any other programs for genetics?
There are three distinct disciplines, which are basically
'retrieval' => search in the sequence annotation
'searching' => search in the sequences themselves
'manipulation' => correlate the results obtained above, and manipulate
your own sequence data
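As a rough illustration of how the three disciplines differ, here is a
minimal Python sketch. Everything in it is an assumption made for the
example: the records, their identifiers and their sequences are invented,
and an exact substring match stands in for the similarity searches that
real search programs perform.

```python
# Toy records: identifiers, descriptions and sequences are invented.
RECORDS = [
    {"id": "TOY0001", "description": "human insulin precursor mRNA",
     "sequence": "ATGGCCCTGTGGATGCGC"},
    {"id": "TOY0002", "description": "mouse hemoglobin alpha chain",
     "sequence": "ATGGTGCTCTCTGGGGAA"},
    {"id": "TOY0003", "description": "yeast insulin-degrading enzyme homolog",
     "sequence": "ATGCCCTGTGGAAATTTT"},
]


def retrieve(records, keyword):
    """Retrieval: match a keyword against the annotation only."""
    keyword = keyword.lower()
    return [r["id"] for r in records if keyword in r["description"].lower()]


def search(records, motif):
    """Searching: match a motif against the sequence only.

    Exact substring matching is a stand-in for a real similarity search.
    """
    motif = motif.upper()
    return [r["id"] for r in records if motif in r["sequence"]]


def manipulate(retrieved, searched):
    """Manipulation: correlate both result sets (here, their intersection)."""
    return sorted(set(retrieved) & set(searched))


by_annotation = retrieve(RECORDS, "insulin")   # ['TOY0001', 'TOY0003']
by_sequence = search(RECORDS, "TGGATG")        # ['TOY0001']
print(manipulate(by_annotation, by_sequence))  # ['TOY0001']
```

The point is only that retrieval never looks at the sequence, searching
never looks at the annotation, and manipulation starts where both end.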
The methods which are available for the three disciplines vary and are
located in three 'worlds'.
Desktop => PC or Mac, nice Graphical User Interface, problems with large
data sets and heavy computations
Mainframe => Some central institution at your site, mostly with a local
support person, usually OK in CPU and data resources,
focus on command-line or X-Windows interface
Network => The 'free' servers on the net deliver mostly retrieval and
searching, but are generally weak on trivial analysis and
manipulation. Data from the net are occasionally difficult
to integrate into local (Desktop or Mainframe) environments.
Your options for Desktop products are to go fancy and commercial (for
its price, a good solution for well-equipped small sites), or public
domain (there is the BIOCAT catalogue, for example, see
http://www.ebi.ac.uk/biocat/biocat.html on the World-Wide Web system),
but you will have to install many of the options to get a reasonably
complete set. In particular, data updates for commercial products
might be a financial problem.
Mainframe installations are the optimal solution for medium-sized and
larger sites, as central staff relieve you of the burden of getting
everything installed and running. In Europe, the European Molecular
Biology Network (EMBnet, see
http://www.embnet.unibas.ch/embnet.news/vol1_1/item_9.html for
a listing as of last summer) offers services on a national basis for
its scientists. On Mainframes, you typically install a 'package'
and add utilities to it; the major commercial products are the widely
used 'GCG package' (Genetics Computer Group, Inc.) and 'IG Suite'
(IntelliGenetics, Inc.), and there is a public domain alternative
('STADEN' by Roger Staden, Cambridge). Obviously, you cannot expect to
get the commercial products on FTP servers, but help is available by
eMail if you are interested in purchases.
Network installations are huge in number (see Keith's list at
http://golgi.harvard.edu/biopages.html) and grow constantly; therefore
it might be difficult for you to follow the developments described
in the bionet.software.www newsgroup (an on-line WWW archive is maintained
at http://www.ch.embnet.org/bio-www/info.html). There are currently four
main methods to access network resources:
WWW "World Wide Web" is the famous network access protocol of today. Based
on the idea that you can link texts by attaching 'links' to individual
words, paragraphs, images etc., the appearance to the user is quite
appealing. Best run with 'graphic' browsers on Workstations or PC/Mac
connected to the internet, but 'text-mode' browsers are available
on many mainframe installations. The WWW allows very sophisticated
searches of the annotation of databases. To give an example, there
are many servers now on the network which offer the Sequence Retrieval
System (SRS) by T. Etzold on WWW (check out the current global status
at http://www.ch.embnet.org/srs/index.html). There are too many others
to mention; see Keith's list as mentioned above for details.
GOPHER is still a workhorse for many and is currently supported by
some sites. Although similar to WWW, GOPHER is much less demanding in
network resources and offers trivial search facilities as well, but
mostly lacks the ability to handle multi-parameter input.
ELECTRONIC MAIL is useful if this is your only access to electronic
networks. There are many services which are currently available only
by electronic mail. Amos Bairoch has produced a list of
servers; the file is called 'serv_ema.txt' and is available at
ftp://nic.switch.ch:/mirror/embnet-ch/info/MORE_INFO_ON_VARIOUS_ISSUES
along with information on lots of other issues.
SPECIAL-PURPOSE protocols are gaining importance, as a dedicated question
might require more sophisticated information exchange than the
previously listed approaches allow. The US leader in the field is the
NCBI with a special service for BLAST and ENTREZ, the former being for
sequence searching, the latter for retrieval (also referred to as
'network blast' and 'netentrez'). More information for WWW users is at
http://www.ncbi.nlm.nih.gov/ . Our site has contributed the Hierarchical
Access System for Sequence Libraries in Europe (HASSLE), which is
a new protocol for various services and has built-in fault tolerance
and resource discovery. You might want to look up the following URL:
http://beta.embnet.unibas.ch/basel/science/info.html (overview).
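To make the remark about GOPHER's single-string searches more concrete,
here is a small Python sketch of the client side of the Gopher protocol
as described in RFC 1436: a request is one selector line, a search adds
exactly one query string after a TAB, and a menu comes back as
TAB-separated lines. The host name gopher.example.org, the selectors and
the menu text are hypothetical, and no network connection is made.

```python
from typing import NamedTuple, Optional


class GopherItem(NamedTuple):
    item_type: str  # '0' = text file, '1' = menu, '7' = search server
    display: str
    selector: str
    host: str
    port: int


def parse_menu(text: str) -> list:
    """Parse a Gopher menu: type+display, selector, host, port per line."""
    items = []
    for line in text.splitlines():
        if line == ".":  # a lone dot terminates the menu
            break
        fields = line.split("\t")
        if len(fields) < 4 or not fields[0]:
            continue  # skip lines that do not look like menu items
        items.append(GopherItem(fields[0][0], fields[0][1:],
                                fields[1], fields[2], int(fields[3])))
    return items


def build_request(selector: str, query: Optional[str] = None) -> bytes:
    """Build the request line; a search carries one query string only."""
    line = selector if query is None else selector + "\t" + query
    return (line + "\r\n").encode("latin-1")


# Hypothetical menu, as a server might return it.
SAMPLE_MENU = (
    "0About this server\t/about.txt\tgopher.example.org\t70\r\n"
    "1Sequence databases\t/databases\tgopher.example.org\t70\r\n"
    "7Search annotation by keyword\t/search\tgopher.example.org\t70\r\n"
    ".\r\n"
)

menu = parse_menu(SAMPLE_MENU)
search_item = next(item for item in menu if item.item_type == "7")
print(build_request(search_item.selector, "insulin"))
```

Since the request has room for only one free-text string, anything like a
multi-field query form has to be squeezed into that string, which is the
limitation noted in the GOPHER entry.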
Maybe this helps
Regards
Reinhard
--
R.Doelz Klingelbergstr.70| Tel. x41 61 267 2247 Fax x41 61 267 2078|
Biocomputing CH 4056 Basel| electronic Mail doelz at ubaclu.unibas.ch|
Biozentrum der Universitaet Basel|-------------- Switzerland ---------------|
<a href=http://beta.embnet.unibas.ch/>EMBnet Switzerland:info at ch.embnet.org</a>