access to genbank

Reinhard Doelz doelz
Mon Dec 5 03:55:12 EST 1994


[ Article crossposted from bionet.molbio.genbank ]
[ Posted on Mon, 5 Dec 1994 08:53:30 GMT ]

. (sreflex at usis.com) wrote:
: How do I access GenBank or any other programs for genetics?

There are three distinct disciplines, which are basically:

'retrieval'    => search in the sequence annotation 
'searching'    => search in the sequence records (the sequences themselves)
'manipulation' => correlate the results obtained above, and manipulate 
                  your own sequence data 
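
The distinction between the three disciplines can be illustrated with a 
minimal Python sketch. Everything in it is an assumption made for 
illustration: the record IDs, annotations and sequences are invented toy 
data (not GenBank entries), and the function names are hypothetical. Real 
servers use indexed annotation queries and similarity searches rather 
than plain substring matching.

```python
# Toy records: IDs, annotations and sequences are invented for illustration
# and are NOT real GenBank entries.
RECORDS = [
    {"id": "TOY0001", "annotation": "human insulin mRNA, complete cds",
     "sequence": "ATGGCCCTGTGGATGCGC"},
    {"id": "TOY0002", "annotation": "mouse actin gene, exon 1",
     "sequence": "ATGGATGCGCGATATCGC"},
    {"id": "TOY0003", "annotation": "yeast insulin-degrading enzyme homolog",
     "sequence": "TTGACCAAATTTCCGGAA"},
]


def retrieve(records, keyword):
    """'Retrieval': match a keyword against the annotation text."""
    keyword = keyword.lower()
    return [r["id"] for r in records if keyword in r["annotation"].lower()]


def search(records, pattern):
    """'Searching': match a pattern against the sequence itself.

    Exact substring matching stands in for a real similarity search.
    """
    pattern = pattern.upper()
    return [r["id"] for r in records if pattern in r["sequence"]]


def correlate(records, hits_a, hits_b):
    """'Manipulation': correlate two result sets (here, their intersection)."""
    common = set(hits_a) & set(hits_b)
    return [r["id"] for r in records if r["id"] in common]


by_annotation = retrieve(RECORDS, "insulin")
by_sequence = search(RECORDS, "GGATGCGC")
both = correlate(RECORDS, by_annotation, by_sequence)
print(by_annotation, by_sequence, both)
```

Here the annotation query and the sequence query return different sets of 
records, and the 'manipulation' step is what combines them.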

The methods available for the three disciplines vary, and they are 
located in three 'worlds': 

Desktop => PC or Mac, nice Graphical User Interface, problems with large
           data sets and heavy computations 
Mainframe => Some central institution at your site, mostly includes a 
           local support person, usually OK in CPU and data resources, 
           focus on command-line or X-Windows interface 
Network => The 'free' servers on the net deliver mostly retrieval and 
           searching, but are generally weak on trivial analysis and 
           manipulation. Data from the net are occasionally difficult 
           to integrate into local (Desktop or Mainframe) environments.

Your options for Desktop products are to go fancy and commercial (for 
its price, a good solution for well-equipped small sites) or public 
domain (there is the BIOCAT catalogue, for example; see 
http://www.ebi.ac.uk/biocat/biocat.html on the World-Wide Web), 
but you will have to install many of the options to get a reasonably 
complete set. In particular, data updates for commercial products 
might be a financial problem.  

Mainframe installations are the optimal solution for medium-sized and 
larger sites, as central staff take the burden of getting everything
installed and running off you. In Europe, the European Molecular Biology 
Network (EMBnet, see 
http://www.embnet.unibas.ch/embnet.news/vol1_1/item_9.html for 
a listing from last summer) offers services on a national basis to 
its scientists.  On Mainframes, you typically install a 'package' 
and add utilities to it; major commercial products are the widely used 
'GCG package' (Genetics Computer Group, Inc.) and 'IG Suite' 
(IntelliGenetics, Inc.); a public domain alternative is 'STADEN' by 
Roger Staden, Cambridge. Obviously, you cannot expect to get the 
commercial products on FTP servers, but help is available by e-mail if 
you are interested in purchases.

Network installations are huge in number (see Keith's list at 
http://golgi.harvard.edu/biopages.html) and keep growing, so it 
might be difficult for you to follow the developments, which are described 
in the bionet.software.www newsgroup (an on-line WWW archive is maintained
at http://www.ch.embnet.org/bio-www/info.html). There are currently four 
main methods to access network resources: 
WWW "World Wide Web" is the famous network access protocol of today. Based 
      on the idea that you can connect texts by attaching 'links' to 
      individual words, paragraphs, images etc., its appearance to the user 
      is quite appealing. It is best run with 'graphic' browsers on 
      Workstations or PC/Mac connected to the Internet, but 'text-mode' 
      browsers are available on many mainframe installations. The WWW allows 
      very sophisticated searches of the annotation of databases. To give an 
      example, many servers on the network now offer the Sequence Retrieval
      System (SRS) by T. Etzold on WWW (check out the current global status 
      at http://www.ch.embnet.org/srs/index.html). There are too many others
      to mention; see Keith's list, mentioned above, for details.
GOPHER is still a workhorse for many and is currently supported by 
      some sites. Although similar to WWW, GOPHER is much less demanding of 
      network resources and offers simple search facilities as well, but 
      mostly lacks the ability to accept multi-parameter input. 
ELECTRONIC MAIL is useful if it is your only access to electronic 
      networks. Many services are currently available only by 
      electronic mail. Amos Bairoch has produced a list of 
      servers; the file is called 'serv_ema.txt' and is available at 
      ftp://nic.switch.ch:/mirror/embnet-ch/info/MORE_INFO_ON_VARIOUS_ISSUES 
      along with material on many other issues. 
SPECIAL-PURPOSE protocols are gaining importance, as a dedicated question
      might require more sophisticated information exchange than the 
      previously listed approaches allow. The US leader in the field is the 
      NCBI with a special service for BLAST and ENTREZ, the former being for 
      sequence searching, the latter for retrieval (also referred to as 
      'network blast' and 'netentrez'). More information for WWW users is at 
      http://www.ncbi.nlm.nih.gov/ . Our site has contributed the Hierarchical
      Access System for Sequence Libraries in Europe (HASSLE), which is 
      a new protocol for various services and has built-in fault tolerance 
      and resource discovery. You might want to look up the following URL:
      http://beta.embnet.unibas.ch/basel/science/info.html (overview). 

Maybe this helps
Regards
Reinhard 
-- 
 R.Doelz         Klingelbergstr.70| Tel. x41 61 267 2247  Fax x41 61 267 2078|
 Biocomputing        CH 4056 Basel| electronic Mail    doelz at ubaclu.unibas.ch|
 Biozentrum der Universitaet Basel|-------------- Switzerland ---------------|
 EMBnet Switzerland: info at ch.embnet.org (http://beta.embnet.unibas.ch/)




