SRS-FTP gateway (beta release 1.1b)

Jeroen Coppieters jecop at ebi.ac.uk
Thu Mar 16 14:06:06 EST 1995


I sent this out a week ago, but due to local distribution problems, most of
the world never saw it.
For those who did see the previous announcement,
an important new feature has been added, so do read on.
People looking for non-WWW access to the databases may also be interested.

Included is a short manual on the use of an FTP to SRS gateway, developed
at the EMBL Outstation - EBI. It can be used right now, but I cannot yet
guarantee that the service will be officially supported.
Please try it, and if you find any problems or have any suggestions, let
me know (jecop at ebi.ac.uk).
The idea is that it can be used for automated checking and retrieval of new
sequences of interest.

Jeroen

-
USING SRS (Sequence Retrieval System [1]) from Anonymous FTP
------------------------------------------------------------
(Jeroen Coppieters, 17-Mar-1995, srsftp version 1.1beta)

This document describes SRSFTP as it is maintained at the
EMBL Outstation, the European Bioinformatics Institute,
hereafter referred to as "the EBI".

The Anonymous FTP server of the EBI has a gateway to SRS,
developed by Jeroen Coppieters and RJ White.
This allows the retrieval of sequences from Swissprot and EMBL (as well as
all other databases that are maintained in SRS at the EBI [2]), using
the power of SRS.
After connecting to the anonymous FTP server, a query can be executed
from any directory.

All results of the query will be stored in the file you name.
The sequences are stored in flat-file EMBL format by default.

In any query, the * wildcard can be used.
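
Since the gateway is meant for automated retrieval, the "get" commands
described in this manual can also be issued from a script. The following is
a minimal Python sketch, not part of srsftp itself. It assumes that the
gateway treats the remote name of an ordinary FTP retrieval (RETR) as the
query, which is what an interactive "get QUERY FILENAME" sends. The function
names are made up for illustration, and the server host name is left as a
parameter because it is not given in this document.

```python
from ftplib import FTP


def fetch_query(ftp, query, local_filename):
    """Send `query` as the remote name of an FTP retrieval and save the reply.

    `ftp` is any object with an ftplib-style retrbinary() method.
    Assumption: the gateway answers a RETR of the query string with the
    matching entries, as the interactive "get" examples suggest.
    """
    with open(local_filename, "wb") as out:
        ftp.retrbinary("RETR " + query, out.write)
    return local_filename


def fetch_anonymous(host, query, local_filename):
    """Log in anonymously to `host` (hypothetical parameter) and run one query."""
    with FTP(host) as ftp:
        ftp.login()  # ftplib logs in as "anonymous" when no user is given
        return fetch_query(ftp, query, local_filename)
```

For example, fetch_anonymous(HOST, "emblall:acc:x07888", "x07888.seq") would
correspond to typing "get emblall:acc:x07888 x07888.seq" in an FTP client,
where HOST stands for the name of the EBI anonymous FTP server.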

Three query formats are available:
 - simple sequence retrieval
 - simple srs query
 - full srs query

1) SIMPLE SEQUENCE RETRIEVAL
----------------------------
FORMAT: 
get DB:INDEX:QUERYSTRING FILENAME

DB is one of the following:
embl, emblnew, emblall, nuc
swissprot, swissnew, swissall, pep
If one of the xxall databases (emblall, swissall) is specified, both the
release and the updates are searched. If a sequence has been updated since
the last release, both the old and the new entry will be returned.

INDEX    DATABASE SEARCH FIELD       QUERYSTRING
acc      accession number            accession nr (e.g. X07888)
id       identifier                  identifier (e.g. ATP6_YEAST)
dat      date                        date (e.g. 20-NOV-1994)
fts      feature                     feature name (e.g. intron)
ref      reference                   Journal reference
                                     (e.g. Plant Mol. Biol. 10:91-104(1987))
sl       sequence length             number (e.g. 2400)
                                     or range (e.g. 2300:5000)
def      definition                  string
aut      author                      string
cc       comment                     string
org      organism                    string
tit      reference title             string
all      all text fields             string

EXAMPLES:
get emblall:acc:x07888 x07888.seq
Retrieve the sequence with accession number x07888 from embl/emblnew
and store it in the file x07888.seq.

get swissprot:all:nitrate* nitrate.pep
Get all swissprot entries that have a word starting with nitrate
in any of the text fields.
Store them in the file nitrate.pep.
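
When the retrieval is scripted, it can help to check the DB and INDEX names
before contacting the server. The sketch below is an illustration only: it
assembles the command text in the format described above and rejects names
that are not in the lists given in this section. The function and constant
names are hypothetical, and the QUERYSTRING is passed through unchanged.

```python
# Database and index names as listed in this section of the manual.
DATABASES = {"embl", "emblnew", "emblall", "nuc",
             "swissprot", "swissnew", "swissall", "pep"}
INDEXES = {"acc", "id", "dat", "fts", "ref", "sl",
           "def", "aut", "cc", "org", "tit", "all"}


def retrieval_command(db, index, querystring, filename):
    """Build the text of a 'get DB:INDEX:QUERYSTRING FILENAME' command."""
    if db not in DATABASES:
        raise ValueError("unknown database: " + db)
    if index not in INDEXES:
        raise ValueError("unknown index: " + index)
    # The query string may itself contain ':' (e.g. a length range such as
    # 2300:5000) or the * wildcard, so it is not inspected any further.
    return "get {}:{}:{} {}".format(db, index, querystring, filename)
```

Calling retrieval_command("swissprot", "all", "nitrate*", "nitrate.pep")
reproduces the second example above.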


2) SIMPLE SRS QUERY
-------------------
FORMAT:
get srs:QUERYSTRING FILENAME

This allows linking of several databases.

For more information on SRS queries, have a look at the SRS manual.
This format does not, however, give access to the complete functionality
of SRS.
Restrictions are: - NO SPACES are allowed in the query.
                  - No command line parameters can be included.

EXAMPLES:
get srs:[embl-fts:intron]>parent&[embl-org:arabidopsis*] arain.seq
Retrieve all sequences from Arabidopsis species that contain an intron.

get srs:[prosite-id:PROTEIN_KINASE_TYR]>swissprot kinase.pep
Retrieve all proteins that contain a tyrosine kinase motif (PROSITE).

get srs:[prosite-id:PROTEIN_KINASE_TYR]>swissprot>pdb kinase.pdb
Retrieve the 3D structures (if known) of the above proteins.
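
The "no spaces" restriction is easy to violate when queries are generated
by a program, so a script may want to check it before sending anything.
The following Python sketch is an illustration under that assumption; the
function name is hypothetical. It only tests for whitespace and does not
try to detect command line parameters or validate SRS syntax.

```python
def srs_command(query, filename):
    """Build the text of a 'get srs:QUERYSTRING FILENAME' command.

    Raises ValueError if the query contains whitespace, which the
    simple SRS query format does not allow.
    """
    if any(ch.isspace() for ch in query):
        raise ValueError("no spaces are allowed in a simple SRS query")
    return "get srs:{} {}".format(query, filename)
```

Calling srs_command("[prosite-id:PROTEIN_KINASE_TYR]>swissprot", "kinase.pep")
reproduces the second example above.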

3) FULL SRS QUERY
-----------------
If you want access to command line options (e.g. to change the
output format), a full-blown getz command is available.

FORMAT:

