I sent this out a week ago, but due to local distribution problems, most of
the world never saw it.
For those who did see the previous announcement:
an important new feature has been added, so do read on.
People looking for non-WWW access to the databases may also be interested.
Included is a short manual on the use of an FTP-to-SRS gateway, developed
at the EMBL Outstation - EBI. It can be used right now, but I cannot yet
guarantee that the service will be officially supported.
Please try it, and if you find any problems or have any suggestions, let
me know (jecop at ebi.ac.uk).
The idea is that it can be used for automated checking and retrieval of new
sequences of interest.
USING SRS (Sequence Retrieval System) from Anonymous FTP
(Jeroen Coppieters, 17-Mar-1995, srsftp version 1.1beta)
This document describes SRSFTP as it is maintained at the
EMBL Outstation, the European Bioinformatics Institute,
hereafter referred to as "the EBI".
The anonymous FTP server of the EBI has a gateway to SRS,
developed by Jeroen Coppieters and RJ White.
This allows the retrieval of sequences from Swissprot and EMBL (as well as
all other databases that are maintained in SRS at the EBI), using
the power of SRS.
A query can be executed from any directory after connecting to the
anonymous FTP server.
All results of the query are stored in the file you name.
The sequences are stored in flat-file EMBL format by default.
In any query, the * wildcard can be used.
Three formats of queries are available:
- simple sequence retrieval
- simple srs query
- full srs query
1) SIMPLE SEQUENCE RETRIEVAL
get DB:INDEX:QUERYSTRING FILENAME
DB is one of the following:
embl, emblnew, emblall, nuc
swissprot, swissnew, swissall, pep
If one of the "all" databases (emblall or swissall) is specified, both the
release and the updates are searched. If a sequence has been updated since
the last release, both the old and the new entry will be returned.
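Since an "all" search can return an entry twice, a retrieved file may need checking for repeats. The following is a minimal Python sketch, not part of the gateway. It relies on the EMBL flat-file convention that each entry starts with an "ID" line, and it assumes (this is not stated above) that the old and the new version of an entry carry the same identifier. The function names are hypothetical.

```python
from collections import Counter


def entry_ids(flatfile_text):
    """Return the entry name found on each 'ID' line of an EMBL-format flat file."""
    ids = []
    for line in flatfile_text.splitlines():
        if line.startswith("ID "):
            fields = line.split()
            if len(fields) > 1:
                # The entry name is the first field after the 'ID' line code.
                ids.append(fields[1].rstrip(";"))
    return ids


def repeated_ids(flatfile_text):
    """Return, sorted, the entry names that occur more than once in the file."""
    counts = Counter(entry_ids(flatfile_text))
    return sorted(name for name, n in counts.items() if n > 1)
```

Calling repeated_ids() on the text of a file retrieved from emblall or swissall lists the entries that came back in both an old and a new version, under the assumption stated above.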
INDEX   DATABASE SEARCHFIELD   QUERYSTRING
acc     accession number       accession nr (e.g. X07888)
id      identifier             identifier (e.g. ATP6_YEAST)
dat     date                   date (e.g. 20-NOV-1994)
fts     feature                feature name (e.g. intron)
ref     reference              Journal reference
                               (e.g. Plant Mol. Biol. 10:91-104(1987))
sl      sequence length        number (e.g. 2400)
                               or range (e.g. 2300:5000)
def     definition             string
aut     author                 string
cc      comment                string
org     organism               string
tit     reference title        string
all     all text fields        string
get emblall:acc:x07888 x07888.seq
    Retrieve the sequence with accession number x07888 from embl/emblnew
    and store it in the file x07888.seq.
get swissprot:all:nitrate* nitrate.pep
    Get all swissprot entries that have a word starting with "nitrate"
    in any of the text fields, and store them in the file nitrate.pep.
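For automated retrieval, the same request can be sent from a script. The following is a minimal Python sketch using the standard ftplib module. It assumes that the gateway answers a plain RETR of the query string (which is what an interactive `get` sends). The host name must be supplied by the caller, and the function names are hypothetical. The database and index names are those listed above.

```python
from ftplib import FTP

# Database and index names as listed in this document.
DATABASES = {"embl", "emblnew", "emblall", "nuc",
             "swissprot", "swissnew", "swissall", "pep"}
INDEXES = {"acc", "id", "dat", "fts", "ref", "sl",
           "def", "aut", "cc", "org", "tit", "all"}


def simple_query(db, index, querystring):
    """Build the DB:INDEX:QUERYSTRING name used in place of a remote file name."""
    if db not in DATABASES:
        raise ValueError("unknown database: %r" % db)
    if index not in INDEXES:
        raise ValueError("unknown index: %r" % index)
    if not querystring:
        raise ValueError("empty query string")
    return "%s:%s:%s" % (db, index, querystring)


def fetch(ftp, remote_name, local_path):
    """Send RETR for the query and write whatever comes back to local_path."""
    with open(local_path, "wb") as out:
        ftp.retrbinary("RETR " + remote_name, out.write)


def retrieve(host, db, index, querystring, local_path):
    """Log in anonymously to host (supplied by the caller) and run one query."""
    with FTP(host) as ftp:
        ftp.login()  # no arguments: anonymous login
        fetch(ftp, simple_query(db, index, querystring), local_path)
```

With the host name of the anonymous FTP server in a variable `host`, `retrieve(host, "emblall", "acc", "x07888", "x07888.seq")` corresponds to the first example above.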
2) SIMPLE SRS QUERY
This allows linking of several databases.
For more information on SRS queries, have a look at the SRS manual.
This form, however, does not offer the complete functionality of SRS.
Restrictions are:
- NO SPACES are allowed in the query.
- No command line parameters can be included.
get srs:[embl-fts:intron]>parent&[embl-org:arabidopsis*] arain.seq
    Retrieve all sequences from Arabidopsis spec. that contain an intron.
get srs:[prosite-id:PROTEIN_KINASE_TYR]>swissprot kinase.pep
    Retrieve all proteins that contain a tyrosine kinase motif (PROSITE).
get srs:[prosite-id:PROTEIN_KINASE_TYR]>swissprot>pdb kinase.pdb
    Retrieve 3D structures (if known) for the above proteins.
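When such queries are generated by a script, it helps to assemble them from parts and to reject spaces before anything is sent. The following is a minimal Python sketch. The helper names are hypothetical, and it only reproduces the query shapes shown in the examples above, not the full SRS query language.

```python
def term(db, index, value):
    """Format one [db-index:value] term, e.g. [embl-fts:intron]."""
    return "[%s-%s:%s]" % (db, index, value)


def link(expression, *targets):
    """Append one '>target' link per target, e.g. >swissprot>pdb."""
    return expression + "".join(">" + t for t in targets)


def srs_query(expression):
    """Prefix an expression with 'srs:' for use as the remote file name."""
    if any(ch.isspace() for ch in expression):
        raise ValueError("spaces are not allowed in a simple SRS query")
    return "srs:" + expression


# The three examples above, rebuilt from their parts.
introns = srs_query(link(term("embl", "fts", "intron"), "parent")
                    + "&" + term("embl", "org", "arabidopsis*"))
kinases = srs_query(link(term("prosite", "id", "PROTEIN_KINASE_TYR"),
                         "swissprot"))
structures = srs_query(link(term("prosite", "id", "PROTEIN_KINASE_TYR"),
                            "swissprot", "pdb"))
```

The resulting strings are used exactly like the remote name in a `get` command, followed by the local file name.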
3) FULL SRS QUERY
If you want access to command line options (e.g. to change the
output format), a full-blown getz command is available.