...getting out what I want...

Peter Rice pmr at sanger.ac.uk
Mon Apr 10 11:00:39 EST 1995

In article <1995Apr10.111606.18518 at reks.uia.ac.be> przemko at reks.uia.ac.be (Przemko) writes:
>   I am familiar with both versions of SRS: command line and HTTP.
>   During my work with it it became obvious that the features are not
>   identical and that, in fact, only a combination of the two would allow
>   me to get what I want from that program.
>   The problem that I face is a simple one:
>   - get out all sequences that contain a signal peptide- only the
>     ACTUAL feature
>   - put it in GCG and do some things on it.
>   The troubles are (for command line interface)
>   - if I do something wrong (like asking FEATURE and PIR- it has no
>      feature field) the program mercilessly crashes (core dump etc.)
>   - I cannot do q1-organism and q2-feature
>   - I cannot do AllText search
>   - some other minor things
>   For the HTTP interface:
>   - I can do all above (except for the q1-organism and q2-feature)
>   - BUT I cannot save my results in a file. What I want is all my
>     seqs- FEATURES ONLY- as a file. One can do it in the command line
>     interface but not HTTP.
>   So, if there is anyone that can help me, please do not hesitate
>   to do so |:)

Well, the command line part is in the documentation. You can't use set
operators, but you can link the sets to get the result you want.
This should be enough to get you started.

I also was unable to use "alltext" or "all" from the command line.

:Entries and Subentries
:Sets originating from the same databank may have different set types.
:Consider the two queries: 
:   [swissprot-keywords: transmembrane]
:   [swissprot-features: transmem] 
:The first query retrieves all SwissProt entries that have transmembrane
:segments, the second finds all transmembrane features contained in
:SwissProt entries. The second query will retrieve many more entries
:since most transmembrane proteins have more than one membrane spanning
:segment. If you requested the sequences for entries in the second set
:you would get the transmembrane segments and not the parent entry's
:sequence. The first query returns a set of entries whereas the second
:returns a set of subentries. The "features" index has a special type:
:for all sequence databanks, searches in that index result in sets
:of subentries. 
:Sets of entries and subentries cannot be combined with logical operators!
:Only the link operators may be used between them, i.e., it is always
:possible to link subentries to their respective parent entries. 
:   [swissprot-org:human] > [swissprot-features:transmem] 
:      returns all transmembrane segments found in human proteins. 
:   [swissprot-org:human] < [swissprot-features:transmem] 
:      returns all human proteins that have transmembrane segments. 
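To make the difference between the two link directions concrete, here is a toy model in Python. It is only a sketch of the idea described above, not how SRS works internally: the entry IDs, the feature numbering and the two function names are all made up for illustration.

```python
# Toy model of linking entry sets and subentry sets.
# All IDs and data below are hypothetical, not real SwissProt content.

# Stands in for [swissprot-org:human]: a set of entries.
human_entries = {"P1", "P2", "P3"}

# Stands in for [swissprot-features:transmem]: a set of subentries,
# each identified by (parent entry ID, feature number).
transmem_features = {("P1", 1), ("P1", 2), ("P4", 1)}


def link_to_subentries(entries, subentries):
    """Model of `entries > subentries`: keep the subentries whose parent is in `entries`."""
    return {sub for sub in subentries if sub[0] in entries}


def link_to_entries(entries, subentries):
    """Model of `entries < subentries`: keep the entries that own at least one subentry."""
    parents = {parent for parent, _ in subentries}
    return entries & parents


# Transmembrane segments found in human proteins (a set of subentries).
segments = link_to_subentries(human_entries, transmem_features)

# Human proteins that have transmembrane segments (a set of entries).
proteins = link_to_entries(human_entries, transmem_features)

print(sorted(segments))
print(sorted(proteins))
```

In this model a plain set intersection of the two inputs would be meaningless, because one holds entry IDs and the other holds (entry, feature) pairs. That is the same reason the logical operators are refused and only the link operators are allowed.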

And as for the HTTP part, that is trickier and I can't find a complete
solution to the problem.

I could use "QueryManager" to enter the query directly - as for the
command line but without the quotes (OK, it's a cheat :-)

But then it only seems to display complete entries and let you click on the
features one at a time.

Funny thing though - it seems not to care whether I put "organism"
or "features" first in a search. When I look at the query it always looks
the same: organism > features (but then displays the sequence of the
complete entries rather than the features :-)

Peter Rice                           | Informatics Division
E-mail: pmr at sanger.ac.uk             | The Sanger Centre
Tel: (44) 1223 494967                | Hinxton Hall, Hinxton,
Fax: (44) 1223 494919                | Cambs, CB10 1RQ
URL: http://www.sanger.ac.uk/~pmr/   | England
