getz syntax question

mathog at seqaxp.bio.caltech.edu
Wed Dec 2 13:33:20 EST 1998


SRS 5.1 on Linux/Intel.

Can somebody please explain to me how to get from the query language
description in the HTML page to the following observed behavior? Yes, this
started out as a typo, but since it doesn't throw an error, it is
apparently allowed syntax.

All of these rapidly return a list of all SWISSPROT entries:

  % getz 'swissprot-id:ha12_mouse'
  % getz 'swissprot-id'
  % getz 'swissprot'

but

  % getz 'swiss'
  SRSICA:srsquery.i:51:  error: unknown set or databank. "swiss"

It looks like all databases are predefined as "sets", and getz will return
a "set" even if no operators are present on the query line. However, it is
very unclear how the parser handles the first two commands above. Also,
the following doesn't do what I'd expect if that were the full explanation:

  % getz '{nrl3d swissprot}'

It returns NOTHING. Why not both databases? Elsewhere, {database database}
can be used anywhere that "database" alone can.

  % getz 'nrl3d | swissprot'

returns both sets, swissprot first.  So the logical operators function
in this context, but not the {} operator.
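
For anyone who wants to compare these variants on their own installation,
here is a rough sketch that counts the lines each query returns. The
count_lines helper is my own hypothetical name, not part of SRS, and the
script assumes getz is on the PATH and prints one entry per line, which I
have not confirmed against the documentation.

```shell
#!/bin/sh
# Hypothetical helper (not part of SRS): run a command and print the
# number of lines it writes to stdout. Errors go to /dev/null.
count_lines() {
    "$@" 2>/dev/null | wc -l | tr -d ' '
}

# Only attempt the queries if getz is actually installed.
# Assumption: getz prints one entry per line.
if command -v getz >/dev/null 2>&1; then
    for q in 'swissprot-id:ha12_mouse' 'swissprot-id' 'swissprot' \
             'nrl3d' '{nrl3d swissprot}' 'nrl3d | swissprot'; do
        printf '%s\t%s\n' "$(count_lines getz "$q")" "$q"
    done
fi
```

If the first three counts match, and the count for 'nrl3d | swissprot' is
the sum of the two single-database counts, that would agree with what I
saw above.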

One interesting side note: the "obvious" methods of listing a database take
orders of magnitude longer than the methods described above. On a 400 MHz
PII, with all SRS databases on a single 9 GB U2W SCSI disk and no other
load on the system:

  % getz 'swissprot' > /dev/null 

took 3 seconds, but

  % getz '[swissprot-id:*]' > /dev/null

took 423 seconds.

(The fastest method of all to get this information was 

  % cat swissprot.seqcat > /dev/null

which took less than a second, but not all sites will have that GCG file.)
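
To repeat the timing comparison elsewhere, something like the following
sketch should do. The elapsed_seconds helper is a hypothetical name of my
own, it assumes a date command that supports +%s (GNU and BSD date both
do), and it only resolves whole seconds, so the sub-second case will show
up as 0 or 1.

```shell
#!/bin/sh
# Hypothetical helper (not part of SRS): run a command with its output
# discarded and print the elapsed wall-clock time in whole seconds.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo "$((end - start))"
}

# Only attempt the queries if getz is actually installed.
if command -v getz >/dev/null 2>&1; then
    for q in 'swissprot' '[swissprot-id:*]'; do
        printf '%s s\t%s\n' "$(elapsed_seconds getz "$q")" "$q"
    done
fi
```

The numbers will of course depend on disk, load, and how much of the
index is already cached, so a second run may look very different from the
first.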

Regards,

David Mathog
mathog at seqaxp.bio.caltech.edu
Manager, sequence analysis facility, biology division, Caltech 



