Suggestions for SRS use w/ nonsequence gene data

Peter Rice pmr at sanger.ac.uk
Thu Aug 17 09:13:28 EST 1995


In article <40vfeu$jb3 at usenet.ucs.indiana.edu> gilbertd at sunflower.bio.indiana.edu (Don Gilbert) writes:
>   Here are some suggestions for SRS that have arisen from trying to use
>   it with Drosophila genome data:
>
>   1) indexer interface: needs to permit indexing of any character/symbol set.
>      Drosophila genes use just about the full ASCII printable symbol set
>      (and would use more if possible).

One for Thure, but I suspect your wish (and mine) will soon be granted :-)

>   2) query interface: needs to permit any character/symbol set to be valid
>      data in the query, and query symbols should be configurable. 
>
>      Use of words instead of symbols as query operators
>      should be optional at least, and by my preference they would be default.

I would be happy to just have escaping of critical characters like \(\)\&\|
but certainly something is needed.
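
To make the escaping idea concrete, here is a minimal Python sketch. It is not SRS code: the function names are hypothetical, and the set of "critical" characters is only an assumption based on the ones listed above, not the actual SRS query grammar.

```python
# Hypothetical sketch: backslash-escape characters that a query parser
# might treat as operators. The character set below is an assumption,
# not the real SRS query grammar.
QUERY_SPECIALS = set("()&|\\")


def escape_query_term(term: str) -> str:
    """Return term with each special character prefixed by a backslash."""
    return "".join("\\" + ch if ch in QUERY_SPECIALS else ch for ch in term)


def unescape_query_term(term: str) -> str:
    """Reverse escape_query_term by dropping one backslash per escaped char."""
    out = []
    chars = iter(term)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally; keep a trailing backslash.
            out.append(next(chars, "\\"))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # A Drosophila gene symbol containing parentheses.
    print(escape_query_term("l(2)gl"))
```

With something like this, a gene symbol containing parentheses could be passed through a query as ordinary data, and the indexer side could undo the escaping before lookup.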

>   3) output interface:  needs to allow addition of post-processor functions
>      to convert data to various human-usable formats.  This is done now in
>      part for sequence data and for adding html links, but not in a
>      general way that would allow additional output formatting per database
>      w/o rewriting the basic SRS code.
>
>      Here is roughly how I did it for flybase data, but it is a hack
>      not a general solution.  Example outputs show this formatted output 
>      from iubio server, versus the computer "star code" output from the 
>      sanger server.

This looks like a really neat idea!

Not only is it nice in the flybase cases, but it could also save me the
problem of converting many tab-delimited databases with perl scripts
into something with readable "entries" that I can parse and display.

Of course, first I need a parser that can handle tab-delimited databases :-)
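
As a rough illustration of the conversion step described above, a Python sketch that turns tab-delimited rows into readable tagged "entries". The function name, the 12-column tag width, and the "//" terminator are assumptions chosen for the example, and this is not an SRS parser.

```python
import csv


def tab_rows_to_entries(handle, field_names=None):
    """Yield one 'TAG   value' text entry per row of a tab-delimited file.

    If field_names is None, the first row is taken as the header.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    names = field_names or next(reader, None)
    if names is None:
        return  # empty input, nothing to emit
    for row in reader:
        if not row:
            continue  # skip blank lines
        # One line per non-empty field, tag left-justified in 12 columns.
        lines = [f"{name:<12}{value}" for name, value in zip(names, row) if value]
        lines.append("//")  # entry terminator (an assumed convention)
        yield "\n".join(lines)


if __name__ == "__main__":
    import io

    # Made-up sample data, for illustration only.
    sample = "ID\tSymbol\tName\nG1\tw\twhite\n"
    for entry in tab_rows_to_entries(io.StringIO(sample)):
        print(entry)
```

Each row becomes one entry with a line per field, which is the kind of layout that is easier to parse and display than the raw table.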

On a related note, I would like to be able to put some extra text on the
">" line of FASTA format output - with the aim of using SRS to generate a
database subset for blast indexing and searching. Then (for example) the
accession number and definition could appear in the blast search results.
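
A small Python sketch of what I mean by the extra text on the ">" line. The function name and the choice of "accession, then definition" as the header layout are assumptions for illustration, not an existing SRS output option.

```python
def fasta_record(accession, definition, sequence, width=60):
    """Format one FASTA record with accession and definition on the '>' line."""
    # Collapse any internal whitespace or newlines so the header stays on one line.
    clean_definition = " ".join(definition.split())
    header = f">{accession} {clean_definition}".rstrip()
    # Wrap the sequence at a fixed line width.
    body = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header, *body]) + "\n"


if __name__ == "__main__":
    # Made-up accession and definition, for illustration only.
    print(fasta_record("X00001", "hypothetical example entry", "ACGT" * 20), end="")
```

A subset written this way could be handed to the blast indexing step, and the header text should then show up next to each hit.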

--
------------------------------------------------------------------------
Peter Rice                           | Informatics Division
E-mail: pmr at sanger.ac.uk             | The Sanger Centre
Tel: (44) 1223 494967                | Hinxton Hall, Hinxton,
Fax: (44) 1223 494919                | Cambs, CB10 1RQ
URL: http://www.sanger.ac.uk/~pmr/   | England



