PIR entries give bad sequence formats

Peter Rice pmr at sanger.ac.uk
Thu Nov 4 12:11:52 EST 1999


Has anyone changed their PIR parser to produce 'correct' sequence
formats in SRS5 for the wgetz output?

With the default (PIR) format there are two headers, one for the
reference data and one for the sequence, so the output is not valid PIR format.

http://www.sanger.ac.uk/srs5bin/cgi-bin/wgetz?-e+[PIR-ID:'S10602']

With GCG format some entries are fine, but others (e.g. S10602) have
".." hiding in the reference part and SRS does not expand it.

http://www.sanger.ac.uk/srs5bin/cgi-bin/wgetz?-e+[PIR-ID:'S10602']+-sf+gcg

I am trying to find a format that allows EMBOSS to read PIR entries
from an SRSWWW server.

With "-f seq -sf <format>" there is an extra header, but I suppose
"-f seq -sf gcg" will make the sequence readable (GCG format will
ignore the header, while SRS is no longer writing the ".." line) while
losing most other information. Hardly ideal.

http://www.sanger.ac.uk/srs5bin/cgi-bin/wgetz?[PIR-ID:'S10602']+-f+seq+-sf+gcg
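
For reference, a minimal Python sketch of how the three query URLs above
are put together. The helper name wgetz_url and its parameters are
hypothetical (they are not part of SRS); the base URL and the flags
(-e, -f, -sf) are only the ones quoted in this message, and nothing is
fetched from the server.

```python
# Base URL of the SRS5 wgetz CGI, as quoted in this message.
BASE = "http://www.sanger.ac.uk/srs5bin/cgi-bin/wgetz"


def wgetz_url(db, field, value, options=(), entry_view=False):
    """Build a wgetz query URL (hypothetical helper, for illustration only).

    db, field, value -> the query term, e.g. [PIR-ID:'S10602']
    options          -> extra flags placed after the query, e.g. ("-sf", "gcg")
    entry_view       -> if True, put "-e" in front of the query
    """
    query = "[{}-{}:'{}']".format(db, field, value)
    parts = []
    if entry_view:
        parts.append("-e")
    parts.append(query)
    parts.extend(options)
    # wgetz arguments are joined with "+" in the URLs shown above.
    return BASE + "?" + "+".join(parts)


# The three requests discussed in this message.
default_pir = wgetz_url("PIR", "ID", "S10602", entry_view=True)
gcg_entry = wgetz_url("PIR", "ID", "S10602", ("-sf", "gcg"), entry_view=True)
seq_only_gcg = wgetz_url("PIR", "ID", "S10602", ("-f", "seq", "-sf", "gcg"))

for url in (default_pir, gcg_entry, seq_only_gcg):
    print(url)
```

Only the third form ("-f seq -sf gcg") is the one I expect to be readable
as a sequence, at the cost of the rest of the entry.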

-- 
----------------------------------------------------------------------
Peter Rice                | Informatics Division, The Sanger Centre,
E-mail: pmr at sanger.ac.uk  | Wellcome Trust Genome Campus,
Tel: (44) 1223 494967     | Hinxton, Cambridge, CB10 1SA, England
Fax: (44) 1223 494919     | URL: http://www.sanger.ac.uk/Users/pmr/




