Unresolved problem in SRS5
jackl at caos.kun.nl
Mon May 12 08:37:31 EST 1997
I don't want to spoil the overall enthusiasm over the availability
of SRS5, but it appears strange to me that nobody seems to care about
the fact that SRS5 fails to handle (GCG-) split entries correctly!!!
Just try the following on your server, if you use the EMBL and EMNEW
databases in GCG format: retrieve entry "CEY57G11" from EMNEW. This
entry is split into 5 separate entries by GCG, and should be joined
by SRS into one. This does indeed happen, but with an INCORRECT number
of bases, because the overlapping fragments are not removed!
I tried a few sites, searching for ID=CEY57G11, the original length
of which is 486290 bases.
Site     Version  Format  ID returned  Length
EMBL     SRS5.04  EMBL    CEY57g11     486290 bases
EBI      SRS5.05  GCG     CEY57G11     526290 bases
UPPSALA  SRS5.03  GCG?    CEZK1127     n.a.
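
To illustrate the kind of error described here, the following is a minimal Python sketch, not SRS or GCG code. It assumes that each split fragment after the first begins with a fixed number of bases repeated from the end of the previous fragment; the function names, the chunk size and the overlap size are all hypothetical and chosen only for the demonstration.

```python
def split_with_overlap(seq, chunk, overlap):
    """Split seq into pieces of at most `chunk` bases.

    Assumption for this sketch: every piece after the first starts
    with the last `overlap` bases of the piece before it.
    """
    if not 0 <= overlap < chunk:
        raise ValueError("overlap must be non-negative and smaller than chunk")
    step = chunk - overlap
    pieces = []
    start = 0
    while True:
        pieces.append(seq[start:start + chunk])
        if start + chunk >= len(seq):
            break
        start += step
    return pieces


def naive_join(fragments):
    """Concatenate fragments as they are, keeping the repeated bases."""
    return "".join(fragments)


def join_without_overlap(fragments, overlap):
    """Concatenate fragments, dropping the repeated leading bases
    of every fragment after the first."""
    if not fragments:
        return ""
    return fragments[0] + "".join(frag[overlap:] for frag in fragments[1:])


if __name__ == "__main__":
    # Toy sequence of 100 bases, split into 30-base pieces with a 5-base overlap.
    original = "".join("ACGT"[(i * i + i // 3) % 4] for i in range(100))
    pieces = split_with_overlap(original, chunk=30, overlap=5)

    print(len(pieces), "fragments")
    print("naive join length:    ", len(naive_join(pieces)))
    print("overlap-aware length: ", len(join_without_overlap(pieces, 5)))
```

In this toy case the naive join is too long by one overlap per junction. Applied to the figures in the table, the excess is 526290 - 486290 = 40000 bases over the four junctions between five fragments, which would be consistent with 10000 repeated bases per junction if all overlaps were equal. That per-junction figure is an inference from the numbers, not something confirmed here.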
This was reported some time ago, but it hasn't been resolved even
with the current release! One solution would be to store the database
in original EMBL format, but this is hardly acceptable, given the
size of the data.
Jack A.M. Leunissen | Email: jackl at caos.kun.nl
CAOS/CAMM Center | Tel. : +31 24 365 22 48
University of Nijmegen | Fax : +31 24 365 29 77
Nijmegen, The Netherlands | Www : http://www.caos.kun.nl