Processing of NCBI Backbone sequence data at EMBL

huie at embl-heidelberg.de
Mon May 3 08:27:01 EST 1993


We will now include sequence data from the NCBI Backbone Database in the EMBL
Nucleotide Sequence Database.

The NCBI journal-scanning activity collects sequence data as published in
scientific journals and builds the NCBI Backbone Database.
However, most data these days is submitted electronically, directly to one of
the database groups at DDBJ, EMBL or GenBank. We have been careful to avoid
the redundancy that would result from adding Backbone data for which there
is a corresponding direct submission.

Backbone data passes through a matching algorithm which uses combinations of
the following criteria:
   .  the accession number allocated to the data if it has already been
      directly submitted. If this number is cited in the journal article,
      it appears in the Backbone entry.
   .  same author name(s)
   .  same organism
   .  sequence similarity
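
As an illustration only, the criteria above could be combined along the
following lines. This is a minimal sketch and not the algorithm used at EMBL:
the Entry fields, the rule that a cited accession number is decisive, the
0.9 threshold and the use of Python's difflib as a stand-in for a proper
sequence comparison are all assumptions made for the example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Iterable, Optional


@dataclass
class Entry:
    """Hypothetical record; field names are illustrative, not EMBL's."""
    accession: str
    authors: frozenset            # normalised author surnames
    organism: str
    sequence: str
    cited_accession: Optional[str] = None  # accession cited in the article


def similarity(a: str, b: str) -> float:
    """Crude similarity in [0, 1]; a stand-in for a real sequence comparison."""
    return SequenceMatcher(None, a.lower(), b.lower(), autojunk=False).ratio()


def is_match(backbone: Entry, submission: Entry,
             min_similarity: float = 0.9) -> bool:
    # Assumption: a cited accession that names the submission is decisive.
    if backbone.cited_accession and backbone.cited_accession == submission.accession:
        return True
    # Otherwise require the same organism, a shared author and a similar sequence.
    same_organism = backbone.organism.lower() == submission.organism.lower()
    shared_author = bool(backbone.authors & submission.authors)
    return (same_organism and shared_author
            and similarity(backbone.sequence, submission.sequence) >= min_similarity)


def find_match(backbone: Entry, submissions: Iterable[Entry]) -> Optional[Entry]:
    """Return the first direct submission matching the Backbone entry, or None."""
    for submission in submissions:
        if is_match(backbone, submission):
            return submission
    return None
```

A Backbone entry for which find_match returns None would be treated as
"unmatched" in the sense used below.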

Backbone data which is determined to be novel or "unmatched" will be included
in the EMBL Nucleotide Sequence Database, using the Backbone accession
number (typically beginning with the letter 'S' at the moment), and with a
dataclass of "backbone" indicated on the ID-line of the EMBL flat-file entry.
Users may notice a different style of description text on the DE-line and
the relative lack of annotation detail.
The matching algorithm will be periodically re-applied to such data to catch
the cases where a direct submission becomes available after a corresponding
Backbone entry.
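
The periodic re-application can be pictured as a pass over the Backbone
entries currently held as unmatched. The sketch below is hypothetical: the
function name, the dictionary records and the is_match predicate are
placeholders for illustration, and it only reports which entries now have a
direct submission, without assuming what is then done with them.

```python
def reapply_matching(backbone_entries, direct_submissions, is_match):
    """Split entries into (still_unmatched, newly_matched).

    is_match(entry, submission) is any predicate implementing the
    matching criteria; it is supplied by the caller.
    """
    still_unmatched, newly_matched = [], []
    for entry in backbone_entries:
        if any(is_match(entry, sub) for sub in direct_submissions):
            newly_matched.append(entry)
        else:
            still_unmatched.append(entry)
    return still_unmatched, newly_matched


# Usage with toy records: match on a cited accession number only.
entries = [{"acc": "S1", "cites": None}, {"acc": "S2", "cites": "X1"}]
submissions = [{"acc": "X1"}]
keep, matched = reapply_matching(
    entries, submissions, lambda e, s: e["cites"] == s["acc"])
```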

Backbone data which already has a matching direct submission will not be
included in the EMBL database proper, but will be retrievable by its Backbone
database accession number from the EMBL e-mail file server. Other options for
access will be considered.


