EMBL forthcoming changes

Weimin Zhu weimin at ebi.ac.uk
Wed Dec 15 10:05:50 EST 2004


Dear friends,

From EMBL release 81 onward, we will announce forthcoming changes to the
EMBL databank.

Continuous ranges of secondary accessions

With the removal of sequence length limits, some genomes (typically
bacterial) that had been split into many pieces are being replaced by a
single sequence record.

When this happens, the accessions of the former small pieces become
secondary accessions for the single large sequence record. When each
secondary is separately listed, the AC line becomes excessively lengthy.

The same thing can happen when a WGS set is finished and moved into the
main section of the database.

- Secondary accession number ranges in AC line

Starting with the next release, consecutive secondary accession numbers in
EMBL database flatfiles will be shown as accession number ranges.

Example

An AC line that now appears as:
AC   Y00001; X00001; X00002; X00003; X00004; X00005;

will appear as:
AC   Y00001; X00001-X00005;

A mixture of ranges and single accession numbers will be possible:
AC   Y00001; X00001-X00005; X00008; Z00001-Z00005;
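
For anyone maintaining a flatfile parser, here is a minimal sketch of how
an AC line with ranges might be expanded back into individual accession
numbers. It is not EMBL code. The function name expand_ac_line is
hypothetical, and the sketch assumes that an accession is an alphabetic
prefix followed by digits and that both ends of a range share the same
prefix and digit count, as in the examples above. It handles a single AC
line only, not an accession list continued over several AC lines.

```python
import re

# Assumption: an accession is an alphabetic prefix plus digits (e.g. X00001).
_ACC = re.compile(r"^([A-Za-z_]+)(\d+)$")


def expand_ac_line(line):
    """Return (primary, secondaries) with every range expanded."""
    if not line.startswith("AC"):
        raise ValueError("not an AC line")
    # Drop the line code, split on ';' and discard empty trailing pieces.
    items = [tok.strip() for tok in line[2:].split(";") if tok.strip()]
    if not items:
        raise ValueError("AC line has no accession numbers")
    primary, secondaries = items[0], []
    for item in items[1:]:
        if "-" not in item:
            secondaries.append(item)  # plain single accession
            continue
        start, end = item.split("-", 1)
        m1, m2 = _ACC.match(start), _ACC.match(end)
        if (not m1 or not m2 or m1.group(1) != m2.group(1)
                or len(m1.group(2)) != len(m2.group(2))):
            raise ValueError("unsupported range: " + item)
        prefix, width = m1.group(1), len(m1.group(2))
        first, last = int(m1.group(2)), int(m2.group(2))
        if first > last:
            raise ValueError("descending range: " + item)
        # Rebuild each accession with the original zero padding.
        secondaries.extend(
            f"{prefix}{n:0{width}d}" for n in range(first, last + 1)
        )
    return primary, secondaries
```

Because items without a hyphen pass through unchanged, the same function
also reads AC lines written as plain lists.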

The first item in the AC line is the primary accession number; the
primary accession number of a given entry will not be displayed as part
of a range.
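
The opposite direction, writing an AC line from a list of accessions, can
be sketched as follows. Again this is only an illustration under
assumptions, not the EMBL implementation: collapse_secondaries and
_follows are hypothetical names, two accessions are treated as
consecutive when they share a prefix and digit count and their numbers
differ by one, and collapsing a run of only two accessions is an
arbitrary choice. The primary accession is emitted first and kept out of
every range, even when it is numerically adjacent to a secondary.

```python
import re

# Assumption: an accession is an alphabetic prefix plus digits (e.g. X00001).
_ACC = re.compile(r"^([A-Za-z_]+)(\d+)$")


def _follows(prev, cur):
    """True if cur is the accession immediately after prev."""
    a, b = _ACC.match(prev), _ACC.match(cur)
    return bool(
        a and b
        and a.group(1) == b.group(1)            # same prefix
        and len(a.group(2)) == len(b.group(2))  # same digit count
        and int(b.group(2)) == int(a.group(2)) + 1
    )


def collapse_secondaries(primary, secondaries):
    """Build one AC line, collapsing runs of consecutive secondaries."""
    parts = [primary]  # the primary is never merged into a range
    run = []           # current run of consecutive secondaries

    def flush():
        if len(run) > 1:
            parts.append(f"{run[0]}-{run[-1]}")
        else:
            parts.extend(run)
        run.clear()

    for acc in secondaries:
        if run and not _follows(run[-1], acc):
            flush()
        run.append(acc)
    flush()
    return "AC   " + " ".join(p + ";" for p in parts)
```

For example, collapse_secondaries("X00001", ["X00002", "X00003"]) keeps
the primary separate and returns "AC   X00001; X00002-X00003;".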

Note: lists of accession numbers will continue to be syntactically legal
in EMBL flatfiles.

Regards,
Weimin Zhu & Siamak Sobhany
EMBL Nucleotide Sequence Database
EMBL-EBI




