Greetings GenBank Users,
The flatfile version of the "CON division" GenBank Incremental Update
(GIU) product for October 16, 2013 contained corrupted CONTIG lines.
The affected file was con_nc.1016.flat.gz:
ftp> pwd
257 "/genbank/daily-nc" is the current directory
ftp> dir con*1016*
227 Entering Passive Mode (130,14,29,30,201,94)
150 Opening ASCII mode data connection for file list
-r--r--r-- 1 ftp anonymous 372433 Oct 16 06:03 con_nc.1016.flat.gz
Here is a portion of one of the impacted records:
LOCUS HF677448 31135 bp DNA linear CON 14-OCT-2013
DEFINITION Clostridium difficile T5 genomic scaffold, 1852_175, whole genome
shotgun sequence.
ACCESSION HF677448 CAMB01000000
VERSION HF677448.1 GI:549396745
DBLINK BioProject: PRJEB188
....
FEATURES Location/Qualifiers
source 1..31135
/organism="Clostridium difficile T5"
/mol_type="genomic DNA"
/strain="T5"
/db_xref="taxon:1215059"
/note="1852_175"
CONTIG join(È#.1:1..31135)
//
A total of 27 records were affected. Their mangled CONTIG lines are:
CONTIG join(È#.1:1..31135)
CONTIG join(È#.1:1..74315)
CONTIG join(Ô,†â#36472546.1:1..1743)
CONTIG join(D#.1:1..18776)
CONTIG join(I from V034512:1..7477)
CONTIG join(,†Ô#16940.55:1..1610)
CONTIG join(NZ_ from EFR210166874.1:1..2732)
CONTIG join(Ö›#25900.1:1..1769)
CONTIG join(,†Ô#16940.55:1..8158)
CONTIG join(NZ_ from EFR210166874.1:1..56047)
CONTIG join(1,†éb#,†éc#1697416937.1:1..1641)
CONTIG join(†ë‚#É,†ëƒ#7,†ë„#.1:1..31477)
CONTIG join(÷#.1:1..29020)
CONTIG join(û,†ÚÍ#ý,†ÚÎ#$,†ÚÏ#523011802.1:1..1118)
CONTIG join(,†Ô#16940.55:1..9107)
CONTIG join(È#.1:1..22095)
CONTIG join(gi|549282866:1..7476)
CONTIG join(1,†éb#,†éc#1697416937.1:1..7155)
CONTIG join(I from V034512:1..23828)
CONTIG join(Ô,†â#36472546.1:1..1432)
CONTIG join(1,†éb#,†éc#1697416937.1:1..4334)
CONTIG join(û,†ÚÍ#ý,†ÚÎ#$,†ÚÏ#523011802.1:1..26661)
CONTIG join(ö#2066515663.1:1..29717)
CONTIG join(,†Ô#16940.55:1..42966)
CONTIG join(÷#.1:1..8802)
CONTIG join(È#.1:1..1625)
CONTIG join(1,†éb#,†éc#1697416937.1:1..628)
The ASN.1 version of the 1016 CON-division GIU was not affected.
An unstable system, unmonitored during the recent United States
government shutdown, was responsible for this problem. To address
it, we ensured that today's CON-division products contain all of
the records that had been present in the October 16th CON-division GIU:
-r--r--r-- 1 ftp anonymous 1515351 Oct 18 05:36 con_nc.1018.flat.gz
We have confirmed that all of the CONTIG lines in that file are correct.
Users may therefore safely skip con_nc.1016.flat.gz (if they have not
already processed it) and proceed with the 1017 and 1018 data products.
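For anyone who has already processed con_nc.1016.flat.gz and wants to
locate the affected records in a local copy, a check along the following
lines may help. It is only a sketch: the accession pattern is a heuristic
assumed for illustration (not the official GenBank grammar), the function
names are hypothetical, and it assumes every CONTIG value is wrapped in
join(...).

```python
import gzip
import re

# Heuristic only (an assumption, not the official GenBank grammar): each
# element of a CONTIG join() is a gap or an accession.version with a span.
_SPAN = r"(?:[A-Z]{2}_)?[A-Z]{1,6}\d{5,}\.\d+:\d+\.\.\d+"
_ELEMENT = re.compile(
    rf"{_SPAN}|complement\({_SPAN}\)|gap\((?:unk)?\d*\)"
)


def contig_looks_valid(contig_text):
    """Return True if every element of a CONTIG value looks well formed."""
    text = "".join(contig_text.split())  # drop spaces and line wraps
    if not (text.startswith("join(") and text.endswith(")")):
        return False
    elements = text[len("join("):-1].split(",")
    return all(_ELEMENT.fullmatch(element) for element in elements)


def suspect_records(path):
    """Yield (locus_name, contig_text) for records that fail the check."""
    locus = contig = None
    in_contig = False
    # latin-1 decodes any byte, so corrupted lines cannot raise an error.
    with gzip.open(path, "rt", encoding="latin-1") as handle:
        for line in handle:
            if line.startswith("//"):  # end of record
                if contig is not None and not contig_looks_valid(contig):
                    yield locus, " ".join(contig.split())
                locus = contig = None
                in_contig = False
            elif line.startswith("LOCUS"):
                locus = line.split()[1]
            elif line.startswith("CONTIG"):
                contig = line[len("CONTIG"):]
                in_contig = True
            elif in_contig and line[:1].isspace():
                contig += line  # wrapped continuation of the CONTIG value
            else:
                in_contig = False
```

Iterating over suspect_records("con_nc.1016.flat.gz") would then list the
locus name and CONTIG value of each record that fails the check. Because
the pattern is deliberately strict, any record it reports should be
inspected by hand before being discarded.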
To prevent problems for others who may not yet have obtained
the 1016 CON-division GIU products, we have just removed them
from our FTP site (both flatfile and ASN.1 versions).
We would like to thank GenBank users at Chemical Abstracts Service
(www.cas.org) for alerting us to this problem. We appreciate the
scrutiny of the GIU that our users provide, and we welcome problem
reports.
Our apologies for any inconvenience that this may have caused.
Mark Cavanaugh
GenBank
NCBI/NLM/NIH/HHS