[Genbank-bb] GenBank Release 199.0 Problem : Three records with corrupted CONTIG lines

Cavanaugh, Mark (NIH/NLM/NCBI) [E] via genbankb@net.bio.net (by cavanaug from ncbi.nlm.nih.gov)
Wed Dec 18 14:22:09 EST 2013

Greetings GenBank Users,

The gbcon229.seq.gz and gbcon230.seq.gz files of GenBank Release 199.0
contained a total of three records with corrupted CONTIG-line contents:

con229 : KI543240
con230 : KI629878 and KI629953

The original sizes and timestamps of the affected files were:

-r--r--r--   1 ftp      anonymous 20119743 Dec 13 21:13 gbcon229.seq.gz
-r--r--r--   1 ftp      anonymous  9763733 Dec 13 21:13 gbcon230.seq.gz

Here are selected portions of the records which demonstrate the nature
of the problem:

LOCUS       KI543240                5276 bp    DNA     linear   CON 14-NOV-2013
DEFINITION  Thalassobacillus devorans MSP14 genomic scaffold scaffold00007,
            whole genome shotgun sequence.
ACCESSION   KI543240 AWXW01000000
VERSION     KI543240.1  GI:557884536
CONTIG      join(I from V034512:1..1596,gap(720),<F6>#2066515663.1:1..2960)

LOCUS       KI629878               42039 bp    DNA     linear   CON 04-DEC-2013
DEFINITION  Porphyromonas gingivalis SJD2 genomic scaffold scaffold15, whole
            genome shotgun sequence.
ACCESSION   KI629878 ASYL01000000
VERSION     KI629878.1  GI:563396947
CONTIG      join(gi|563396143:1..42039)

LOCUS       KI629953                4352 bp    DNA     linear   CON 04-DEC-2013
DEFINITION  Porphyromonas gingivalis SJD2 genomic scaffold scaffold83, whole
            genome shotgun sequence.
ACCESSION   KI629953 ASYL01000000
VERSION     KI629953.1  GI:563396869
CONTIG      join(NZ_LPVH116484131.1:1..4352)
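
For users who want to screen their own copies for similar damage, here is
a minimal Python sketch. It is not the check used by NCBI or by the users
who reported the problem. The pattern for a well-formed component
(ACCESSION.VERSION:START..END, optionally inside complement(), or a gap()
element) is an assumption made for illustration, and the function name
suspicious_components is hypothetical.

```python
import re

# ASSUMPTION: a well-formed CONTIG component in a GenBank (non-RefSeq) release
# looks like ACCESSION.VERSION:START..END, possibly wrapped in complement(...),
# or is a gap(...) element. This pattern is illustrative, not an official grammar.
_ACC = r"[A-Z]{1,6}\d{5,10}\.\d+"
_SPAN = rf"{_ACC}:\d+\.\.\d+"
_COMPONENT = re.compile(
    rf"^(?:{_SPAN}|complement\({_SPAN}\)|gap\((?:unk)?\d*\))$"
)


def suspicious_components(contig_value):
    """Return the join() components that do not match the assumed pattern.

    contig_value is the text that follows the CONTIG keyword, with any
    continuation lines already joined into one string.
    """
    value = contig_value.strip()
    if value.startswith("join(") and value.endswith(")"):
        value = value[len("join("):-1]
    # Components are comma-separated; none of the assumed forms contain commas.
    return [part for part in value.split(",") if not _COMPONENT.match(part.strip())]


if __name__ == "__main__":
    # The three CONTIG values quoted in this announcement.
    samples = {
        "KI543240": "join(I from V034512:1..1596,gap(720),<F6>#2066515663.1:1..2960)",
        "KI629878": "join(gi|563396143:1..42039)",
        "KI629953": "join(NZ_LPVH116484131.1:1..4352)",
    }
    for accession, contig in samples.items():
        print(accession, suspicious_components(contig))
```

Under these assumptions, each of the three records above yields at least one
flagged component, while gap(720) is accepted. A stricter or looser pattern
may be appropriate for other data sets.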

As of approximately 2:08 pm EST on Wednesday, December 18, 2013, the release
files containing these records were patched and reinstalled at the NCBI FTP
site:

-r--r--r--   1 ftp      anonymous 20119739 Dec 18 19:08 gbcon229.seq.gz
-r--r--r--   1 ftp      anonymous  9763701 Dec 18 19:08 gbcon230.seq.gz
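
To tell which copy of a file is on local disk, the byte sizes listed in this
announcement can be compared against the downloaded files. The following is
a minimal Python sketch; the function names classify_size and classify_file
are hypothetical. A matching size is only a quick indication, not proof of
identical content, and no checksums are given in this announcement.

```python
import os

# Byte sizes taken from the directory listings in this announcement.
ORIGINAL_SIZES = {
    "gbcon229.seq.gz": 20119743,
    "gbcon230.seq.gz": 9763733,
}
PATCHED_SIZES = {
    "gbcon229.seq.gz": 20119739,
    "gbcon230.seq.gz": 9763701,
}


def classify_size(name, size):
    """Return 'patched', 'original', or 'unknown' for a file name and byte size."""
    if size == PATCHED_SIZES.get(name):
        return "patched"
    if size == ORIGINAL_SIZES.get(name):
        return "original"
    return "unknown"


def classify_file(path):
    """Classify a local file by its base name and its size on disk."""
    return classify_size(os.path.basename(path), os.path.getsize(path))
```

For example, classify_file("gbcon229.seq.gz") returns "original" for a copy
downloaded before the patch, in which case the file should be fetched again.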

The ASN.1 version of GenBank 199.0 was not affected.

The cause of this problem is the same as the one which impacted the contents
of the October 16th 2013 CON-division GenBank Incremental Update. Please
refer to the GenBank newsgroup post of October 18th for further details.

As was the case for the incremental update problem, we once again have
several users at Chemical Abstracts Service (www.cas.org) to thank for
detecting the problem in the GenBank 199.0 release files. We appreciate the
scrutiny of GenBank data products that our users provide, and we welcome
problem reports.

Our apologies for any inconvenience that this may have caused.

Mark Cavanaugh
