
[Genbank-bb] GenBank Update Problem : 1125 : Incomplete WGS and TSA flatfiles

Cavanaugh, Mark (NIH/NLM/NCBI) [E] via genbankb@net.bio.net (by cavanaug from ncbi.nlm.nih.gov)
Tue Dec 1 16:25:45 EST 2015


Greetings GenBank Users,

Due to a software deployment problem, the GenBank flatfile versions
of 198 WGS projects and 1 TSA project that were made available in
NCBI's genbank/wgs and genbank/tsa FTP areas on November 25, 2015
lacked reference and comment data. In the example below, the DBLINK
lines are missing as well.

Here is an excerpt from one of the impacted records:

ftp> pwd
257 "/genbank/wgs" is the current directory

ftp> dir wgs.AOTC.1.gbff.gz
227 Entering Passive Mode (130,14,29,30,196,130)
150 Opening ASCII mode data connection for file list
-r--r--r--   1 ftp      anonymous  2235956 Nov 25 08:19 wgs.AOTC.1.gbff.gz

After gunzipping this file, the first record looks like this:

LOCUS       AOTC01000001             482 bp    DNA     linear   BCT 24-NOV-2015
DEFINITION  Escherichia coli str. Deng scaffold157-size482, whole genome
            shotgun sequence.
ACCESSION   AOTC01000001 AOTC01000000
VERSION     AOTC01000001.1  GI:954182075
KEYWORDS    WGS.
SOURCE      Escherichia coli str. Deng
  ORGANISM  Escherichia coli str. Deng
            Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;
            Enterobacteriaceae; Escherichia.
FEATURES             Location/Qualifiers
     source          1..482
                     /organism="Escherichia coli str. Deng"

But it should actually be:

LOCUS       AOTC01000001             482 bp    DNA     linear   BCT 24-NOV-2015
DEFINITION  Escherichia coli str. Deng scaffold157-size482, whole genome
            shotgun sequence.
ACCESSION   AOTC01000001 AOTC01000000
VERSION     AOTC01000001.1  GI:954182075
DBLINK      BioProject: PRJNA186398
            BioSample: SAMN04285713
KEYWORDS    WGS.
SOURCE      Escherichia coli str. Deng
  ORGANISM  Escherichia coli str. Deng
            Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;
            Enterobacteriaceae; Escherichia.
REFERENCE   1  (bases 1 to 482)
  AUTHORS   Yu,Z., Chen,Z., Ma,G., Zheng,J., Fan,X., Pan,W. and Zeng,Z.
  TITLE     Sequence and pathogenicity-island genes analysis of Escherichia
            coli Deng
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 482)
  AUTHORS   Yu,Z., Chen,Z., Ma,G., Zheng,J., Fan,X., Pan,W. and Zeng,Z.
  TITLE     Direct Submission
  JOURNAL   Submitted (18-FEB-2013) Department of Infectious Diseases, The
            Affiliated Shenzhen Nanshan Hospital of Guangdong Medical College,
            89 Taoyuan Road, Nanshan District, Shenzhen, Guangdong 518052,
            China
COMMENT     Bacteria and source DNA available from: Zhijian Yu or Qiwen Deng,
            Department of Infectious Disease, Nanshan Hospital, 89 Taoyuan Rd.,
            Nanshan District, Shenzhen, Guangdong, China.
            
            ##Genome-Assembly-Data-START##
            Assembly Method       :: Velvet v. 1.2.08
            Genome Coverage       :: 600.0x
            Sequencing Technology :: Illumina HiSeq
            ##Genome-Assembly-Data-END##
FEATURES             Location/Qualifiers
     source          1..482
                     /organism="Escherichia coli str. Deng"
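
For readers who keep a local copy of these files, the comparison above
suggests a simple check: flag every record that has no REFERENCE or
COMMENT block. The following is a minimal sketch only, not an NCBI tool.
The function names are hypothetical, and it assumes the usual flatfile
layout in which top-level keywords start in column 1 and records end
with a "//" line. A missing COMMENT is not an error in every GenBank
record, so treat the output as a hint rather than proof.

```python
import gzip

# Top-level blocks that were absent from the impacted flatfiles.
REQUIRED = ("REFERENCE", "COMMENT")


def _missing(seen, required):
    """Return the required keywords that were not seen in a record."""
    return [key for key in required if key not in seen]


def find_incomplete_records(lines, required=REQUIRED):
    """Yield (locus_name, missing_keywords) for each incomplete record."""
    locus = None
    seen = set()
    for line in lines:
        if line.startswith("LOCUS"):
            # A new record starts; report the previous one if it was
            # never closed by a "//" terminator.
            if locus is not None and _missing(seen, required):
                yield locus, _missing(seen, required)
            fields = line.split()
            locus = fields[1] if len(fields) > 1 else "(unnamed)"
            seen = set()
        elif line.startswith("//"):
            if locus is not None and _missing(seen, required):
                yield locus, _missing(seen, required)
            locus = None
        elif line[:1].isupper():
            # Top-level keywords are not indented; sub-keywords such as
            # "  ORGANISM" and sequence lines begin with spaces.
            seen.add(line.split()[0])
    # Handle a final record with no terminator (e.g. an excerpt).
    if locus is not None and _missing(seen, required):
        yield locus, _missing(seen, required)


def scan_flatfile(path):
    """Scan a gzip-compressed flatfile such as wgs.AOTC.1.gbff.gz."""
    with gzip.open(path, "rt", encoding="ascii", errors="replace") as handle:
        return list(find_incomplete_records(handle))
```

Calling scan_flatfile on a local copy of wgs.AOTC.1.gbff.gz would list
each record name together with the blocks it lacks. An empty list means
every record in the file has both blocks.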

A list of the impacted project codes is provided at the end of this
message. They can also be found in the project lists for the date in
question:

    genbank/wgs/proj_list.2015.1125
    genbank/tsa/tsa.proj_list.2015.1125

The ASN.1 data files for these projects were not impacted.
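
Since only the flatfiles were affected, anyone mirroring these FTP areas
would need to re-fetch just the .gbff.gz files for the listed project
codes. The sketch below shows one way to pick those files out of a
directory listing. It is an illustration under assumptions: the function
name is hypothetical, and the file name pattern is inferred from the
single example in this message (wgs.AOTC.1.gbff.gz) and assumed to apply
to TSA files with a "tsa." prefix.

```python
import re

# Assumed naming pattern: <wgs|tsa>.<4-letter project code>.<n>.gbff.gz
FLATFILE_RE = re.compile(r"^(wgs|tsa)\.([A-Z]{4})\.\d+\.gbff\.gz$")


def files_to_refetch(filenames, impacted_codes):
    """Return the flatfile names that belong to impacted projects."""
    impacted = set(impacted_codes)
    hits = []
    for name in filenames:
        match = FLATFILE_RE.match(name)
        # Keep only flatfiles whose project code is on the impacted list.
        if match and match.group(2) in impacted:
            hits.append(name)
    return sorted(hits)
```

Here filenames would come from a listing of the local mirror (for
example os.listdir) and impacted_codes from the project codes given in
this message.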

All 199 projects are being reprocessed now to fix this problem. We expect
the job to complete by early morning on December 2, 2015 (EST).

We would like to thank our colleagues at the DNA Data Bank of Japan (DDBJ)
for alerting us to this problem. We appreciate the scrutiny of the WGS
and TSA products that our users provide, and we welcome all problem
reports.

Our apologies for any inconvenience that this may have caused.

Mark Cavanaugh
GenBank
NCBI/NLM/NIH/HHS

Impacted project codes:

AOTC
ATNN
ATNO
ATNP
ATNQ
AXCY
BCFJ
CXPF
CXPH
CXPI
CXPJ
CXPK
CXPL
CXPM
CXPN
CXPO
CXPP
CXPQ
CXPR
CXPS
CXPT
CXPU
CXPV
CXPW
CXPX
CXPY
CXPZ
CXQA
CXQB
CXQC
CXQD
CXQE
CXQF
CXQG
CXQH
CXQI
CXQJ
CXQK
CXQL
CXQM
CXQN
CXQO
CXQP
CXQQ
CXQR
CXQS
CXQT
CXQU
CXQV
CXQW
CXQX
CXQY
CXQZ
CXRA
CXRB
CXRC
CXRD
CXRE
CXRF
CXRG
CXRH
CXRI
CXRJ
CXRK
CXRL
CXRM
CXRN
CXRO
CXRP
CXRQ
CXRR
CXRS
CXRT
CXRU
CXRV
CXRW
CXRX
CXRY
CXRZ
CXSA
CXSB
CXSC
CXSD
CXSE
CXSF
CXSG
CXSH
CXSI
CXSJ
CXSK
CXSL
CXSM
CXSN
CXSO
CXSP
CXSQ
CXSR
CXSS
CYSS
CYXF
GBUH
JNCC
JPWU
JPWV
JTDN
JYDH
JYDI
JYDJ
JYDK
JYDL
JYDM
JYDN
JYDO
JYDP
JYDQ
JYDR
JYDS
JYDT
JYDU
JYDV
JYDW
LAQT
LATH
LDAU
LDOG
LDOH
LDOI
LDOJ
LDOK
LDOL
LDOM
LDON
LDPK
LDPL
LEKH
LFGY
LFJN
LFXR
LFXS
LGAX
LGCC
LGCD
LGCI
LGCK
LGCL
LGCM
LGIR
LGIS
LGIT
LGIU
LGJE
LGKK
LGKN
LGKO
LGKP
LGSJ
LGSK
LGSL
LGSM
LGSN
LGSO
LGTR
LGTS
LGTT
LGTU
LGYP
LGYW
LHOX
LHOY
LHOZ
LHPA
LHPB
LHPC
LHQM
LIFT
LIFU
LIFV
LIFW
LIFX
LIFY
LIGE
LIGJ
LIHB
LIIM
LIRF
LITK
LIVN
LIXZ
LIYB
LJAS
LJAT
LJDF
LJFJ
LJJG
LJSC
LJSD
LJYW
LJZP
LKCW




