From owner-genbankb@hgmp.mrc.ac.uk  Fri Mar  3 03:49:59 2000
Return-Path: <owner-genbankb@hgmp.mrc.ac.uk>
Received: by mercury.hgmp.mrc.ac.uk (Postfix, from userid 110)
	id A4C5E17B05; Fri,  3 Mar 2000 03:49:58 +0000 (GMT)
Received: by mercury.hgmp.mrc.ac.uk (Postfix, from userid 1)
	id 2B4D417AF0; Fri,  3 Mar 2000 03:49:56 +0000 (GMT)
To: genbank@net.bio.net
Newsgroups: bionet.molbio.genbank
Date: 3 Mar 2000 03:41:30 -0000
From: Mark Cavanaugh <cavanaug@lagrange.nlm.nih.gov>
Subject: GenBank Release 116.0 Available
Message-Id: <20000303034956.2B4D417AF0@mercury.hgmp.mrc.ac.uk>
Sender: owner-genbankb@hgmp.mrc.ac.uk
Precedence: bulk

  GenBank Release 116.0 is now available via ftp from the National Center
for Biotechnology Information:

  Ftp Site           Directory   Contents
  ----------------   ---------   ---------------------------------------
  ncbi.nlm.nih.gov   genbank     GenBank Release 116.0 flatfiles
                     ncbi-asn1   ASN.1 data used to create Release 116.0

  Uncompressed, the Release 116.0 flatfiles require roughly 21350 MB
(sequence files only) or 23300 MB (including the 'index' files). The
ASN.1 version requires roughly 17967 MB. From the release notes:

   Release  Date       Base Pairs   Entries

   115      Dec 1999   4653932745   5354511
   116      Feb 2000   5805414935   5691170

  In the nine-week period between close-of-data for GenBank 115.0 and
GenBank 116.0, GenBank grew by a record 1.151 billion basepairs.
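
  The growth figure can be checked against the two rows of the table above.
The following is only an illustrative sketch: the dictionary layout and the
variable names are ours and do not come from the release notes.

```python
# Totals copied from the release-notes table above.
releases = {
    "115": {"base_pairs": 4653932745, "entries": 5354511},
    "116": {"base_pairs": 5805414935, "entries": 5691170},
}

# Absolute growth between the two releases.
bp_growth = releases["116"]["base_pairs"] - releases["115"]["base_pairs"]
entry_growth = releases["116"]["entries"] - releases["115"]["entries"]

# Growth in base pairs relative to Release 115.
pct_growth = 100.0 * bp_growth / releases["115"]["base_pairs"]

print(f"Base pairs added: {bp_growth:,}")
print(f"Entries added:    {entry_growth:,}")
print(f"Relative growth:  {pct_growth:.1f}%")
```

  The base-pair difference is 1,151,482,190, which matches the 1.151 billion
quoted above.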

  Close-of-data was 02/18/2000. Thirteen days were required to prepare this
release. An unfortunate combination of events (production problems through
the second week of February, followed by machine and disk crashes) has
delayed 116.0 by two weeks. But the release date remains "February 15",
chiefly as a convenience for recording growth stats.

 PLEASE NOTE: The author-name index file (gbaut.idx) for GenBank 116.0 will
not be available until the middle of next week, due to software problems that
arose today. Rather than delay GB 116.0 even further, we are making the
release available without the index file. Our apologies for any inconvenience
that this causes.

 For additional release information, see the README files in either of the
directories mentioned above, and the release notes (gbrel.txt) in the
genbank directory. Sections 1.3 and 1.4 of the release notes (Changes in
Release 116.0 and Upcoming Changes) have been appended below.

  Release 116.0 data are currently available via NCBI's Entrez and Blast
servers, and the 'query' email server.

  New GenBank cumulative update files (gbcu.flat.Z and gbcu.aso.Z), containing
only those entries new/updated since the Release 116.0 close-of-data, should be
available by 6:00am EST, March 3. Please note that the new CUs will be
smaller than previous versions you might have obtained after Release 115.0 was
posted.

  If you encounter problems while ftp'ing or uncompressing Release 116.0,
please send email outlining your difficulties to info@ncbi.nlm.nih.gov .

Mark Cavanaugh
GenBank
NCBI/NLM/NIH


1.3 Important Changes in Release 116.0

1.3.1 Organizational changes

  Due to database growth, the EST division is now being split into forty-seven
pieces.

  Due to database growth, the GSS division is now being split into sixteen
pieces.

  Due to database growth, the HTG division is now being split into fourteen
pieces.

  Due to database growth, the PRI division is now being split into five pieces.

1.3.2 Replacement of organelle-related qualifiers with /organelle

  Starting with GenBank Release 116.0 (February 2000), the various organelle-related
qualifiers (/mitochondrion, /chromoplast, /chloroplast, /kinetoplast, etc.) have
been incorporated into a single new qualifier, with a controlled value format. The
definition of this qualifier is as follows:

Qualifier       /organelle=""

Definition      type of membrane-bound intracellular structure from
                which the sequence was obtained

Value format    mitochondrion, nucleomorph, plastid, mitochondrion:kinetoplast,
                plastid:chloroplast, plastid:apicoplast, plastid:chromoplast,
                plastid:cyanelle, plastid:leucoplast, plastid:proplastid

Examples        /organelle="mitochondrion"
                /organelle="nucleomorph"
                /organelle="plastid"
                /organelle="mitochondrion:kinetoplast"
                /organelle="plastid:chloroplast"
                /organelle="plastid:apicoplast"
                /organelle="plastid:chromoplast"
                /organelle="plastid:cyanelle"
                /organelle="plastid:leucoplast"
                /organelle="plastid:proplastid"

Comments        modifier text limited to values from controlled list
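
Software that reads or writes feature tables may want to check /organelle
values against the controlled list above. A minimal sketch follows. The
value list is copied from the definition; the mapping from the older
qualifiers to the new values is our assumption, inferred from the names
only, and the function names are hypothetical.

```python
# Controlled values for /organelle, copied from the definition above.
ORGANELLE_VALUES = frozenset({
    "mitochondrion",
    "nucleomorph",
    "plastid",
    "mitochondrion:kinetoplast",
    "plastid:chloroplast",
    "plastid:apicoplast",
    "plastid:chromoplast",
    "plastid:cyanelle",
    "plastid:leucoplast",
    "plastid:proplastid",
})

# ASSUMPTION: this legacy-to-new mapping is inferred from the qualifier
# names; the release notes do not spell it out.
LEGACY_TO_ORGANELLE = {
    "/mitochondrion": "mitochondrion",
    "/kinetoplast": "mitochondrion:kinetoplast",
    "/chloroplast": "plastid:chloroplast",
    "/chromoplast": "plastid:chromoplast",
}


def is_valid_organelle(value):
    """Return True if value is one of the controlled /organelle values."""
    return value in ORGANELLE_VALUES


def convert_legacy_qualifier(qualifier):
    """Return the /organelle form of a legacy qualifier, or None if unknown."""
    value = LEGACY_TO_ORGANELLE.get(qualifier.strip())
    if value is None:
        return None
    return f'/organelle="{value}"'


print(is_valid_organelle("plastid:apicoplast"))
print(convert_legacy_qualifier("/kinetoplast"))
```

A bare value such as "chloroplast" is rejected by this check, since the
controlled list only contains the prefixed form "plastid:chloroplast".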

Reminder: complete Feature Table documentation is available at this URL:

	http://www.ncbi.nlm.nih.gov/collab/FT/index.html

1.4 Upcoming Changes

1.4.1 Selenocysteine representation

  Selenocysteine residues within the protein translations of coding
region features have been represented in GenBank via the letter 'X'
and a /transl_except qualifier. At the May collaborative meeting, it
was learned that IUPAC plans to adopt the letter 'U' for selenocysteine.

  DDBJ, EMBL, and GenBank will thus use this new amino acid abbreviation
for their /translation qualifiers. Although a timetable for its appearance
has not been finalized, we are mentioning this now because the introduction
of a new residue abbreviation is a fairly fundamental change.

  If conversion efforts go smoothly, the new selenocysteine abbreviation
will appear within a few months. Details about the use of 'U' will be
announced via these release notes and the GenBank newsgroup as they become
available.

1.4.2 Mutation and Allele features to be discontinued

  Agreement was reached at the May 1999 collaborative DDBJ/EMBL/GenBank
meeting that the functionality provided by the variation, mutation, and
allele features could be represented by just a single feature, variation.
Submitters of sequence data are now being encouraged to use just the
variation feature. With GenBank Release 117.0, all existing mutation and
allele features will be converted to variation, and then mutation and
allele features will no longer be legal feature keys.
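
  Sites that keep local copies of flatfile data may want to see what this
conversion looks like on a feature-key line. The sketch below is only an
illustration of the idea, not the procedure NCBI will use. It assumes the
usual flatfile layout (five leading spaces, the feature key, then padding
so that the location starts in column 22); the function name is
hypothetical.

```python
import re

# ASSUMPTION: feature-key lines have five leading spaces, then the key,
# then spaces padding out to the location field in column 22.
_KEY_LINE = re.compile(r"^( {5})(mutation|allele)( +)(\S.*)$")


def convert_feature_line(line):
    """Rewrite a mutation/allele feature-key line to use 'variation'.

    Lines that are not mutation or allele key lines are returned unchanged.
    """
    body = line.rstrip("\n")
    match = _KEY_LINE.match(body)
    if match is None:
        return line
    ending = line[len(body):]  # keep the original line ending, if any
    indent, location = match.group(1), match.group(4)
    # Pad the new key to 16 characters so the location stays in column 22.
    return f"{indent}{'variation':<16}{location}{ending}"


print(convert_feature_line("     mutation        123"))
print(convert_feature_line("     allele          45..67"))
```

  Qualifier lines and other feature keys do not match the pattern and pass
through untouched.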

1.4.3 VRL division will be split into multiple files

  The viral GenBank division (gbvrl.seq) will soon be split into multiple
files, since its size is approaching 300MB. This is likely to occur by
GenBank Release 117.0 (April 2000). The resulting files for VRL will be:
gbvrl1.seq and gbvrl2.seq .

1.4.4 New REFERENCE type for on-line journals

  Agreement was reached at the May 1999 collaborative DDBJ/EMBL/GenBank
meeting that an effort should be made to accommodate references which are
published only on-line. Until specifications for such references are
available from library organizations, GenBank will present them in a manner
like this:

	REFERENCE   1  (bases 1 to 2858)
	  AUTHORS   Smith, J.
	  TITLE     Cloning and expression of a phospholipase gene
	  JOURNAL   Online Publication
	  REMARK    Online-Journal-name; Article Identifier; URL

  This format is still tentative; additional information about this new
reference type will be made available via these release notes.
---



- gttaacaattaaagagtgtttatcgaaattcattatatagtggtttatatagaccacttc
-
- GenBank newsgroup see: http://www.bio.net/hypermail/genbankb/       
- GENBANKB e-mail: messages sent to genbankb@net.bio.net
- subscribe: e-mail biosci-server@net.bio.net with: subscribe genbankb
- unsub: e-mail biosci-server@net.bio.net with: unsubscribe genbankb      
- GenBank on the WWW, see:  http://www.ncbi.nlm.nih.gov/Genbank/
- problems with GENBANKB? E-mail moderator: francis@cmmt.ubc.ca                  





From owner-genbankb@hgmp.mrc.ac.uk  Mon Mar  6 23:35:24 2000
Return-Path: <owner-genbankb@hgmp.mrc.ac.uk>
Received: by mercury.hgmp.mrc.ac.uk (Postfix, from userid 110)
	id D5DB717ACA; Mon,  6 Mar 2000 23:35:23 +0000 (GMT)
Received: by mercury.hgmp.mrc.ac.uk (Postfix, from userid 1)
	id 13EEB17AC1; Mon,  6 Mar 2000 23:35:21 +0000 (GMT)
To: genbank@net.bio.net
Newsgroups: bionet.molbio.genbank
Date: 6 Mar 2000 21:11:40 -0000
From: Mark Cavanaugh <cavanaug@lagrange.nlm.nih.gov>
Subject: GB 116.0 : Corrupt ASN.1 file : gss1.aso.Z
Message-Id: <20000306233521.13EEB17AC1@mercury.hgmp.mrc.ac.uk>
Sender: owner-genbankb@hgmp.mrc.ac.uk
Precedence: bulk

The compressed version of one of the ASN.1 files
used to build GenBank 116.0 was corrupted during
a disk crash.

No indication of failure was returned during the
file's compression; so it was assumed to be intact
and was installed on NCBI's ftp site last week.

The problem manifests itself if you run NCBI's ASN.1
syntax checker on the file, e.g.:

  zcat gss1.aso.Z | asntool -m /some/path/to/asn.all \
  -d stdin -t Bioseq-set

  [asntool] FATAL ERROR: stdinInput
  Bioseq-set.seq-set.E.seq.descr.E.<comment>
  Expected 00 after tag for comment
  [asntool] FATAL ERROR: stdinInput
  Bioseq-set.seq-set.E.seq.descr.E.<comment>
  Unable to match element in .E.

We replaced the corrupted gss1.aso.Z with a new
version at 4:10pm EST on Monday, March 6.

Our apologies for any inconvenience that this
might have caused.

Mark Cavanaugh
GenBank
NCBI/NLM/NIH







From owner-genbankb@hgmp.mrc.ac.uk  Thu Mar  9 00:31:08 2000
Return-Path: <owner-genbankb@hgmp.mrc.ac.uk>
Received: by mercury.hgmp.mrc.ac.uk (Postfix, from userid 110)
	id 5C75117B2C; Thu,  9 Mar 2000 00:31:07 +0000 (GMT)
Received: by mercury.hgmp.mrc.ac.uk (Postfix, from userid 1)
	id F40A417B24; Thu,  9 Mar 2000 00:31:04 +0000 (GMT)
To: genbank@net.bio.net
Newsgroups: bionet.molbio.genbank
Date: Wed, 8 Mar 2000 16:21:40 -0500 (EST)
From: Tatiana Tatusov <tatiana@azalea.nlm.nih.gov>
Subject: Chlamydia complete genomes
Message-Id: <20000309003104.F40A417B24@mercury.hgmp.mrc.ac.uk>
Sender: owner-genbankb@hgmp.mrc.ac.uk
Precedence: bulk

Complete genomes of Chlamydia muridarum (Chlamydia trachomatis MoPn) and 
Chlamydophila pneumoniae AR39 (synonym: Chlamydia pneumoniae AR39)
have been published.

  AUTHORS   Read,T.D., Brunham,R.C., Shen,C., Gill,S.R., Heidelberg,J.F.,
            White,O., Hickey,E.K., Peterson,J., Utterback,T., Berry,K.,
            Bass,S., Linher,K., Weidman,J., Khouri,H., Craven,B., Bowman,C.,
            Dodson,R., Gwinn,M., Nelson,W., DeBoy,R., Kolonay,J., McClarty,G.,
            Salzberg,S.L., Eisen,J. and Fraser,C.M.
  TITLE     Genome sequences of Chlamydia trachomatis MoPn and C. pneumoniae
            AR39
  JOURNAL   Nucleic Acids Res. 28, 1397-1406 (2000)


Sequence data can be found at NCBI ftp site:

ftp://ncbi.nlm.nih.gov/genbank/genomes/bacteria/CtraM/
ftp://ncbi.nlm.nih.gov/genbank/genomes/bacteria/CpneuA


---------------------
Tatiana Tatusova, PhD
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health
Bethesda, MD 20894, USA
Voice: (301)435-5756
Fax: (301)480-9241
email tatiana@ncbi.nlm.nih.gov








From owner-genbankb@hgmp.mrc.ac.uk  Fri Mar 31 22:56:27 2000
Return-Path: <owner-genbankb@hgmp.mrc.ac.uk>
Received: by mercury.hgmp.mrc.ac.uk (Postfix, from userid 110)
	id 870E417B2A; Fri, 31 Mar 2000 22:56:26 +0100 (BST)
Received: by mercury.hgmp.mrc.ac.uk (Postfix, from userid 1)
	id 6777F17B21; Fri, 31 Mar 2000 22:56:25 +0100 (BST)
To: genbank@net.bio.net
Newsgroups: bionet.molbio.genbank
Date: 31 Mar 2000 17:55:49 +0100
From: Mark Cavanaugh <cavanaug@lagrange.nlm.nih.gov>
Subject: GenBank : Error in GbUpdate file nc0331.flat
Message-Id: <20000331215625.6777F17B21@mercury.hgmp.mrc.ac.uk>
Sender: owner-genbankb@hgmp.mrc.ac.uk
Precedence: bulk

Greetings GenBank Users,

GenBank Update file nc0331.flat.Z, installed on NCBI's ftp
server at 03:34am (EST) on March 31, 2000, contained a
truncated record:

LOCUS       AP001552   150120 bp    DNA             PLN       30-MAR-2000
DEFINITION  Oryza sativa geneomic DNA, chromosome 1, PAC clone:P0029D06.
ACCESSION   AP001552
VERSION     AP001552.1  GI:7363267
....
    79861 gtaaaatgag aaaacaaaaa tatattgctt ctgtttcgtt tgactttttt cttagtcaat
    79921 gttttttaga tttgactaag tttatagaaa caatggcaat atttaaaaca ctaaLOCUS       
AC022333  
 173907 bp    DNA             PRI       30-MAR-2000
DEFINITION  Homo sapiens Chr3 NOVECTOR RP11-64O13 () complete sequence.
ACCESSION   AC022333


This problem has been fixed, and a new patched version of
nc0331.flat.Z was installed at 11:42am (EST) on March 31.

The ASN.1 version of the update (nc0331.aso.Z) was not affected.

If your processing of this update was incomplete due to the truncated
record, we suggest that you obtain the new version of the file and
re-process.
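
A truncation of this kind can be spotted before processing. The sketch
below is one possible check, not an NCBI tool. It relies on two properties
of the flatfile format: each record starts with a LOCUS line and ends with
a line containing only "//". The function name and the wording of the
reported reasons are ours.

```python
import re

# A sequence line: optional spaces, a base position, then lowercase bases.
_SEQ_LINE = re.compile(r"^\s*\d+ [a-z]")


def find_truncated_records(lines):
    """Return (line_number, reason) pairs for signs of a truncated record."""
    problems = []
    in_record = False
    last_number = 0
    for last_number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line.startswith("LOCUS"):
            if in_record:
                # A new record began before the previous one was closed.
                problems.append((last_number, "LOCUS before '//' terminator"))
            in_record = True
        elif line.strip() == "//":
            in_record = False
        elif "LOCUS" in line and _SEQ_LINE.match(line):
            # The case shown above: LOCUS fused onto sequence data.
            problems.append((last_number, "LOCUS fused onto a sequence line"))
    if in_record:
        problems.append((last_number, "file ends without '//' terminator"))
    return problems


sample = [
    "LOCUS       AP001552   150120 bp    DNA             PLN       30-MAR-2000\n",
    "    79921 gttttttaga tttgactaag ctaaLOCUS       \n",
    "AC022333  \n",
    "//\n",
]
print(find_truncated_records(sample))
```

To check a real update, uncompress the file and pass the open file object
to find_truncated_records(); an empty list means neither symptom was seen.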

Our apologies for any inconvenience that this data error has caused.

Mark Cavanaugh
GenBank
NCBI/NLM/NIH







