NCBI Data Repository CD-ROM

Scott Federhen federhen at
Thu Jun 11 10:46:22 EST 1992

The first release of the NCBI Data Repository CD-ROM is now available.
The NCBI Data Repository was established as a service for providing a
public distribution site for databases maintained by individual developers
or groups. The databases and software are not officially supported or
maintained by NCBI, nor does NCBI assume responsibility for the accuracy
or reliability of the data or software. Each data collection is solely 
the responsibility of the individual developer and the data is made 
available by NCBI 'as is'.

The Data Repository CD-ROMs are currently being distributed at no charge
on an experimental basis; a subscription service may be set up for future
releases. New releases are planned for every six months, with the next
release scheduled for October, 1992. The frequency of releases may be 
increased depending on the demand for the CD-ROMs and the updating 
frequency of the individual databases.

The Data Repository is also accessible over the Internet by anonymous
FTP to '' ( Under the directory 'repository',
each collection of data is stored in individual subdirectories and is
accompanied by README files for file descriptions and the names of 

Questions, suggestions, requests for copies of the CD-ROM, and
proposals for additions to the repository should be addressed to
'repository at', or:

			NCBI Data Repository
			National Library of Medicine
			Bldg. 38A, Rm 8N-803
			Bethesda, MD 20894
			Phone:  (301) 496-2475

Scott Federhen
Manager, NCBI Data Repository


tfd -  Transcription Factor Database.  A relational database of transcription
       factors maintained by David Ghosh (ghosh at, NCBI.
       Last update: Mar. 10, 1992.

ngdd - Normalized Gene Designation Database.  Normalized gene maps for E. coli,
       Salmonella, Bacillus subtilis, Pseudomonas aeruginosa, and Caulobacter
       crescentus from Yvon Abel and Robert Cedergren, University of Montreal.
       Last update: Jun. 25, 1990.

epd -  Eukaryotic Promoter Database. A collection of biologically functional,
       experimentally defined RNA POL II promoters active in higher eukaryotes.
       Maintained by Philipp Bucher (Philipp.Bucher at
       Last update: Apr. 3, 1992.

limb - LIsting of Molecular Biology databases. A collection of information
       about the content and maintenance of a large number of databases of
       interest to the molecular biology community. Maintained by
       Graham Redgrave (gwr at, Los Alamos National Laboratory.
       Last update: Mar. 26, 1991.

compound - A knowledge base of compounds involved in intermediary
       metabolism. Maintained by Peter Karp, SRI. (pkarp at
       Last update: Jan. 29, 1992.

metproto - A database of metabolic reactions, and associated DOS software.
       Maintained by Ray Ochs, Kansas State University. (rso2 at
       Last update: Apr. 24, 1992.

rebase - Restriction Enzyme Database. A collection of information about
       restriction enzymes, their cutting sites and commercial sources.
       Maintained by Richard Roberts, Cold Spring Harbor Laboratory.
       (roberts at
       Last update: Mar. 16, 1992.

prosite - An annotated database of protein sequence motifs. Maintained by
       Amos Bairoch, University of Geneva. (bairoch at
       Last update: Mar. 13, 1992.

enzyme - The Enzyme Data Bank, a database of information about enzymes,
       including names, catalytic activity, cofactors, and pointers to
       relevant entries in sequence databases. This directory also includes
       an ASN.1 encoding of the database. Maintained by Amos Bairoch,
       University of Geneva. (bairoch at
       Last update: Mar. 13, 1992.

eco -  An E. coli genomic database. This directory includes DOS and Mac
       software. Maintained by Kenn Rudd, NCBI. (rudd at
       Last update: Jan. 7, 1992.

flybase - The Drosophila Genetic Database, the genomic database for the
       fruit fly Drosophila melanogaster. Maintained by Michael Ashburner,
       (ma11 at
       Last update: Mar. 9, 1992.

acedb - A C. elegans Database, the genomic database for the nematode
       Caenorhabditis elegans. This directory includes software and 
       an installation script for running the system on several hardware
       platforms, including SPARCstations, DECstations, and SGIs.
       Maintained by Richard Durbin (rd at 
       and Jean Thierry-Mieg (mieg at frmop11.bitnet)
       Last update: Apr. 24, 1992.

kabat - A collection of sequences of immunological importance, including
       protein and nucleic acid sequences and alignments. Compiled by
       Elvin Kabat (kabat at Maintained by Harold Perry
       (hperry at
       Last update: Mar. 9, 1992.

aids-db - A collection of sequences related to the HIV family of viruses.
       Gerry Myers, LANL. (glm at
       Kersti MacInnes, LANL. (kam at
       Last update: Apr. 22, 1992.

carbbank - A PC-based database and software system which contains 
       information about the structure of complex carbohydrates. This
       includes the Complex Carbohydrate Structure Database (CCSD) and
       the CarbBank software system.
       Maintained by Dana Smith, Scott Doubet and Peter Albersheim.
       (CarbBank at UGA.bitnet) or (76424.1122 at
       Last update: Mar. 9, 1992.

blocks - A database of protein sequence homology blocks, constructed
       from SwissProt and PROSITE. Includes Unix and DOS software
       packages used to make the database. Maintained by Steven and
       Jorja Henikoff. (henikoff at
       Last update: Feb. 28, 1992.

t4phage - A genomic database for the T4 phage. Maintained by Elizabeth
       Kutter, University of Washington (t4phage at and
       David Batts, Evergreen State College (t4 at
       Last update: Mar. 25, 1992.

eco2dbase - The E. coli gene-protein database, which links information
       about E. coli genes and their protein spots on 2-D gels.
       Maintained by Frederick C. Neidhardt, University of Michigan.
       Last update: Mar. 10, 1992.

pkinases - A non-redundant annotated collection of protein kinase
       sequences. Maintained by Anne Marie Quinn, Salk Institute.
       (quinn at

rldb - The Reference Library DataBase, a collection of information 
       about the chromosomal locations of a set of publicly available
       DNA probes. Maintained by Guenther Zehetner, Imperial Cancer
       Research Fund, Genome Analysis Laboratory. (G_Zehetner at
       Last update: Apr. 23, 1992.

Scott federhen at
