We use GenBank as our primary nucleic acid sequence database, but also maintain
a local copy of those entries in EMBL whose primary accession numbers do not
appear in GenBank. This was about 10,000 entries as of EMBL release 34, and is,
I gather, dropping as GenBank, EMBL, and DDBJ sort out the backlog. Nonetheless,
as long as there remain 'unique' EMBL entries, we would like to keep them
available to local users. Until this summer, we were grateful users of Mike
Cherry's ftp site in the US to obtain the 'unique' dataset, but that has been
closed due to concerns about the network load generated by the transatlantic
transfers.
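
To make the 'unique' set concrete, here is a minimal Python sketch of the
comparison; it is not the procedure actually used to build the dataset. It
assumes standard flatfile layouts (EMBL `AC` lines with semicolon-separated
accessions, GenBank `ACCESSION` lines, `//` ending each entry), treats the
first accession listed in an entry as the primary one, and checks it only
against GenBank primary accessions. The function names are hypothetical.

```python
def embl_primary_accessions(lines):
    """Yield the first accession on the first AC line of each EMBL entry."""
    found = False
    for line in lines:
        if line.startswith("//"):          # end of entry: reset for the next one
            found = False
        elif line.startswith("AC") and not found:
            fields = line[2:].replace(";", " ").split()
            if fields:
                found = True
                yield fields[0]


def genbank_primary_accessions(lines):
    """Yield the first accession on the ACCESSION line of each GenBank entry."""
    found = False
    for line in lines:
        if line.startswith("//"):
            found = False
        elif line.startswith("ACCESSION") and not found:
            fields = line.split()[1:]
            if fields:
                found = True
                yield fields[0]


def unique_embl(embl_lines, genbank_lines):
    """Return EMBL primary accessions absent from the GenBank primary set.

    Assumption: only primary accessions are compared on both sides.
    """
    genbank = set(genbank_primary_accessions(genbank_lines))
    return [ac for ac in embl_primary_accessions(embl_lines) if ac not in genbank]


if __name__ == "__main__":
    # Tiny made-up entries, for illustration only.
    embl = ["ID   A\n", "AC   X00001; X00009;\n", "//\n",
            "ID   B\n", "AC   X00002;\n", "//\n"]
    genbank = ["LOCUS       A\n", "ACCESSION   X00001\n", "//\n"]
    print(unique_embl(embl, genbank))
```

With real releases one would pass open file handles rather than lists, and
then extract the full entries for the accessions returned.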
I am coming, then, hat in hand, to ask whether anyone has this dataset
available. I would be happy to mirror it for anonymous ftp here, and don't see
any need for incremental updates, since I expect very few of these 'unique'
entries are corrected or altered without a corresponding addition to GenBank.
Since I'm using this via the GCG package, I can take the dataset as flatfile or
in GCG format.
Many thanks to all for their consideration and advice.
Regards,
Charles Bailey
!-------------------------------------------------------------------------------
! Dept. of Genetics / Howard Hughes Medical Institute
! University of Pennsylvania School of Medicine Rm. 430 Clinical Research Bldg.
! 422 Curie Blvd. Philadelphia, PA 19104 USA Tel. (215) 898-1699
! Internet: bailey at genetics.upenn.edu (IN 128.91.200.37)
!-------------------------------------------------------------------------------