"SynCron" - Tools for EMBL Database Update

Matteo diTommaso ditommaso at ebi.ac.uk
Mon Apr 8 23:39:21 EST 1996


Subj="SynCron" tools for maintaining synchronised copies of the EMBL 
Nucleotide Sequence Database  

Looking at the statistics for access to the EMBL Nucleotide Sequence Database 
update files on the EBI ftp site (ftp://ftp.ebi.ac.uk/pub/databases/embl/new/) 
we observe that many people download the full cumulative data file 
(cumulative.dat) rather than the daily update files.  In an attempt to make 
daily update files more useful and to provide a reliable mechanism for 
re-generating the cumulative.dat file locally from daily updates, the EBI and 
the Swiss EMBNet node have jointly developed a set of tools which can be used 
to fetch the daily updates and update the local database.

The programs make use of 'transaction listings' made available on the EBI ftp 
site. These transaction listings are now supplied with every update file and 
record each update, insert and delete operation applied to the EMBL 
Nucleotide Sequence Database, as represented in the flat-file updates.  
Transaction listings follow the same naming scheme as the daily, weekly, and 
cumulative update files, but with the extension ".lis".  The transaction 
listings are found in:

ftp://ftp.ebi.ac.uk/pub/databases/embl/new/list/

and look like:

Acc#   ID  Action DateStamp  Ver# Division
T58328 AA328 U 19951108232958 3    EST
T58329 AA329 C 19951108233007 3    EST
T58330 AA330 D 19951108233015 3    EST
R67977 AA977 U 19951108230600 3    EST
R67978 AA978 U 19951108230611 3    EST

where U=Update, C=Create, D=Delete
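
To illustrate the listing format shown above, here is a minimal Python sketch 
that reads such a listing into records.  It is not part of the SynCron tools; 
the names Transaction and parse_listing are hypothetical, and it assumes that 
fields are separated by whitespace and that DateStamp has the form 
YYYYMMDDhhmmss.

```python
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple


class Transaction(NamedTuple):
    """One row of a transaction listing (hypothetical record type)."""
    accession: str
    entry_id: str
    action: str          # "U" = Update, "C" = Create, "D" = Delete
    timestamp: datetime
    version: int
    division: str


def parse_listing(lines: Iterable[str]) -> Iterator[Transaction]:
    """Yield one Transaction per data row, skipping anything else."""
    for line in lines:
        fields = line.split()
        # Skip the header, blank lines and the legend: a data row has
        # exactly six fields and a one-letter action code in column three.
        if len(fields) != 6 or fields[2] not in {"U", "C", "D"}:
            continue
        accession, entry_id, action, stamp, version, division = fields
        yield Transaction(
            accession,
            entry_id,
            action,
            # Assumption: DateStamp is YYYYMMDDhhmmss.
            datetime.strptime(stamp, "%Y%m%d%H%M%S"),
            int(version),
            division,
        )


SAMPLE = """\
Acc#   ID  Action DateStamp  Ver# Division
T58328 AA328 U 19951108232958 3    EST
T58329 AA329 C 19951108233007 3    EST
T58330 AA330 D 19951108233015 3    EST
R67977 AA977 U 19951108230600 3    EST
R67978 AA978 U 19951108230611 3    EST
"""

transactions = list(parse_listing(SAMPLE.splitlines()))
deleted = [t.accession for t in transactions if t.action == "D"]
```

With the sample rows above, deleted holds the accession numbers of the 
entries that a local copy would need to drop.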

Using the tools it is possible to regenerate the cumulative.dat file (at a 
remote site) reliably from daily updates.  Validation of the new 
cumulative.dat file is also possible using the transaction listing provided at 
the EBI.
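
The idea of regenerating and then validating a local copy can be sketched as 
follows.  This is only an illustration under stated assumptions, not the 
method the SynCron programs actually use: the local database is modelled as a 
dictionary keyed by accession number, the function names are hypothetical, 
and it is assumed that the set of accession numbers expected in 
cumulative.dat can be derived from a listing.

```python
from typing import Dict, Iterable, Set, Tuple


def apply_transactions(
    local: Dict[str, str],
    rows: Iterable[Tuple[str, str]],
    new_entries: Dict[str, str],
) -> Dict[str, str]:
    """Return a copy of `local` with (accession, action) rows applied.

    `new_entries` holds the entry text taken from the update file for
    every accession that is updated ("U") or created ("C").
    """
    result = dict(local)
    for accession, action in rows:
        if action == "D":
            # Delete: drop the entry if it is present locally.
            result.pop(accession, None)
        elif action in ("U", "C"):
            # Update or create: take the entry text from the update file.
            result[accession] = new_entries[accession]
        else:
            raise ValueError(f"unknown action {action!r} for {accession}")
    return result


def compare_with_listing(
    local: Dict[str, str], expected: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (missing, extra) accessions relative to the expected set."""
    present = set(local)
    return expected - present, present - expected


# Hypothetical local state before the daily update.
local_db = {"T58328": "old entry text", "T58330": "entry to be deleted"}
rows = [("T58328", "U"), ("T58329", "C"), ("T58330", "D")]
update_file = {"T58328": "updated entry text", "T58329": "new entry text"}

updated_db = apply_transactions(local_db, rows, update_file)
missing, extra = compare_with_listing(updated_db, {"T58328", "T58329"})
```

If both missing and extra come back empty, the local copy holds the same set 
of entries as the expected set; a non-empty result points to an update that 
was skipped or only partly transferred.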

Using these programs it should be possible to keep a local copy of the EMBL 
Nucleotide Sequence Database that exactly matches the contents of the database 
in operation at the EBI for external services.  Manual intervention should be 
required only in the event of a failure, such as a failed network transfer of 
a file.

These programs are available by anonymous ftp from (the _002 version number 
will change as the programs are updated):

UNIX Version:
ftp://ftp.ebi.ac.uk/pub/software/unix/listtools/SynCron_002.tar.gz

VMS Version:
(backup/gzip)
ftp://ftp.ebi.ac.uk/pub/software/vms/listtools/SynCron_002.bck-gz
OR
(tar/compress)
ftp://ftp.ebi.ac.uk/pub/software/vms/listtools/SynCron_002.tar_Z


Matteo diTommaso
Database Programming Group
EMBL Outstation
The European Bioinformatics Institute
E-mail:   ditommaso at ebi.ac.uk  


Nicole Redaschi and Reinhard Doelz
Biozentrum - University of Basel
EMBnet Node Switzerland  
E-Mail: embnet at comp.bioz.unibas.ch






