Genbank FASTA server -> NCBI?

David Mathog mathog at seqvax.caltech.edu
Mon Sep 28 14:03:00 EST 1992


(Warning: I'm getting up on the soapbox for this one.  Don't take this
personally, Dennis; it's not aimed at you.)

Dennis Benson's reply to my post:

>:
>: Is the NCBI going to continue the Genbank Fasta e-mail server?
>: If yes, could someone please post the details?
>: If no, why not?
>
>  The primary reason is machine resources.  We have BLAST running
>  on a parallelized version of an 8-processor Silicon Graphics.
>  This provides very rapid turn-around for searches and allows us
>  to handle the daily load of between 1000 and 2000 searches per
>  day, including hundreds of BLASTX and TBLASTN searches which are
>  heavily CPU-intensive.  We simply would not have the capacity to
>  accommodate FASTA searches in addition to this machine load and
>  still provide reasonable response times.
>

Here we go again.  In the bad old days we ran FASTA locally on our
VAXstation.  Then we offloaded these jobs to the Genbank FASTA server.  This
had the following beneficial effects:

  1.  Eliminated at least half of the CPU load, and so greatly improved the
      response time for other applications.
  2.  Eliminated most of the disk IO load.
  3.  Provided access to current databases and eliminated the need to
      update our local database daily. (Same holds for the BLAST server)
  4.  Gave BETTER response time than running locally!
  5.  Freed up a Sparc1 that had been purchased solely to run FASTA jobs
      for ACEDB, AATDB, Jackson Lab's mouse database, etc.

I'd bet a lot of other sites out there can make similar statements.

Ex-Genbank-FASTA users need to replace this service, and soon (i.e.,
TOMORROW). Speaking as a US taxpayer, I'd say that it would make a bit more
sense for the NCBI to scrape up the money to get that FASTA server back on
line than it does for the NIH/NSF to spend many times that amount putting
new hardware in (50?) sites like mine!  I don't have the figures, but you
folks have a budget of what, >$1M/year? I just do NOT believe that the NCBI
does not have the resources to put together a FASTA server. 

I'm also _appalled_ that the Genbank contract did not require that the
FASTA server be maintained.  Of course, it fits the pattern.  As far as I 
can tell nobody in the NIH is minding the store when it comes to planning
for software/hardware costs in the mol. bio. community.  To wit:

> 
>  We also plan to make the sequence data, organized for sequence
>  similarity searching, available on a separate CD-ROM with 
>  the FASTA program included (thanks to assistance from Bill
>  Pearson) for PCs and Macs.  We expect to offer this through the Govt.
>  Printing Office by the end of the year.
> 

At first this CD-ROM sounds like it might be a good thing for an
independent lab to have.  Let's take a second to analyze this.  Imagine
only 100 labs go for this setup, and that they don't currently have
drives.  What would 100 CD-ROM drives cost (100 x $700 = $70000)?  This
won't be too useful unless the databases are kept current.  How much will
quarterly updates cost each year (100 x (?)$200 = $20000)?  Probably all of 
these people already have access to e-mail.  Seems to me that they would
be better served by a FASTA server - how much work are they going to get
done on those machines while FASTA is running?  Besides, $90000 for the 
first year and $20000/year thereafter is certainly more than enough
to cover the costs of the FASTA server. 
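
The arithmetic above can be spelled out as a small Python sketch.  The
figures are the ones guessed at in this post (100 labs, $700 per drive, and
the uncertain "(?)" $200 per lab per year for quarterly updates); the
function name and parameters are made up for illustration only.

```python
def cdrom_costs(labs=100, drive_price=700, yearly_updates=200):
    """Return (first_year, later_years) totals in dollars.

    All defaults are the rough guesses from the post, not real prices.
    """
    drives = labs * drive_price        # one-time purchase of CD-ROM drives
    updates = labs * yearly_updates    # a year's worth of quarterly updates
    return drives + updates, updates


first_year, later_years = cdrom_costs()
print(f"first year: ${first_year}, each later year: ${later_years}")
# prints: first year: $90000, each later year: $20000
```

Changing the number of labs or the guessed update price shows how quickly
the total moves relative to the cost of a single shared server.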

(Stepping down from the soapbox.)

As for the remaining servers, the two times that I tested the EMBL FASTA
server the turnaround time was measured in days.  Is this typical or did I
hit them at a bad time? I've not tried FLAT yet, but it looks like I'm going
to have to.

So, question for the managers of the EMBL/FLAT FASTA servers: can your
servers handle the extra load? 


Regards,

David Mathog
mathog at seqvax.caltech.edu
manager, sequence analysis facility, biology division, Caltech
