Linux vs Unix for doing bioinformatics

Aaron J Mackey ajm6q at virginia.edu
Thu Mar 2 08:51:57 EST 2000


On 2 Mar 2000, Peter C. Tribble wrote:

> In article <6lp*iu1lo at news.chiark.greenend.org.uk>,
> 	Tim Cutts <timc at chiark.greenend.org.uk> writes:
> > 
> > Running BLAST on NFS-mounted directories is a bad idea; BLAST 2 memory
> > maps the file, and the performance of a mmap'ed file across NFS is
> > appalling.
> 
> Indeed? We run blast across nfs; once the file is pulled across into
> memory then the machines in the farm run flat out with no network
> traffic (until they need to get another database that isn't in memory).

This is also our experience, whether we use memory-mapped databases or not
(even without memory mapping, page buffers seem to keep much of the data
around, reducing the need to re-read the database on subsequent
executions).
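
For anyone who wants to check the page-buffer effect on their own setup,
here is a minimal sketch (not something from our actual pipeline) that
times several consecutive full reads of one file.  The names timed_read
and compare_reads are made up for illustration, and the path is whatever
database file you point it at.  On a typical system the first read of an
uncached file should be the slow one, but that depends on available
memory and on how the file system caches, so treat the numbers as a rough
indication only.

```python
import time


def timed_read(path, chunk_size=1 << 20):
    """Read the whole file sequentially; return (seconds, bytes_read)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return time.perf_counter() - start, total


def compare_reads(path, repeats=3):
    """Time `repeats` consecutive full reads of the same file.

    The first entry is the (possibly) cold read; later entries may be
    served from the operating system's page buffers.
    """
    return [timed_read(path) for _ in range(repeats)]


# Hypothetical usage (the path is a placeholder, not a real location):
#   for secs, nbytes in compare_reads("/path/to/database"):
#       print(f"{nbytes} bytes in {secs:.3f} s")
```

Running it once against an NFS path and once against a local copy gives
a quick comparison of cold and warm read times on a single node.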

When we run a farm-tasking implementation of fasta/ssearch/blast across
our cluster and use NFS-shared databases, the first database read is very
slow, to be sure.  But subsequent performance is remarkably fast (and I'm
considering hundreds of processors here, by the way ...).  I've also tried
copying the databases to each machine's local /tmp directory and running
all the searches there.  While it's true that the first search takes far
less time, the subsequent searches all run at about the same speed as
over NFS, so for a large "database vs. database" type of run there is not
much difference at all between NFS and locally stored databases.  In
fact, given all the copying time required to get the database onto each
of the machines, going that route may be a net loss, unless your database
is very static.
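
To make that trade-off concrete, here is a rough back-of-the-envelope
model, again only a sketch.  It assumes each node pays for one cold
search (the slow first read), then runs the remaining searches at a warm
speed that is the same for NFS and local disk, and that the local option
pays an extra one-time copy cost.  The function name total_time and every
number in the usage comment are hypothetical, not measurements from our
cluster.

```python
def total_time(cold_search, warm_search, n_searches, copy_time=0.0):
    """Estimated wall time on one node, in whatever unit the inputs use.

    cold_search: time of the first search, which includes the initial
                 database read
    warm_search: time of each later search, with the database cached
    n_searches:  number of searches run on this node (at least 1)
    copy_time:   one-time cost of copying the database to local disk
                 (0 when reading straight from NFS)
    """
    if n_searches < 1:
        raise ValueError("n_searches must be at least 1")
    return copy_time + cold_search + warm_search * (n_searches - 1)


# Hypothetical usage with invented timings (seconds):
#   nfs   = total_time(cold_search=300, warm_search=20, n_searches=1000)
#   local = total_time(cold_search=60, warm_search=20, n_searches=1000,
#                      copy_time=400)
# With many searches the warm term dominates both totals, and the copy
# cost can outweigh the faster first read.
```

Under these assumptions the local copy only pays off when the copy cost
is smaller than the time saved on the first read, which is more likely
when the same copy is reused across many runs.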

But this has little to do with Linux vs UNIX.

-Aaron

-- 
 o ~   ~   ~   ~   ~   ~  o
/ Aaron J Mackey           \
\  Dr. Pearson Laboratory  / 
 \ University of Virginia  \     
 /  (804) 924-2821          \
 \  amackey at virginia.edu    /
  o ~   ~   ~   ~   ~   ~  o

More information about the Bio-soft mailing list