SANBI EST clustering benchmark dataset

winhide winhide at sanbi.ac.za
Fri Sep 3 05:12:06 EST 1999


SANBI is making available a dataset of masked ESTs suitable for
benchmarking. We are keen to evaluate the *hardware* performance of
clustering applications, and also the *clustering* performance and accuracy.
The dataset represents a randomly chosen set of human eye-expressed ESTs
that have been masked for repeats and vector sequences. The ESTs have not
yet been assigned to 'true' gene classes, because the assignment against
available genome data is not yet complete.

The dataset can be found at ftp.sanbi.ac.za/STACK/benchmarks/

The dataset is made available with the proviso that results of benchmarking
should be made broadly available. Our own results and a suggested format are
found in

ftp.sanbi.ac.za/STACK/benchmarks/README

Algorithmic benchmarks can be found at
ftp.sanbi.ac.za/STACK/benchmarks/ALGO_BENCH

Unfortunately, due to spamming, uploads to the FTP site are not possible.
Please email results, including clusters if possible, to info at sanbi.ac.za
and they will be posted.

Win Hide, Alan Christoffels, Andrey Ptitsyn and Antoine van Gelder








