Performance comparisons of blast and fasta programs?

Bill Pearson wrp at cyclops.micr.Virginia.EDU
Tue Apr 13 08:00:32 EST 1993


In article <1993Apr13.110138.5033 at gserv1.dl.ac.uk> tekaia at pasteur.fr (Fredj Tekaia) writes:

>I am interested in comparing the performance of the two program families,
>BLAST and FASTA, in terms of execution speed and accuracy
>of the results.

	There is a brief discussion of the relative performance of
BLAST and FASTA in Pearson, WR (1991) "Searching Protein
Sequence Libraries: Comparison of the Sensitivity and Selectivity of
the Smith-Waterman and FASTA Algorithms", Genomics 11:635-650.

	Since then, I have done more extensive comparisons and am writing
up the results.  A brief summary:

	BLASTP performs about as well as FASTA would with a ktup
between 1 and 1.5 (a fractional ktup does not exist).  It is better
than FASTA with ktup=2.  It is not as good as Smith-Waterman.  FASTA
with -o and either ktup=2 or ktup=1 performs better than BLASTP, and
as well as Smith-Waterman. (-o tells FASTA to calculate an "optimized"
score - a score with gaps - for every sequence in the database.)

>In fact, I would like to know, when comparing a test sequence to a database,
>whether it is best to use:
>a) the FASTA programs, b) the BLAST programs, or c) both?

	I would run BLAST first.  If BLAST fails to find any
significant matches, I would run FASTA with -o and ktup=1.  This is
substantially slower than BLAST, but will not miss anything
significant.
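
	The two-step strategy above can be sketched as a small piece of
Python. This is only an illustration of the decision logic, not a
wrapper around the real programs: run_blast, run_fasta_optimized and
is_significant are hypothetical callables that the caller must supply
(for example, functions that invoke BLAST, or FASTA with -o and ktup=1,
and parse the output), and the (identifier, score) shape of a hit is
likewise an assumption.

```python
from typing import Callable, List, Tuple

# Hypothetical hit representation: (sequence identifier, score).
Hit = Tuple[str, float]


def search_with_fallback(
    query: str,
    run_blast: Callable[[str], List[Hit]],
    run_fasta_optimized: Callable[[str], List[Hit]],
    is_significant: Callable[[Hit], bool],
) -> Tuple[str, List[Hit]]:
    """Run the fast search first; fall back to the slower, more
    sensitive search only if nothing significant was found.

    All three callables are assumptions supplied by the caller:
    run_blast stands in for a BLAST search, run_fasta_optimized for a
    FASTA search with -o and ktup=1, and is_significant for whatever
    significance criterion is being applied.
    """
    blast_hits = [hit for hit in run_blast(query) if is_significant(hit)]
    if blast_hits:
        # BLAST found something significant, so the slow search is skipped.
        return "blast", blast_hits

    # No significant BLAST match: pay for the slower, more sensitive search.
    fasta_hits = [hit for hit in run_fasta_optimized(query) if is_significant(hit)]
    return "fasta", fasta_hits
```

	The slower search is only ever called when the fast one comes back
empty, which is the point of running BLAST first.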

	At the moment, the main advantage of BLAST is that you can run
it on the NCBI non-redundant sequence database.  When this database
becomes generally available, it should be used for FASTA searching
with the -o option.

Bill Pearson



