Performance comparisons of blast and fasta programs?
wrp at cyclops.micr.Virginia.EDU
Tue Apr 13 08:00:32 EST 1993
In article <1993Apr13.110138.5033 at gserv1.dl.ac.uk> tekaia at pasteur.fr (Fredj Tekaia) writes:
>I am interested in comparing the performance of the two program families,
>BLAST and FASTA, in terms of speed of execution and accuracy
>of the results.
There is a brief discussion of the relative performance of
BLAST and FASTA in Pearson, WR (1991) "Searching Protein
Sequence Libraries: Comparison of the Sensitivity and Selectivity of
the Smith-Waterman and FASTA Algorithms" Genomics 11:635-650.
Since then, I have done more extensive comparisons and am writing
up the results. A brief summary:
BLASTP performs about as well as FASTA would with a ktup between
1 and 1.5 (a setting that does not exist, since ktup takes only integer
values). It is better than FASTA with ktup=2, but not as good as
Smith-Waterman. FASTA with -o and either ktup=2 or ktup=1
performs better than BLASTP, and as well as Smith-Waterman. (-o tells
FASTA to calculate an "optimized" score, a score that allows gaps, for
every sequence in the database.)
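To illustrate what "a score with gaps" means, here is a minimal
Smith-Waterman local alignment score in Python. It is only a sketch: the
function name, the simple match/mismatch scoring, and the linear gap
penalty are assumptions made for illustration. Real protein searches use
a substitution matrix, and this is not FASTA's actual optimized-score
calculation.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative defaults, not those of any real program.
    `gap` is a (negative) penalty added for each inserted or deleted residue.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Two sequences that differ by a single inserted residue.
gapped = smith_waterman_score("ACGTACGT", "ACGTTACGT")
# A prohibitive gap penalty effectively forbids gaps.
ungapped = smith_waterman_score("ACGTACGT", "ACGTTACGT", gap=-100)
```

Comparing `gapped` with `ungapped` shows how allowing gaps lets a single
alignment span an insertion that would otherwise split the match in two.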
>In fact, I would like to know, when comparing a test sequence to a database,
>whether it is best to use:
>a) the FASTA programs, b) the BLAST programs, or c) both?
I would run BLAST first. If BLAST fails to find any
significant matches, I would run FASTA with -o and ktup=1. This is
substantially slower than BLAST, but will not miss anything.
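The two-stage strategy above can be sketched in Python as follows. This
is a hedged illustration only: `search_with_fallback`, `fast_search`,
`sensitive_search`, and `is_significant` are hypothetical names, and the
two search callables stand in for however BLAST and FASTA (with -o and
ktup=1) are actually invoked at a given site. No real command-line
interface is assumed.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical hit shape: (sequence identifier, score).
Hit = Tuple[str, float]


def search_with_fallback(
    query: str,
    fast_search: Callable[[str], Iterable[Hit]],
    sensitive_search: Callable[[str], Iterable[Hit]],
    is_significant: Callable[[Hit], bool],
) -> Tuple[str, List[Hit]]:
    """Run the fast search first; fall back to the slower, more sensitive
    search only if the fast one finds no significant matches.

    Returns a label naming the stage that produced the hits, plus the hits.
    """
    fast_hits = [hit for hit in fast_search(query) if is_significant(hit)]
    if fast_hits:
        return "fast", fast_hits

    # Nothing significant from the fast search, so pay for the slow one.
    sensitive_hits = [hit for hit in sensitive_search(query) if is_significant(hit)]
    return "sensitive", sensitive_hits
```

The significance test is passed in as a function because the post does
not specify a threshold; any cutoff used with this sketch is the
caller's assumption.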
At the moment, the main advantage of BLAST is that you can run
it on the NCBI non-redundant sequence database. When this database
becomes generally available, it should be used for FASTA searching
with the -o option.