ANNOUNCING-SCANPS (Scan Protein Sequence)

G.J. Barton mbgjb at
Thu Sep 1 05:18:48 EST 1994

Announcing SCANPS

SCANPS (pronounced Scan-P-S) stands for SCAN Protein Sequence.  The
main function of SCANPS is to use a rigorous local alignment method to
search protein sequence databases with a query sequence or multiple
alignment.  SCANPS is fast enough to use on an ordinary workstation
(Barton, 1992).

SCANPS also allows all pairwise comparisons to be made between a set
of sequences and can estimate the statistical significance of the
alignments.  SCANPS has been used in the analysis of many protein
families.  For example, it was used to discover the similarity between
PD-ECGF (Platelet-Derived Endothelial Cell Growth Factor) and TP
(Thymidine Phosphorylase) (Barton et al., 1992).  The program was also
used to find the similarity between E. coli diadenosine
tetra-phosphatase and the protein Ser/Thr phosphatases (Barton et al.,
1994).

Principal features of SCANPS

Efficient finding of Nearly-ALL local alignments (the NALL method)
(Barton, 1993) that score above a cutoff or probability threshold,
between the query sequence and a database.  This means that if two
proteins have more than one common region, most regions are reported.
Effectively, this is similar to BLAST (Altschul et al., 1990) but with
gapped alignments.

Efficient implementation of the Smith-Waterman Algorithm - this
returns the highest scoring local alignment between two sequences
including gaps where necessary.  The program is approximately a factor
of three faster than ssearch.
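
As a reminder of what the Smith-Waterman recurrence computes, here is a
minimal sketch in Python.  It is not the SCANPS implementation: the
function name, the simple match/mismatch scores and the linear gap
penalty are assumptions made for brevity, whereas a real protein search
would normally use an amino acid substitution matrix.  It returns only
the best local score, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring only (assumed values): +2 for a match,
    -1 for a mismatch, -2 for each gap position.
    """
    prev = [0] * (len(b) + 1)          # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                     # start a new local alignment here
                prev[j - 1] + s,       # extend along the diagonal
                prev[j] + gap,         # gap in b
                curr[j - 1] + gap,     # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Identical 4-residue peptides score 4 * 2 = 8; one gap costs 2.
print(smith_waterman_score("ACDE", "ACDE"))      # 8
print(smith_waterman_score("ACDEFG", "ACDFG"))   # 8 (five matches, one gap)
```

Keeping only two rows of the matrix is enough when just the score is
needed; recovering the alignment requires storing traceback
information as well.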

Estimation of the significance of the local alignments.  An empirical
method is used which takes into account the alignment score and the
alignment length.  This has the effect of pushing unusually
high-scoring but short alignments higher up the hit list.
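
The formula SCANPS uses is not given in this announcement, so the
following Python sketch only illustrates the general idea of letting
alignment length influence the ranking.  It assumes (my assumption, not
the SCANPS method) that the expected score grows linearly with
alignment length, fits that trend to the hit list by least squares, and
ranks each hit by how many standard deviations its score lies above the
trend.  The function name and the example data are hypothetical.

```python
def length_adjusted_ranking(hits):
    """Rank hits by how far each score lies above a score-vs-length trend.

    hits: list of (name, score, alignment_length) tuples.
    Returns (name, z) tuples, most unusual hit first.
    """
    if not hits:
        return []
    n = len(hits)
    lengths = [h[2] for h in hits]
    scores = [h[1] for h in hits]
    mean_len = sum(lengths) / n
    mean_score = sum(scores) / n
    # Ordinary least-squares fit of score against length.
    sxx = sum((x - mean_len) ** 2 for x in lengths)
    sxy = sum((x - mean_len) * (y - mean_score)
              for x, y in zip(lengths, scores))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_score - slope * mean_len
    residuals = [y - (intercept + slope * x)
                 for x, y in zip(lengths, scores)]
    # Standard deviation of the residuals; fall back to 1.0 if it is zero.
    sd = (sum(r * r for r in residuals) / n) ** 0.5 or 1.0
    ranked = [(h[0], r / sd) for h, r in zip(hits, residuals)]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Made-up hit list: "s" is short but scores well for its length.
hits = [("u1", 30, 30), ("u2", 40, 40), ("u3", 50, 50),
        ("u4", 60, 60), ("s", 55, 25)]
for name, z in length_adjusted_ranking(hits):
    print(name, round(z, 2))
```

In this toy list "u4" has the highest raw score, but "s" comes first
after the adjustment because its score is well above what the trend
predicts for so short an alignment.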

Comparison of all pairs of sequences in a set using either the
Smith-Waterman or the NALL method.
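
An all-against-all comparison of n sequences needs n(n-1)/2 pairwise
comparisons.  The Python sketch below shows that loop structure only.
The function names are hypothetical, and the scoring function is a toy
stand-in (length of the longest exact common substring), not the
Smith-Waterman or NALL method; any pairwise scorer could be passed in
its place.

```python
from itertools import combinations


def all_pairs(sequences, score_fn):
    """Score every unordered pair once.

    sequences: dict mapping name -> sequence string.
    Returns (score, name_a, name_b) tuples, best score first.
    """
    results = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(sequences.items(), 2):
        results.append((score_fn(seq_a, seq_b), name_a, name_b))
    results.sort(key=lambda r: r[0], reverse=True)
    return results


def longest_common_substring(a, b):
    """Toy stand-in scorer: length of the longest ungapped exact match."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0] * (len(b) + 1)
        for j, cb in enumerate(b, start=1):
            if ch == cb:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# Made-up sequences: p1 and p2 share the segment "TAYIAK".
seqs = {"p1": "MKTAYIAK", "p2": "TAYIAKQR", "p3": "GGGGGGGG"}
for score, name_a, name_b in all_pairs(seqs, longest_common_substring):
    print(name_a, name_b, score)
```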


The SCANPS program has been used as a test bed for many studies, a
number of which are not yet published.  When the work is published, I
will try to clean up the source code and distribute it.  Currently, I
cannot be sure that the code will compile on all ANSI-C compilers, so
for the time being, I am making precompiled binaries available for Sun
(SunOS 4.1.3) and Silicon Graphics (IRIX 5.x).

The programs are available by anonymous ftp from
in the subdirectory programs/scanps.  You can also reach this
directory using a WWW browser such as Mosaic
(URL=  On the same server you can read
preprints of related papers on line, or download PostScript copies.

If you download the programs please send me a short email with your
name, affiliation and address.  I will add you to my user database and
send you an email when the programs are updated and/or sources are
made available.


S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman.
J. Mol. Biol., 215:403-410, 1990.

G. J. Barton.
Comput. Appl. Biosci., 9:729-734, 1993.

G. J. Barton, P. T. C. Cohen, and D. Barford.
Eur. J. Biochem., 220:225-237, 1994.

G. J. Barton, C. P. Ponting, G. Spraggon, C. Finnis, and D. Sleep.
Protein Science, 1:688-690, 1992.

G. J. Barton.
Science, 257:1609, 1992.

Geoffrey J. Barton
Laboratory of Molecular Biophysics, University of Oxford
Rex Richards Building, South Parks Road, Oxford OX1 3QU, U.K.

email:  gjb at    Telephone: +44 865 275368    Fax: +44 865 510454 
anonymous-ftp:             WWW:
