[ANNOUNCE] PRINTS alignment compendium (ALIGN)

ajb at s-crim1.dl.ac.uk
Mon Jun 13 18:39:07 EST 1994

The ALIGN compendium to the PRINTS protein fingerprint database is now
available via anonymous ftp from
   s-ind2.dl.ac.uk         pub/database/prints/align
   ncbi.nlm.nih.gov        repository/PRINTS/align

The readme is appended below.

Alan Bleasby
DRAL Daresbury Laboratory
Warrington WA4 4AD

            *                                                      *
            *                                                      *

                            PRINTS5.0 ALIGNMENTS
                Departments of Biochemistry & Molecular Biology
                 University College London, London WC1E 6BT, UK
                   The University of Leeds, Leeds LS2 9JT, UK
                          attwood at bsm.bioc.ucl.ac.uk
                          bmb5meb at biovax.leeds.ac.uk

                        Creation date: 2nd June 1994
                        Compiled by: T.K.ATTWOOD & M.E.BECK

This compendium of protein sequence alignments is a companion resource to
the PRINTS database of protein motif fingerprints [1]. For each entry in 
PRINTS, we have made available a corresponding alignment in NBRF format: the 
root name of each of these is identical to the PRINTS identification code. 

Fingerprints are derived from groups of conserved motifs in multiple 
alignments. These are used to dredge the OWL composite sequence database [2]
in an iterative fashion, so the fingerprint matures with each database pass 
[3-5] - further details of the nature and derivation of fingerprints are given 
in the PRINTS readme and documentation files. Both starting alignments and 
their resulting fingerprints thus stem directly from OWL. 
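
As a rough illustration of how a fingerprint can mature over successive
database passes, here is a toy sketch in Python. It is not the ADSP scanning
method: the position-wise residue sets, the one-mismatch tolerance, the
function names and the sequences are all assumptions made up for this example.

```python
from typing import Dict, List, Optional, Set, Tuple

Motif = List[Set[str]]  # one set of observed residues per motif position


def build_motif(segments: List[str]) -> Motif:
    """Collect the residues seen at each position of equal-length segments."""
    return [set(column) for column in zip(*segments)]


def find_motif(motif: Motif, seq: str, start: int, max_mismatch: int) -> Optional[int]:
    """Return the first offset >= start with at most max_mismatch unseen residues."""
    width = len(motif)
    for i in range(start, len(seq) - width + 1):
        misses = sum(seq[i + j] not in motif[j] for j in range(width))
        if misses <= max_mismatch:
            return i
    return None


def match_fingerprint(motifs: List[Motif], seq: str, max_mismatch: int) -> Optional[List[str]]:
    """Return the matched segments if every motif occurs in order, else None."""
    found, pos = [], 0
    for motif in motifs:
        hit = find_motif(motif, seq, pos, max_mismatch)
        if hit is None:
            return None
        found.append(seq[hit:hit + len(motif)])
        pos = hit + len(motif)
    return found


def iterate_fingerprint(
    seed: List[List[str]],
    database: Dict[str, str],
    max_mismatch: int = 1,
    max_passes: int = 10,
) -> Tuple[Set[str], List[List[str]]]:
    """Rescan the database until a pass adds no new sequences (toy model)."""
    segments = [list(s) for s in seed]
    hits: Set[str] = set()
    for _ in range(max_passes):
        # Rebuild the motifs from everything gathered so far, then scan once.
        motifs = [build_motif(s) for s in segments]
        new_hits = {}
        for code, seq in database.items():
            if code in hits:
                continue
            found = match_fingerprint(motifs, seq, max_mismatch)
            if found is not None:
                new_hits[code] = found
        if not new_hits:
            break
        # Fold the newly matched segments back into each motif.
        for code, found in new_hits.items():
            hits.add(code)
            for motif_segments, seg in zip(segments, found):
                motif_segments.append(seg)
    return hits, segments


# Made-up data: two seed motifs and a three-sequence "database".
seed = [["GDS", "GDT"], ["HWY"]]
database = {"A": "AAGESAAHWYAA", "B": "GEAKKHWY", "C": "KKKKKKKK"}
hits, segments = iterate_fingerprint(seed, database)
```

In this made-up data, sequence B is only picked up on the second pass, after
the segment matched in A has widened the first motif.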

Within OWL, sequences retain the database identification codes of their primary
sources (except those from NRL-3D, which for convenience are prefixed by NRL_).
These codes often change between source releases, so alignments (and their 
fingerprints) derived from early versions of OWL will include original rather 
than current database codes. If the current code is needed, a simple way to
retrieve it is through OWL's query language DELPHOS, which is accessible from
SEQNET: within DELPHOS, type /info seq "string", where "string" is part of
the sequence whose current code you wish to retrieve.
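
To see which codes in an alignment may need such a lookup, something along
the following lines could help. It is only a sketch: the function names and
all the codes are hypothetical, and the set of current codes is assumed to
have been obtained separately from a current OWL release.

```python
from typing import Iterable, List, Set

NRL_PREFIX = "NRL_"  # prefix given to NRL-3D entries within OWL


def is_nrl3d(code: str) -> bool:
    """True if the code carries the NRL_ prefix used for NRL-3D entries."""
    return code.startswith(NRL_PREFIX)


def stale_codes(alignment_codes: Iterable[str], current_codes: Set[str]) -> List[str]:
    """Return the alignment codes that are absent from the current code set."""
    return [code for code in alignment_codes if code not in current_codes]


# Made-up codes, for illustration only.
alignment_codes = ["OLDCODE1", "SAMECODE", "NRL_1ABC"]
current_codes = {"NEWCODE1", "SAMECODE", "NRL_1ABC"}
to_look_up = stale_codes(alignment_codes, current_codes)
```

Any code returned here is a candidate for the DELPHOS query described above.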


The alignments are, in the main, only intended to be reliable in the regions 
from which fingerprints have been defined, although many are complete over the 
full sequence length. Each has been generated manually, using either SOMAP [6]
(part of the ADSP suite [3]), XALIGN or VISTAS [7]. We make no claims for their
`correctness' (if such a thing exists), but provide them in good faith as a 
guide to, or as an illustration of, the type of protein families contained in 
PRINTS. We hope they will be of use to those wishing to augment the information 
contained in PRINTS, or to others who simply seek a convenient starting point 
for their own analyses - the files should be accessible to any software that
reads NBRF format.
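
For those without such software to hand, a minimal reader can be sketched in
Python. It assumes the usual NBRF/PIR layout (a header line of the form
>P1;CODE, a one-line title, then sequence lines terminated by '*'); the
function name and the sample records are made up for illustration, and the
layout of the actual alignment files should be checked before relying on it.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, str, str]  # (code, title, aligned sequence)


def parse_nbrf(lines: Iterable[str]) -> List[Record]:
    """Parse NBRF/PIR-style records into (code, title, sequence) tuples."""
    records: List[Record] = []
    code = None
    title = ""
    chunks: List[str] = []
    expect_title = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if code is not None:
                records.append((code, title, "".join(chunks)))
            # Header looks like ">P1;CODE"; keep the part after the ';'.
            code = line.split(";", 1)[1].strip() if ";" in line else line[1:].strip()
            title, chunks, expect_title = "", [], True
        elif code is None:
            continue  # ignore anything before the first header
        elif expect_title:
            title, expect_title = line.strip(), False
        else:
            # Sequence line: drop whitespace and the terminating '*',
            # but keep '-' gap characters so the alignment is preserved.
            chunks.append("".join(line.split()).rstrip("*"))
    if code is not None:
        records.append((code, title, "".join(chunks)))
    return records


# Made-up two-sequence alignment, for illustration only.
sample = """>P1;SEQ_A
first example sequence
MK-LV
AG*
>P1;NRL_1ABC
second example sequence
MKTLVAG*
""".splitlines()

records = parse_nbrf(sample)
```

Gap characters are kept on purpose, so the aligned sequences stay the same
length and can be compared column by column.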

VISTAS and XALIGN will shortly be available from the DRAL SEQNET service.

1. Attwood, T.K. and Beck, M.E. (1994) PRINTS - A protein motif fingerprint
database. Protein Engineering, 7 (7), in press.
2. Bleasby, A.J. and Wootton, J.C. (1990) Constructing validated, non-
redundant composite protein sequence databases. Protein Engineering, 3 (3), 
3. Parry-Smith, D.J. and Attwood, T.K. (1992) ADSP - A new package for
computational sequence analysis. CABIOS, 8 (5), 451-459.
4. Attwood, T.K. and Findlay, J.B.C. (1994) Fingerprinting G-protein-coupled
receptors. Protein Engineering, 7 (2), 195-203.
5. Attwood, T.K. and Findlay, J.B.C. (1993) Design of a discriminating
fingerprint for G-protein-coupled receptors. Protein Engineering, 6 (2),
167-176.
6. Parry-Smith, D.J. and Attwood, T.K. (1991) SOMAP - A novel interactive
approach to multiple protein sequence alignment. CABIOS, 7 (2), 233-235.
7. Perkins, D.N. and Attwood, T.K. VISTAS - A package for VIsualising
STructures And Sequences. In preparation.
