new multiple sequences alignment server

Guy Baudoux baudou at lysine.biq.fundp.ac.be
Tue Sep 3 18:26:42 EST 1996



   MM           MM                                 BBBBBBB
   MMM         MMM                                 BB     B
   MM M       M MM     AA   TTTTTT CCCC HH   HH    BB     B  OOOOO  XX   X
   MM  M     M  MM    AAAA    TT  CC    HH   HH == BBBBBBB  OO   OO  XX X
   MM   M   M   MM   AA  AA   TT  CC    HHHHHHH == BB     B OO   OO   XX
   MM    M M    MM  AAAAAAAA  TT  CC    HH   HH    BB     B OO   OO  X XX
   MM     M     MM AA      AA TT   CCCC HH   HH    BBBBBBB   OOOOO  X   XX



         Match-Box Server 1.1 multiple sequence alignment server.

           Guy Baudoux, Isabelle Reginster, Pascal Briffeuil,
                   Eric Depiereux and Ernest Feytmans.

   Laboratory of Structural Molecular Biology, University of Namur,
                                Belgium

             Rue de Bruxelles, 61   :  Tel: +32-81-72-44-15   
             5000 Namur, Belgium    :  Fax: +32-81-72-44-15
                 E-mail: matchbox-help at biq.fundp.ac.be

              Project supported by the Walloon Government,
                        Federal State of Belgium.


PRESENTATION:

 We are pleased to announce the availability of a new sequence
alignment server, based on the Match-Box software developed by
Drs. Eric Depiereux and Ernest Feytmans (1,2).

 The Match-Box software provides protein sequence alignment tools
based on strict statistical criteria. The method circumvents the gap
penalty requirement: in the Match-Box method, gaps are the result of
the alignment and not a governing parameter of the matching procedure.

 The method produces reliable results, as assessed by tests performed
on protein families of known structure and low sequence similarity.

 A reliability score is computed in relation to a similarity threshold
that is progressively raised to extend the aligned regions to their
maximal length. The score obtained at each position of the final
alignment is printed below the sequences and lets the reader judge
each aligned region individually.

 Several additional outputs present pairwise similarity analyses, in
order to allow delineation of relevant subsets of related sequences
and to avoid alignment of unrelated sequences.


EXAMPLE:

 Lysozyme and alpha-lactalbumin form a well-known family of enzymes
discussed in depth by H.A. McKenzie and F.H. White (3). The sequences
of the structures 1ALC, 1LZ1, 2LZ2 and 2LZT were aligned with the
Match-Box server, which gives the following result:

  Sequences number, length and name
  _________________________________

  1   122 1ALC        2   130 1LZ1        3   129 2LZ2        4   129 2LZT

              10        20        30        40        50        60        70
               +         +         +         +         +         +         +
   1  kqftkcelsqnlyd--idgygrialpelictMFhtsgydtqai--vendesteyglfqisnalwckssqs
   2  kvfercelartlkrLGmdgyrgislanwmclAKwesgyntratNYnagdrstdygifqinsrywcndgkt
   3  kvygrcelaaamkrLGldnyrgyslgnwvcaAKfesnfnthatN-rntdgstdygilqinsrwwcndgrt
   4  kvfgrcelaaamkrHGldnyrgyslgnwvcaAKfesnfntqatN-rntdgstdygilqinsrwwcndgrt

      11111111111113  333333333444444  2222222222  1111111111111111111111111

              80        90       100       110       120       130       140
               +         +         +         +         +         +         +
   1  pqsrnicditcdkflddditddimcakkild-ikgidywiahkalctEKLEQWLCEK---
   2  pgavnachlscsallqdniadavacakrvvrDpqgirawvawrnrcqNRDVRQYVQGCGV
   3  pgsknlcnipcsallssditasvncakkiasGgngmnawvawrnrckGTDVHAWIRGCRL
   4  pgsrnlcnipcsallssditasvncakkivsDgngmnawvawrnrckGTDVQAWIRGCRL

      1111111111111111111111111111111 111111111111111

 
 The Match-Box program tries to find "boxes" (segments that are
similar in all sequences submitted by the user): residues included in
the boxes are printed in lower case. Residues in upper case are not
aligned, and gaps are placed arbitrarily before the next box.
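
 As an illustration only, the lower-case convention described above can
be used to recover box boundaries from a listing. The sketch below is
not part of the Match-Box software: the function name find_boxes is
hypothetical, and it assumes that the aligned rows have already been
stripped of their sequence-number prefix.

```python
def find_boxes(rows):
    """Return 1-based inclusive (start, end) column ranges in which
    every row shows a lower-case residue, i.e. a box."""
    width = min(len(row) for row in rows)
    boxes = []
    start = None  # 0-based column where the current box began
    for col in range(width):
        in_box = all(row[col].islower() for row in rows)
        if in_box and start is None:
            start = col
        elif not in_box and start is not None:
            boxes.append((start + 1, col))
            start = None
    if start is not None:
        boxes.append((start + 1, width))
    return boxes


# First 70 columns of the example alignment (1ALC, 1LZ1, 2LZ2, 2LZT).
rows = [
    "kqftkcelsqnlyd--idgygrialpelictMFhtsgydtqai--vendesteyglfqisnalwckssqs",
    "kvfercelartlkrLGmdgyrgislanwmclAKwesgyntratNYnagdrstdygifqinsrywcndgkt",
    "kvygrcelaaamkrLGldnyrgyslgnwvcaAKfesnfnthatN-rntdgstdygilqinsrwwcndgrt",
    "kvfgrcelaaamkrHGldnyrgyslgnwvcaAKfesnfntqatN-rntdgstdygilqinsrwwcndgrt",
]

for start, end in find_boxes(rows):
    print(f"box at columns {start}-{end}")
```

 On the rows above, the sketch reports four boxes, at columns 1-14,
17-31, 34-43 and 46-70, which matches the runs of scores printed below
the alignment.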

 A reliability score is written below each position in the boxes. It
is related to the statistical significance of the alignment at this
position: the lowest scores correspond to the highest reliability of
the alignment.
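
 For readers who want to post-process a listing, here is a minimal
sketch of how a score line could be summarised box by box. It is an
assumption-laden illustration, not the server's own code: the function
name box_scores is hypothetical, scores are assumed to be single digits
as in the example, and the leading margin of the score line is assumed
to have been removed.

```python
from itertools import groupby


def box_scores(score_line):
    """Summarise each run of digits (one box) in a score line.

    Columns are 1-based. Lower scores mean higher reliability, so
    'best' is the minimum and 'worst' the maximum within a box.
    """
    summary = []
    col = 0  # number of columns consumed so far
    for is_digit, run in groupby(score_line, key=str.isdigit):
        run = "".join(run)
        if is_digit:
            scores = [int(ch) for ch in run]
            summary.append({
                "start": col + 1,
                "end": col + len(run),
                "best": min(scores),
                "worst": max(scores),
            })
        col += len(run)
    return summary


# Score line of the first alignment block, rebuilt run by run.
line = ("1" * 13 + "3" + "  " + "3" * 9 + "4" * 6 + "  "
        + "2" * 10 + "  " + "1" * 25)

for box in box_scores(line):
    print(box)
```

 Ranking boxes by their "worst" value gives a conservative ordering:
in the example, the last box (all scores 1) would rank as the most
reliable and the second box (scores 3 to 4) as the least.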
 


ACCESSIBILITY:

The Match-Box server is available at the Web site:

  http://www.fundp.ac.be/sciences/biologie/bms/matchbox_submit.html

Using this web page, you can submit up to 50 protein sequences of no
more than 2000 amino acids each and receive by e-mail two output
listings, resulting from two different analyses:

  * explore : analyses the global similarities between the sequences.

  * align   : produces boxes (blocks) detected in the sequences and
              arranged in the form of a multiple alignment.

 This server can also be reached by electronic mail at the address:

  matchbox at biq.fundp.ac.be

where you can send the keyword "help" in the body of the message to
receive information about the methods of e-mail submission of data.

 During a preliminary test period, the server is accessible to anyone
from academic or commercial institutions. A version of the software as
a standalone program with a graphical interface is under development.


INPUT:

 The sequences can be submitted in FASTA-like, HSSP, or MSF formats,
and the scoring matrices used by the programs can be selected directly
from the web page.
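
 Before submitting, it may be convenient to check a file against the
limits stated above (up to 50 sequences of no more than 2000 amino
acids each). The following sketch assumes plain FASTA input with ">"
header lines; the exact "FASTA-like" dialect accepted by the server is
not specified here, and the names parse_fasta and check_limits are
hypothetical.

```python
MAX_SEQUENCES = 50   # limit stated in the announcement
MAX_LENGTH = 2000    # amino acids per sequence, as stated


def parse_fasta(text):
    """Parse plain FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_limits(records):
    """Return a list of human-readable problems (empty if none)."""
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(
            f"{len(records)} sequences submitted, limit is {MAX_SEQUENCES}")
    for name, seq in records:
        if len(seq) > MAX_LENGTH:
            problems.append(
                f"{name}: {len(seq)} residues, limit is {MAX_LENGTH}")
    return problems


sample = """>1ALC
kqftkcelsqnlyd
idgygrialpelict
>1LZ1
kvfercelartlkr
"""

records = parse_fasta(sample)
print(records)
print(check_limits(records) or "within limits")
```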


OUTPUT:

 By default, the Match-Box server returns two listings, "explore" and
"align". The user can also ask to receive an alignment in MSF or HSSP
format, and a picture of the alignment in PostScript format.


DOCUMENTATION:

 The web page gives access to a series of documents: a general
presentation, details about the method, and examples with explanations
of the results.


REFERENCES:

 1. Depiereux, E. & Feytmans, E. (1991). Simultaneous and multivariate
    alignment of protein sequences: correspondence between
    physicochemical profiles and structurally conserved regions (SCR).
    Prot. Engng. 4(6), 603-613.
 2. Depiereux, E. & Feytmans, E. (1992). MATCH-BOX - A fundamentally
    new algorithm for the simultaneous alignment of several protein
    sequences. Comput. Appl. Biosci. 8(5), 501-509.
 3. McKenzie, H.A. & White, F.H. (Jr) (1991). Lysozyme and
    alpha-lactalbumin, function and interrelationships.
    Adv. Prot. Chem. 41, 173-258.






 



