ANNOUNCE ProAnWin: protein alignment and structure-activity analysis

Alexey Eroshkin eroshkin at vector.nsk.su
Thu Oct 3 06:21:58 EST 1996


To: schisto at net.bio.net
From: Alexey Eroshkin <eroshkin at vector.nsk.su>
Subject: ANNOUNCE ProAnWin: protein alignment and structure-activity analysis


           **************************************
           ProAnWin - Protein Analyst for Windows
           **************************************

    State Research Center of Virology and Biotechnology
        Koltsovo, Novosibirsk Region, 633159 Russia
                           and
Irina Pika, Anatoly Frolov, Vladimir Ivanisenko with Alexey Eroshkin

are pleased to announce the availability of a new MS Windows
application for multiple protein sequence alignment, comparative
sequence analysis, the study of protein structure-activity
(property/phenotype) relationships and the design of site-directed
mutagenesis experiments.

DESCRIPTION:

ProAnWin studies the relationships between protein/peptide activity
(or a property or related phenotype) and the characteristics of
particular regions in the primary or tertiary structure of these
molecules. Structure-activity analysis is based on the sequences of a
protein family, data on protein activity (pK, ED50, Km or any other
measure) and, if available, the 3D structure of one of these proteins
(assuming a common 3D fold for all the homologs). The main aim is to
identify the factors responsible for the variation in protein
activity: the location of the activity-modulating site and the
important structural characteristics of that site.

The program provides the following:

- input of sequences in several formats (SWISS-PROT, PIR, FASTA, GCG,
  CLUSTAL) and of a 3D structure in PDB format;
- flexible multiple protein sequence alignment and threading of
  sequences into a known 3D structure (ClustalV + manual alignment);
- input of user-defined protein activities, properties or related
  phenotypes, with optional transformation of the activity (log(x),
  1/x, etc.);
- calculation of many characteristics (hydrophobicity,
  amphipathicity, etc.) of linear and spatial protein sites;
- fast multiple linear regression analysis of structure-activity
  relationships (up to eight independent factors);
- activity prediction for untested or mutated proteins;
- data visualization (regression plots, 3D pictures with sites
  highlighted, multiple alignments);
- display of the sites found on the sequences and on the 3D structure.

The program has two main linked windows, one with the protein
sequences and one with the 3D structure; any site highlighted in the
sequences is also highlighted in the 3D structure and vice versa.
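
As an illustration of the regression step, the following Python
sketch applies an activity transform and then fits an ordinary
least-squares model with several factors. It is not ProAnWin code:
the function names, the normal-equation solver and the small data set
are assumptions made only for this example.

```python
import math

def transform_activity(values, kind="log"):
    """Apply a user-chosen transform (log10(x) or 1/x) to raw activities."""
    if kind == "log":
        return [math.log10(v) for v in values]
    if kind == "inverse":
        return [1.0 / v for v in values]
    raise ValueError("unknown transform: " + kind)

def fit_linear(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations.

    rows holds one list of factor values per protein; an intercept
    column is added automatically. Returns [b0, b1, ...].
    """
    a = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(a[0])
    # Augmented matrix [A^T A | A^T y].
    m = [[sum(r[i] * r[j] for r in a) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(a, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("factors are collinear")
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def predict(coeffs, row):
    """Predicted (transformed) activity for one set of factor values."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], row))

# Hypothetical data: two site characteristics per protein, raw activity.
factors = [[1.2, 0.3], [0.8, 0.9], [1.5, 0.1], [0.4, 1.1], [1.0, 0.6]]
activity = [130.0, 56.0, 222.0, 50.0, 100.0]

coeffs = fit_linear(factors, transform_activity(activity, "log"))
print("intercept and slopes:", coeffs)
print("predicted log10 activity:", predict(coeffs, [1.1, 0.4]))
```

The same fit can be repeated with different sites or characteristics
as factors, keeping the combination that explains the activity best.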

ProAnWin can align the complete set of sequences, a subset or any
selected block. This allows iterative alignment that preserves blocks
found previously or blocks imposed by biological data (active
center, catalytic residues).

The program can be applied to the analysis of various protein-related
biological data, to the prediction of activity (phenotype) of newly
sequenced proteins and to the simulation of protein-engineering
experiments.

DATA EXAMPLES:

1. The family of disintegrins (proteins from snake venom) with tested
activity.

Name                   Sequence (part)                     Activity*
                 41        51        61        71        81
Trigramin alpha  QCGEGLCCDQCSFIEEGTVCRIARGDDLDDYCNGRSAGCPRNP  130
Albolabrin       .............MKK..I..R............I........  222
Elegantin        ..AD.......R.KKKR.I..R....NP..R.T.Q..D....G  136
Flavoridin       ..AD.......R.KKKTGI.......FP..R.T.L.ND...WN  100
Batroxastatin    ..A........R.KGA.KI..R....NP..R.T.Q..D....R  133
Applagin         ..A........L.MK.....-R.....VN.....I........   50
Kistrin          ........E..K.SRA.KI...P...MP..R.T.Q..D...YH  128
Echistatin alpha E.ES.P..RN.L.LK...I.LR.....M......LTCP.....   56
Bitistatin       ..NH.E.....K.KKAR.........WN....T.K.SD..W.H  237
Bitan alpha      ..NH.E.....R.KKA..........WN....T.K.SD..W.H  108

* - Activity is measured as the concentration of protein (in nM)
required for 50% attenuation of platelet-rich plasma aggregation
stimulated by adenosine diphosphate.
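
The table above appears to use the common compact alignment notation,
in which "." stands for the residue of the first row and "-" marks a
gap; the announcement does not state this, so treat it as an
assumption. Under that assumption, a row can be expanded to a full
sequence as in this small Python sketch (the function name is
hypothetical, and only a short excerpt of the Trigramin alpha and
Albolabrin rows, around the RGD motif, is used):

```python
def expand_dots(reference, row):
    """Replace each '.' in row with the reference residue in that column.

    Gap characters ('-') and explicit residues are kept as they are.
    """
    if len(reference) != len(row):
        raise ValueError("rows must have the same length")
    return "".join(r if c == "." else c for r, c in zip(reference, row))

# Excerpt of the first two rows of the disintegrin table.
trigramin_excerpt = "SFIEEGTVCRIARGD"
albolabrin_excerpt = "..MKK..I..R...."

print(expand_dots(trigramin_excerpt, albolabrin_excerpt))  # -> SFMKKGTICRRARGD
```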

2. The set of synthetic peptides with tested antimicrobial activity.

Name         Activity*    Peptide sequence

Analog A2      400    GIHYLSHKSFSKFFAGVGKFTNS
Analog A1      100    GIHYLSHKSFSKFFAGVQKFTNS
Antisense P     60    GIHYLSHKSFSKFFCGVQKFTNS
Analog B1       40    GIHYLSHKSFSKFFKGVQKFTNS
Analog B2       40    GIHYLSHKSFSKFFKGVGKFTNS
Magainin 2      20    GIGKFLHSAKKFGKAFVGEIMNS
Analog C1       20    GIHKLSHKSFSKFFKGVQKFTNS
Analog C2       20    GIHKLSHKSFSKFFKGVGKFTNS
Analog P1       10    AIHNFAHKSFAKFFRAVKKFANA
Analog P2        5    AIHNLAHKSLAKLLRAVKKLANA
Analog P3        5    GIHNFAHKSFAKFFRAVKKFANS
Analog M2        3    KIHKLAHKLLKKLLKAVKKLAKA
* - Minimal inhibitory concentration (in mcg/ml) against E. coli.
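
To show what relating a site characteristic to activity can look like
for this peptide set, the Python sketch below computes the mean
hydropathy of a linear site and its correlation with log10 of the
activity. The Kyte-Doolittle scale, the function names and the choice
of site are assumptions for illustration only; the announcement does
not say which scales or statistics ProAnWin uses for this example.

```python
import math

# Kyte-Doolittle hydropathy values (an illustrative choice of scale).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def mean_hydropathy(seq, start=1, end=None):
    """Mean hydropathy of the site seq[start..end] (1-based, inclusive)."""
    site = seq[start - 1:end]
    return sum(KD[aa] for aa in site) / len(site)

def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# (name, activity in mcg/ml, sequence) copied from the table above.
peptides = [
    ("Analog A2", 400, "GIHYLSHKSFSKFFAGVGKFTNS"),
    ("Analog A1", 100, "GIHYLSHKSFSKFFAGVQKFTNS"),
    ("Antisense P", 60, "GIHYLSHKSFSKFFCGVQKFTNS"),
    ("Analog B1", 40, "GIHYLSHKSFSKFFKGVQKFTNS"),
    ("Analog B2", 40, "GIHYLSHKSFSKFFKGVGKFTNS"),
    ("Magainin 2", 20, "GIGKFLHSAKKFGKAFVGEIMNS"),
    ("Analog C1", 20, "GIHKLSHKSFSKFFKGVQKFTNS"),
    ("Analog C2", 20, "GIHKLSHKSFSKFFKGVGKFTNS"),
    ("Analog P1", 10, "AIHNFAHKSFAKFFRAVKKFANA"),
    ("Analog P2", 5, "AIHNLAHKSLAKLLRAVKKLANA"),
    ("Analog P3", 5, "GIHNFAHKSFAKFFRAVKKFANS"),
    ("Analog M2", 3, "KIHKLAHKLLKKLLKAVKKLAKA"),
]

log_activity = [math.log10(act) for _, act, _ in peptides]
# Hypothetical site: residues 12-18 of each peptide.
site_values = [mean_hydropathy(seq, 12, 18) for _, _, seq in peptides]

for (name, _, _), value in zip(peptides, site_values):
    print(f"{name:12s} {value:6.2f}")
print("correlation with log10(activity):", pearson(site_values, log_activity))
```

Scanning the start and end of the site over the whole peptide, and
repeating this for other characteristics, gives a simple way to look
for the region whose properties track the activity.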

3. The set of unrelated peptides with tested immunogenicity.

-------------------------------------------------
Protein  Oncogene         Sequence      Immuno-
region                                  genicity*
-------------------------------------------------
409-425  C-SRC        RLIEDNEYTARQGAKFP     4
468-482  C-SRC        NREVLDQVERGYRMP       4
499-508  C-SRC        WRRDPEERPT            4
001-018  V-KI-RAS     MTEYKLVVVGASGVGKSA    5
119-135  V-KI-RAS     DLPSRTVDTKQAQELAR     5
161-175  V-KI-RAS     REIRQYRLKKISKEE       2
001-018  V-HA-RAS     MTEYKLVVVGARGVGKSA    4
001-018  C-HA(EJ)-RAS MTEYKLVVVGAVGVGKSA    3
001-018  C-HA-RAS     MTEYKLVVVGAGGVGKSA    5
029-044  V-HA-RAS     VDEYDPTIEDSYRKQV      4
091-108  V-HA-RAS     EDIHQYREQIKRVKDSDD    4
126-136  V-HA-RAS     ESRQAQALARS           4
146-155  V-HA-RAS     AKTRQGVEDA            5
160-179  V-HA-RAS     VREIRQHKLRKLNPPDESGP  5
011-024  V-MYB        PQESSKAGPPSGTT        4
033-047  V-MYB        MAFAHNPPAGPLPGA       3
146-162  V-MYB        DNTRTSGDNAPVSCLGE     4
168-186  V-MYB        PSPPVDHGCLPEESASPAR   4
170-185  V-MYB        PPVDHGCLPEESASPA      2
247-260  V-MYB        PFHKDQTFTEYRKM        4
247-265  V-MYB        PFHKDQTFTEYRKMHGGAV   4
541-555  V-FES        RHSTSSSEQEREGGR       4
584-593  V-FES        PEVQKPLHEQ            4
782-796  V-FES        FLRTEGARLRMKTLL       4
840-846  V-FES        SREAADG               0
893-905  V-FES        ASPYPNLSNQQTR         3
901-913  V-FES        NQQTREFVEKGGR         4
222-234  V-MYC        PPTTSSDSEEEQE         0
323-334  V-MYC        RTLDSEENDKRR          4
340-350  V-MYC        ERQRRNELKLR           4
363-371  V-MYC        NNEKAPKVV             1
389-403  V-MYC        RLIAEKEQLRRRREQ       4
395-405  V-MYC        EQLRRRREQLK           4
400-406  V-MYC        RREQLKH               0
* - Logarithm of the antipeptide antibody titer.

4. Phenotype-genotype correlations. Influenza A virus M2 protein from
strains sensitive (labeled "sen") and resistant to amantadine or
rimantadine ("res").

Strain  Sensitivity    Sequence  (N-terminal part only)

PR8-34   res  MSLLTEVETPIRNEWGCRCNGSSDPLAIAANIIGILHLILWILDR
MON88    res  ....................D.................T......
LEN3-83  res  ..........................T...........T......
MOS88    res  ..........................T...........T......
MON86    res  ..........................T...........T......
SVER82   res  ..........................T...........T......
WS33     res  ....................D.....V..................
LEN85    res  ....................D.....VV.................
WSN33    res  ....................D....FV..................
LEN49    res  ....................D.....VV..........T......
LEN6-83  res  ....................D...S.VV..S..............
SWONT81  res  ....................D.....VA..S..............
SW29-37  res  ....................D.....VA..S..............
SWIA30   res  ..........T.........D.....VA..S..............
SWWIS61  res  ..........T.S.......D.....VA..S..............
SWIA88   res  .................K..D.....VAV.S..............
AA60     sen  ....................D.....VV..S.............H
KOREA68  sen  ....................D.....VV..S......F.......
BANG79   sen  ....................D.....VV..S..............
FW50     sen  ....................D.....VV..S..............
MEM88    sen  ....................D.....VV..S..............
USSR77   sen  .............Q......D.....VV..S..............
PINALB79 sen  ..........T..G.E.K.SD.....V...S..............
SWHK82   sen  ..........T..G.E.K.SD.....V...S..............
SWNED85  sen  ..........T..G....FSD.....V...S..............
FPVR34   sen  ..........T..G.E....D.....I...S............N.
MLRDNY78 sen  ..........T..G.E.K.SD.....V...S..............
TYMN81   sen  ..........T..G.E.K.SD.....V...S..............
TYMN80   sen  ..........T..G.E.K.SD.....V...S..............
CKVIC85  sen  ..........T..G.E.K.SD.....V...S..............
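
For phenotype-genotype data of this kind, a first step is often to
tabulate, column by column, which residues occur in each phenotype
class. The Python sketch below does this for a few rows copied from
the table above. It assumes that "." means "same residue as the first
row"; the function names are hypothetical, and this is not a
description of how ProAnWin itself scores positions.

```python
from collections import Counter, defaultdict

def column_profile(records):
    """Residue counts per phenotype for every variable alignment column.

    records is a list of (phenotype, row) pairs; the first row is the
    reference and '.' in any row means 'same as the reference'.
    Returns {column (1-based): {phenotype: Counter of residues}}.
    """
    reference = records[0][1]
    profile = {}
    for col, ref_res in enumerate(reference, start=1):
        by_class = defaultdict(Counter)
        seen = set()
        for phenotype, row in records:
            res = row[col - 1] if col <= len(row) else "-"
            res = ref_res if res == "." else res
            by_class[phenotype][res] += 1
            seen.add(res)
        if len(seen) > 1:  # keep only columns that vary
            profile[col] = dict(by_class)
    return profile

def separating_columns(profile):
    """Columns in which no residue is shared between phenotype classes."""
    result = []
    for col, by_class in sorted(profile.items()):
        sets = [set(counts) for counts in by_class.values()]
        if len(sets) > 1 and all(
                a.isdisjoint(b)
                for i, a in enumerate(sets) for b in sets[i + 1:]):
            result.append(col)
    return result

# A few rows from the M2 table (phenotype, sequence row).
records = [
    ("res", "MSLLTEVETPIRNEWGCRCNGSSDPLAIAANIIGILHLILWILDR"),
    ("res", "....................D.....V.................."),
    ("res", ".................K..D.....VAV.S.............."),
    ("sen", "....................D.....VV..S.............H"),
    ("sen", "....................D.....VV..S.............."),
    ("sen", "..........T..G.E....D.....I...S............N."),
]

profile = column_profile(records)
for col in sorted(profile):
    print(col, {ph: dict(cnt) for ph, cnt in profile[col].items()})
print("columns separating the classes:", separating_columns(profile))
```

Columns where the classes share residues cannot be separated this
way; for those, the per-class counts can still be inspected by eye.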

ProAnWin IS USEFUL IN:

- protein structure-function and structure-activity investigations;
- designing proteins and peptides with improved activity;
- making multiple protein alignments and making sense of them;
- studying phenotype-genotype correlations;
- preparation of protein 3D pictures with sites highlighted;
- comparative protein sequence analysis.

AVAILABILITY:

ProAnWin is available (as a self-extracting archive) from the EBI
software library:
ftp://ftp.ebi.ac.uk/pub/software/dos/proanwin
and, in the Eastern Hemisphere, from the NSC software library:
ftp://ftp.bionet.nsc.ru/pub/biology/vector/proanwin.dem/paw$.exe
This version is limited in the number of sequences it can analyze.

INSTALLATION:

The files required to run ProAnWin are distributed as a single
compressed file. Create a directory "PROANWIN" on your hard disk (for
example, on drive C). Copy the file to that directory, run it from
the DOS prompt and answer Yes to all questions. To start the program,
run PROAWIN.EXE from Windows.

PROGRAM CONTENT:

Directory:
Main directory  - program modules
DATA            - examples of data and output files;
                  amino acid physico-chemical properties (>50);
                  manual
ALIGNS          - 50 aligned protein family sequences

PUBLICATIONS:

1. Eroshkin A.M., Zhilkin P.A., Fomin V.I. Algorithm and computer
program PROANAL for analysis of relationship between structure and
activity in a family of proteins or peptides. CABIOS, 1993, 9,
491-497.
2. Eroshkin A.M., Minenkova O.O., Fomin V.A., Ivanisenko V.A.,
Ilyichev A.A.  Analysis of peptide fragment insertions into major
coat protein of bacteriophages M13, f1 and fd. Relation of protein
structural characteristics and viability of mutant phages. Molec.
Biology (Russia), 1993, 27, 1345-1355.
3. Eroshkin A.M., Fomin V.I., Zhilkin P.A., Ivanisenko V.A.,
Kondrakhin Y.V.  PROANAL version 2: multifunctional program for
analysis of multiple protein sequence alignments and studying
structure-activity relationships in protein families. CABIOS, 1995,
11, 39-44.
4. Morozov B.M., Ivanisenko V.A., Eroshkin A.M., Ugarova N.N.
Analysis of relations between bioluminescence color and the structure
of beetle luciferases: identification of the sites influencing
bioluminescence color. Molec. Biology (Russia), in press.

Comments, bug reports and suggestions for new features are welcome
and should be sent by e-mail to Alexey Eroshkin
(eroshkin at vector.nsk.su).

OTHER TOOLS AVAILABLE:

ProAnalyst, Multifunctional analysis of protein sequences and
structures (MS-DOS version of ProAnWin with additional functionality:
searching motifs, physico-chemical plots, alphabetical and
physico-chemical analysis of protein sequence variation,
structure-activity determination profile, etc.):
IUBio archive: ftp://iubio.bio.indiana.edu/molbio/ibmpc/panalys1
EMBL library: ftp://ftp.ebi.ac.uk/pub/software/dos/proanalyst
NSC library: ftp://ftp.bionet.nsc.ru/pub/biology/vector/proanaly.dem/panalys$

ProMSED, Protein Multiple Sequences EDitor for MS Windows 3.x/95 ("a
la" Word for Windows style + ClustalV + manual alignment + amino acid
coloring + more):
EMBL library: ftp://ftp.ebi.ac.uk/pub/software/dos/promsed
NSC library: ftp://ftp.bionet.nsc.ru/pub/biology/vector/promsed.dem/promsed$
IUBio archive: ftp://iubio.bio.indiana.edu/molbio/ibmpc/promsed1

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Dr. Alexey Eroshkin               Institute of Molecular Biology
E.mail: eroshkin at vector.nsk.su    State Research Center of Virology and
Tel: +7 (3832) - 647774           Biotechnology "Vector"
Fax: +7 (3832) - 328831           Koltsovo, Novosibirsk Region 633159
                                  Russia
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


