To: schisto at net.bio.net
From: Alexey Eroshkin <eroshkin at vector.nsk.su>
Subject: ANNOUNCE ProAnWin: protein alignment and structure-activity analysis
**************************************
ProAnWin - Protein Analyst for Windows
**************************************
State Research Center of Virology and Biotechnology
Koltsovo, Novosibirsk Region, 633159 Russia
and
Irina Pika, Anatoly Frolov, Vladimir Ivanisenko with Alexey Eroshkin
are pleased to announce the availability of a new MS Windows
application for multiple protein sequence alignment, comparative
sequence analysis, the study of protein structure-activity
(property/phenotype) relationships and the design of site-directed
mutagenesis experiments.
DESCRIPTION:
ProAnWin studies the relationships between protein/peptide activity
(or a property or related phenotype) and the characteristics of
particular regions in the primary or tertiary structure of these
molecules. Structure-activity analysis is based on the sequences of a
protein family, data on protein activity (pK, ED50, Km or any other)
and, if available, the 3D structure of one of these proteins (assuming
a common 3D fold for all the homologs). The main aim is to identify the
factors responsible for the variation in protein activity: the location
of the activity-modulating site and the important structural
characteristics of that site.
The program provides the following:
- input of sequences in several formats (SWISS-PROT, PIR, FASTA, GCG,
  CLUSTAL) and of a 3D structure in PDB format;
- flexible multiple protein sequence alignment and threading of
  sequences onto a known 3D structure (ClustalV + manual alignment);
- input of user-defined protein activities, properties or related
  phenotypes (with the option to transform the activity: log(x), 1/x,
  etc.);
- calculation of many characteristics (hydrophobicity, amphipathicity,
  etc.) of linear and spatial protein sites;
- fast multiple linear regression analysis of structure-activity
  relationships (up to eight independent factors);
- activity prediction for untested or mutated proteins;
- data visualization (regression plots, 3D pictures with sites
  highlighted, multiple alignments);
- display of the sites found on the sequences and on the 3D structure.
The program has two main linked windows, one with the protein sequences
and one with the 3D structure; any site highlighted in the sequences is
also highlighted in the 3D structure and vice versa.
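As a rough illustration of the activity transformation and multiple
linear regression described above, the sketch below fits
activity = b0 + b1*x1 + ... + bk*xk by ordinary least squares in plain
Python. It is not ProAnWin's own code: the function names
(transform_activity, fit_linear, predict) and the toy numbers are
assumptions made for the example.

```python
import math

def transform_activity(values, kind="log10"):
    """Apply one of the simple transforms mentioned above (log(x), 1/x)."""
    if kind == "log10":
        return [math.log10(v) for v in values]
    if kind == "inverse":
        return [1.0 / v for v in values]
    raise ValueError(f"unknown transform: {kind}")

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("factors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear(factors, activity):
    """Least-squares fit via the normal equations; returns [b0, b1, ..., bk]."""
    rows = [[1.0] + list(f) for f in factors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, activity)) for i in range(k)]
    return solve(xtx, xty)

def predict(coeffs, factor_row):
    """Predicted (transformed) activity for one protein."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], factor_row))

# Toy data, invented for the example: two site characteristics per protein
# and raw activities constructed so that log10(activity) is exactly linear.
factors = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0), (2.0, 3.0)]
raw_activity = [10 ** (1.0 + 0.5 * a - 0.25 * b) for a, b in factors]

coeffs = fit_linear(factors, transform_activity(raw_activity))
print([round(c, 3) for c in coeffs])
print(predict(coeffs, (3.0, 2.0)))  # prediction for an "untested" protein
```

The normal equations are adequate for a handful of factors, as here; a
QR- or SVD-based solver would be the safer choice for poorly conditioned
data.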
ProAnWin aligns the complete set of sequences, a subset or any selected
block, thus allowing iterative alignment that preserves blocks found
previously or imposed by biological data (active center, catalytic
residues).
The program can be applied to the analysis of various protein-related
biological data, to the prediction of the activity (phenotype) of newly
sequenced proteins and to the simulation of protein-engineering
experiments.
DATA EXAMPLES:
1. The family of disintegrins (proteins from snake venom) with tested
activity.
Name              Sequence (part)                              Activity*
                  41        51        61        71        81
Trigramin alpha   QCGEGLCCDQCSFIEEGTVCRIARGDDLDDYCNGRSAGCPRNP  130
Albolabrin        .............MKK..I..R............I........  222
Elegantin         ..AD.......R.KKKR.I..R....NP..R.T.Q..D....G  136
Flavoridin        ..AD.......R.KKKTGI.......FP..R.T.L.ND...WN  100
Batroxastatin     ..A........R.KGA.KI..R....NP..R.T.Q..D....R  133
Applagin          ..A........L.MK.....-R.....VN.....I........   50
Kistrin           ........E..K.SRA.KI...P...MP..R.T.Q..D...YH  128
Echistatin alpha  E.ES.P..RN.L.LK...I.LR.....M......LTCP.....   56
Bitistatin        ..NH.E.....K.KKAR.........WN....T.K.SD..W.H  237
Bitan alpha       ..NH.E.....R.KKA..........WN....T.K.SD..W.H  108
* - Activity is measured as the concentration of protein (in nM)
required for 50% attenuation of platelet-rich plasma aggregation
stimulated by adenosine diphosphate.
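The table above uses the common shorthand in which "." stands for "same
residue as in the first row". A minimal sketch of expanding such a row
back into a full sequence follows. It assumes that "-" marks a gap and
that the ruler value 41 is the position of the first column shown; the
function names are made up for the example.

```python
def expand_dots(reference, row):
    """Replace each '.' in an aligned row with the reference residue."""
    if len(reference) != len(row):
        raise ValueError("aligned rows must have equal length")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))

def differences(reference, row, first_position=41):
    """List (position, reference residue, row residue) for non-dot columns."""
    return [(first_position + i, ref, ch)
            for i, (ref, ch) in enumerate(zip(reference, row)) if ch != "."]

# First two rows of the disintegrin table above.
trigramin = "QCGEGLCCDQCSFIEEGTVCRIARGDDLDDYCNGRSAGCPRNP"
albolabrin_row = ".............MKK..I..R............I........"

albolabrin = expand_dots(trigramin, albolabrin_row)
print(albolabrin)
print(differences(trigramin, albolabrin_row))
```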
2. The set of synthetic peptides with tested antimicrobial activity
Name         Activity*  Peptide sequence
Analog A2          400  GIHYLSHKSFSKFFAGVGKFTNS
Analog A1          100  GIHYLSHKSFSKFFAGVQKFTNS
Antisense P         60  GIHYLSHKSFSKFFCGVQKFTNS
Analog B1           40  GIHYLSHKSFSKFFKGVQKFTNS
Analog B2           40  GIHYLSHKSFSKFFKGVGKFTNS
Magainin 2          20  GIGKFLHSAKKFGKAFVGEIMNS
Analog C1           20  GIHKLSHKSFSKFFKGVQKFTNS
Analog C2           20  GIHKLSHKSFSKFFKGVGKFTNS
Analog P1           10  AIHNFAHKSFAKFFRAVKKFANA
Analog P2            5  AIHNLAHKSLAKLLRAVKKLANA
Analog P3            5  GIHNFAHKSFAKFFRAVKKFANS
Analog M2            3  KIHKLAHKLLKKLLKAVKKLAKA
* - Minimal inhibitory concentration (in mcg/ml) against E. coli
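To show how a single peptide characteristic might be related to a
transformed activity, the sketch below correlates a crude net charge
with log10 of the MIC values listed above. Net charge is an assumption
chosen for the example (the announcement does not say which
characteristics matter for this data set), and the helper names are
hypothetical.

```python
import math

PEPTIDES = [  # (name, MIC in mcg/ml, sequence), copied from the table above
    ("Analog A2", 400, "GIHYLSHKSFSKFFAGVGKFTNS"),
    ("Analog A1", 100, "GIHYLSHKSFSKFFAGVQKFTNS"),
    ("Antisense P", 60, "GIHYLSHKSFSKFFCGVQKFTNS"),
    ("Analog B1", 40, "GIHYLSHKSFSKFFKGVQKFTNS"),
    ("Analog B2", 40, "GIHYLSHKSFSKFFKGVGKFTNS"),
    ("Magainin 2", 20, "GIGKFLHSAKKFGKAFVGEIMNS"),
    ("Analog C1", 20, "GIHKLSHKSFSKFFKGVQKFTNS"),
    ("Analog C2", 20, "GIHKLSHKSFSKFFKGVGKFTNS"),
    ("Analog P1", 10, "AIHNFAHKSFAKFFRAVKKFANA"),
    ("Analog P2", 5, "AIHNLAHKSLAKLLRAVKKLANA"),
    ("Analog P3", 5, "GIHNFAHKSFAKFFRAVKKFANS"),
    ("Analog M2", 3, "KIHKLAHKLLKKLLKAVKKLAKA"),
]

def net_charge(seq):
    """Crude net charge: +1 per K/R, -1 per D/E (His and termini ignored)."""
    return sum(seq.count(r) for r in "KR") - sum(seq.count(r) for r in "DE")

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

charges = [net_charge(seq) for _, _, seq in PEPTIDES]
log_mic = [math.log10(mic) for _, mic, _ in PEPTIDES]
r = pearson(charges, log_mic)
print(f"r(net charge, log10 MIC) = {r:.2f}")
```

A negative r would mean that the more positively charged analogs tend
to need lower concentrations; with only twelve peptides this is an
illustration, not a conclusion.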
3. The set of unrelated peptides with tested immunogenicity.
-------------------------------------------------
Protein region  Oncogene  Sequence  Immunogenicity*
-------------------------------------------------
409-425 C-SRC RLIEDNEYTARQGAKFP 4
468-482 C-SRC NREVLDQVERGYRMP 4
499-508 C-SRC WRRDPEERPT 4
001-018 V-KI-RAS MTEYKLVVVGASGVGKSA 5
119-135 V-KI-RAS DLPSRTVDTKQAQELAR 5
161-175 V-KI-RAS REIRQYRLKKISKEE 2
001-018 V-HA-RAS MTEYKLVVVGARGVGKSA 4
001-018 C-HA(EJ)-RAS MTEYKLVVVGAVGVGKSA 3
001-018 C-HA-RAS MTEYKLVVVGAGGVGKSA 5
029-044 V-HA-RAS VDEYDPTIEDSYRKQV 4
091-108 V-HA-RAS EDIHQYREQIKRVKDSDD 4
126-136 V-HA-RAS ESRQAQALARS 4
146-155 V-HA-RAS AKTRQGVEDA 5
160-179 V-HA-RAS VREIRQHKLRKLNPPDESGP 5
011-024 V-MYB PQESSKAGPPSGTT 4
033-047 V-MYB MAFAHNPPAGPLPGA 3
146-162 V-MYB DNTRTSGDNAPVSCLGE 4
168-186 V-MYB PSPPVDHGCLPEESASPAR 4
170-185 V-MYB PPVDHGCLPEESASPA 2
247-260 V-MYB PFHKDQTFTEYRKM 4
247-265 V-MYB PFHKDQTFTEYRKMHGGAV 4
541-555 V-FES RHSTSSSEQEREGGR 4
584-593 V-FES PEVQKPLHEQ 4
782-796 V-FES FLRTEGARLRMKTLL 4
840-846 V-FES SREAADG 0
893-905 V-FES ASPYPNLSNQQTR 3
901-913 V-FES NQQTREFVEKGGR 4
222-234 V-MYC PPTTSSDSEEEQE 0
323-334 V-MYC RTLDSEENDKRR 4
340-350 V-MYC ERQRRNELKLR 4
363-371 V-MYC NNEKAPKVV 1
389-403 V-MYC RLIAEKEQLRRRREQ 4
395-405 V-MYC EQLRRRREQLK 4
400-406 V-MYC RREQLKH 0
* - Logarithm of the antipeptide antibody titer.
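For unaligned peptides such as these, a whole-peptide or sliding-window
characteristic is one possible starting point. The sketch below computes
mean hydropathy on the Kyte-Doolittle scale; that scale is an assumption
here, since the announcement does not say which hydrophobicity scales
ProAnWin uses, and the function names are made up for the example.

```python
# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(seq, scale=KYTE_DOOLITTLE):
    """Average scale value over all residues of seq."""
    return sum(scale[res] for res in seq) / len(seq)

def window_profile(seq, width=7, scale=KYTE_DOOLITTLE):
    """Mean hydropathy of every window of `width` residues along seq."""
    return [mean_hydropathy(seq[i:i + width], scale)
            for i in range(len(seq) - width + 1)]

# A few rows from the table above: sequence -> immunogenicity.
examples = {
    "SREAADG": 0,
    "RREQLKH": 0,
    "MTEYKLVVVGASGVGKSA": 5,
}
for seq, score in examples.items():
    print(f"{seq:20s} immunogenicity={score}  mean KD={mean_hydropathy(seq):+.2f}")
```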
4. Phenotype-genotype correlations. Influenza A virus M2 protein from
strains sensitive (labeled "sen") and resistant to amantadine or
rimantadine ("res").
Strain Sensitivity Sequence (N-terminal part only)
PR8-34 res MSLLTEVETPIRNEWGCRCNGSSDPLAIAANIIGILHLILWILDR
MON88 res ....................D.................T......
LEN3-83 res ..........................T...........T......
MOS88 res ..........................T...........T......
MON86 res ..........................T...........T......
SVER82 res ..........................T...........T......
WS33 res ....................D.....V..................
LEN85 res ....................D.....VV.................
WSN33 res ....................D....FV..................
LEN49 res ....................D.....VV..........T......
LEN6-83 res ....................D...S.VV..S..............
SWONT81 res ....................D.....VA..S..............
SW29-37 res ....................D.....VA..S..............
SWIA30 res ..........T.........D.....VA..S..............
SWWIS61 res ..........T.S.......D.....VA..S..............
SWIA88 res .................K..D.....VAV.S..............
AA60 sen ....................D.....VV..S.............H
KOREA68 sen ....................D.....VV..S......F.......
BANG79 sen ....................D.....VV..S..............
FW50 sen ....................D.....VV..S..............
MEM88 sen ....................D.....VV..S..............
USSR77 sen .............Q......D.....VV..S..............
PINALB79 sen ..........T..G.E.K.SD.....V...S..............
SWHK82 sen ..........T..G.E.K.SD.....V...S..............
SWNED85 sen ..........T..G....FSD.....V...S..............
FPVR34 sen ..........T..G.E....D.....I...S............N.
MLRDNY78 sen ..........T..G.E.K.SD.....V...S..............
TYMN81 sen ..........T..G.E.K.SD.....V...S..............
TYMN80 sen ..........T..G.E.K.SD.....V...S..............
CKVIC85 sen ..........T..G.E.K.SD.....V...S..............
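A very simple way to look for phenotype-related columns in an alignment
like the one above is to ask where the residues seen in the two
phenotype groups never overlap. The sketch below does this for a
hand-picked subset of the rows; it is not ProAnWin's regression method,
the function names are hypothetical, and columns are simply counted from
1 at the first residue shown.

```python
def expand_dots(reference, row):
    """Replace each '.' in an aligned row with the reference residue."""
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))

REFERENCE = "MSLLTEVETPIRNEWGCRCNGSSDPLAIAANIIGILHLILWILDR"  # PR8-34

ROWS = [  # (strain, phenotype, aligned row); a subset of the table above
    ("PR8-34", "res", REFERENCE),
    ("MON88", "res", "....................D.................T......"),
    ("WS33", "res", "....................D.....V.................."),
    ("LEN85", "res", "....................D.....VV................."),
    ("AA60", "sen", "....................D.....VV..S.............H"),
    ("KOREA68", "sen", "....................D.....VV..S......F......."),
    ("BANG79", "sen", "....................D.....VV..S.............."),
]

def discriminating_columns(rows, reference):
    """Columns (1-based) whose residues never overlap between phenotypes."""
    seqs = [(pheno, expand_dots(reference, row)) for _, pheno, row in rows]
    hits = []
    for i in range(len(reference)):
        res = {s[i] for p, s in seqs if p == "res"}
        sen = {s[i] for p, s in seqs if p == "sen"}
        if res.isdisjoint(sen):
            hits.append((i + 1, "".join(sorted(res)), "".join(sorted(sen))))
    return hits

print(discriminating_columns(ROWS, REFERENCE))
```

For this subset the only such column is 31 (N in the resistant rows, S
in the sensitive ones). The full table is less clean-cut, since several
resistant strains also carry S in that column, which is where a
multi-factor analysis becomes useful.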
ProAnWin IS USEFUL IN:
- protein structure-function and structure-activity investigations;
- designing proteins and peptides with improved activity;
- making multiple protein alignments and making sense of them;
- studying phenotype-genotype correlations;
- preparation of protein 3D pictures with sites highlighted;
- comparative protein sequence analysis.
AVAILABILITY:
ProAnWin is available (as a self-extracting archive) from the EBI
software library:
ftp://ftp.ebi.ac.uk/pub/software/dos/proanwin
and, in the Eastern Hemisphere, from the NSC software library:
ftp://ftp.bionet.nsc.ru/pub/biology/vector/proanwin.dem/paw$.exe
This version is limited in the number of sequences it can analyze.
INSTALLATION:
The files required to run ProAnWin are distributed as a single
compressed file. Create a directory "PROANWIN" on your hard disk (for
example, on drive C). Copy the file to that directory, run it from the
DOS prompt and answer Yes to all questions. To start the program, run
PROAWIN.EXE from Windows.
PROGRAM CONTENT:
Directory:
Main directory - program modules
DATA           - examples of data and output files;
                 amino acid physico-chemical properties (>50);
                 manual
ALIGNS         - 50 aligned protein family sequences
PUBLICATIONS:
1. Eroshkin A.M., Zhilkin P.A., Fomin V.I. Algorithm and computer
program PROANAL for analysis of relationship between structure and
activity in a family of proteins or peptides. CABIOS, 1993, 9,
491-497.
2. Eroshkin A.M., Minenkova O.O., Fomin V.A., Ivanisenko V.A.,
Ilyichev A.A. Analysis of peptide fragment insertions into major
coat protein of bacteriophages M13, f1 and fd. Relation of protein
structural characteristics and viability of mutant phages. Molec.
Biology (Russia), 1993, 27, 1345-1355.
3. Eroshkin A.M., Fomin V.I., Zhilkin P.A., Ivanisenko V.A.,
Kondrakhin Y.V. PROANAL version 2: multifunctional program for
analysis of multiple protein sequence alignments and studying
structure-activity relationships in protein families. CABIOS, 1995,
11, 39-44.
4. Morozov B.M., Ivanisenko V.A., Eroshkin A.M., Ugarova N.N.
Analysis of relations between bioluminescence color and the structure
of beetle luciferases: identification of the sites influencing
bioluminescence color. Molec. Biology (Russia), in press.
Comments, bug reports and suggestions for new features are welcome
and should be sent by e-mail to Alexey Eroshkin
(eroshkin at vector.nsk.su).
OTHER TOOLS AVAILABLE:
ProAnalyst, Multifunctional analysis of protein sequences and
structures (MS-DOS version of ProAnWin with additional functionality:
searching motifs, physico-chemical plots, alphabetical and
physico-chemical analysis of protein sequence variation,
structure-activity determination profile, etc.):
IUBio archive: ftp://iubio.bio.indiana.edu/molbio/ibmpc/panalys1
EMBL library: ftp://ftp.ebi.ac.uk/pub/software/dos/proanalyst
NSC library: ftp://ftp.bionet.nsc.ru/pub/biology/vector/proanaly.dem/panalys$
ProMSED, Protein Multiple Sequences EDitor for MS Windows 3.x/95 ("a
la" Word for Windows style + ClustalV + manual alignment + amino acid
coloring + more):
EMBL library: ftp://ftp.ebi.ac.uk/pub/software/dos/promsed
NSC library: ftp://ftp.bionet.nsc.ru/pub/biology/vector/promsed.dem/promsed$
IUBio archive: ftp://iubio.bio.indiana.edu/molbio/ibmpc/promsed1
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Dr. Alexey Eroshkin                 Institute of Molecular Biology
E-mail: eroshkin at vector.nsk.su   State Research Center of Virology and
Tel: +7 (3832) - 647774             Biotechnology "Vector"
Fax: +7 (3832) - 328831             Koltsovo, Novosibirsk Region 633159
                                    Russia
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++