protein homology search
olam at radium.uio.no
Tue Oct 25 05:15:50 EST 1994
In article <paul_b-201094094729 at clone2.mcb.uconn.edu>,
paul_b at biotek.mcb.uconn.edu (paul betts) wrote:
> Does anyone out there know of a database, or software to search a protein
> sequence database, or any other strategy that will allow a search based on
> protein molecular weight and/or pI?
What about the MOWSE server?
I believe I got this info and more by sending a mail with the line "help"
to mowse at dl.ac.uk (Mowse Server).
Date: Wed, 13 Oct 93 10:58:55 +0100
From: mowse at dl.ac.uk (Mowse Server)
Apparently-To: ola.myklebost at labmed.uio.no
The MOWSE peptide mass database:
Imperial Cancer Research Fund
SERC Daresbury Laboratory
D.J.C. Pappin, P. Hojrup and A.J. Bleasby
'Rapid Identification of Proteins by
Current Biology (1993), vol 3, 327-332.
InterNet server version:
Table of Contents:
[2] Construction of the MOWSE database.
[2.1] Source database.
[2.2] Calculation of Molecular weight fragments.
[3] Running database searches via e_mail.
[4] Example of mail query format.
[5] Results listing.
[6] Database structure.
[6.1] MOWSE database structure.
[6.2] The MW primary fragment molecular weight file.
[6.3] The MDX file OWL entry index.
[6.4] The SMW whole sequence molecular weight file.
[6.5] Program Requirements.
[6.6] MOWSE Scoring scheme.
[6.7] Simulation studies.
[7] General references.
Determination of molecular weight has always been an
important aspect of the characterization of biological molecules.
Protein molecular weight data, historically obtained by SDS gel
electrophoresis or gel permeation chromatography, has been used to
establish purity, detect post-translational modification (such as
phosphorylation or glycosylation) and aid identification. Until
just over a decade ago, mass spectrometric techniques were typically
limited to relatively small biomolecules, as proteins and nucleic
acids were too large and fragile to withstand the harsh physical
processes required to induce ionization. This began to change with
the development of 'soft' ionization methods such as fast atom
bombardment (FAB), electrospray ionisation (ESI) [2,3] and
matrix-assisted laser desorption ionisation (MALDI), which can
effect the efficient transition of large macromolecules from
solution or solid crystalline state into intact, naked molecular
ions in the gas phase. As an added bonus to the protein chemist,
sample handling requirements are minimal and the amounts required
for MS analysis are in the same range as, or less than, existing
As well as providing accurate mass information for intact
proteins, such techniques have been routinely used to produce
accurate peptide molecular weight 'fingerprint' maps following
digestion of known proteins with specific proteases. Such maps
have been used to confirm protein sequences (allowing the
detection of errors of translation, mutation or insertion),
characterise post-translational modifications or processing events
and assign disulphide bonds [5,6].
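The idea of a peptide mass 'fingerprint' can be sketched in a few lines of
Python. This is only an illustration, not the MOWSE code: it assumes trypsin
as the protease (the text above says only "specific proteases"), uses the
simplified rule "cleave after K or R unless the next residue is P" with no
missed cleavages, and uses standard average residue masses. The function
names are made up for this example.

```python
# Standard average residue masses in Daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added back per free peptide


def tryptic_digest(sequence):
    """Split after K or R unless followed by P (assumed, simplified rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Average molecular weight of an unmodified peptide."""
    return sum(AVERAGE_RESIDUE_MASS[r] for r in peptide) + WATER


def fingerprint(sequence):
    """Sorted list of peptide masses for one protein sequence."""
    return sorted(peptide_mass(p) for p in tryptic_digest(sequence))
```

Calling fingerprint() on each entry of a sequence database would give the
kind of fragment mass lists that an experimental mass map can be compared
against. Post-translational modifications and incomplete digestion are
ignored here.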
Less well appreciated, however, is the extent to which such
peptide mass information can provide a 'fingerprint' signature
sufficiently discriminating to allow for the unique and rapid
identification of unknown sample proteins, independent of other
analytical methods such as protein sequence analysis.
The following text describes the construction and use
of the MOWSE peptide mass database (for MOlecular Weight SEarch)
at the SERC Daresbury Laboratory. Practical experience has shown
that sample proteins can be uniquely identified using as few as
3-4 experimentally determined peptide masses when screened against a
fragment database derived from over 50,000 proteins. Experimental
errors of a few Daltons are tolerated by the scoring algorithms,
permitting the use of inexpensive time-of-flight mass
spectrometers. As with other types of physical data, such as amino
acid composition or linear sequence, peptide masses can clearly
provide a set of determinants sufficiently unique to identify or
match unknown sample proteins. Peptide mass fingerprints can prove
as discriminating as linear peptide sequence, but can be obtained
in a fraction of the time using less material. In many cases, this
allows for a rapid identification of a sample protein before
committing to protein sequence analysis. Fragment masses also
provide structural information, at the protein level, fully
complementary to large-scale DNA sequencing or mapping projects.
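To make the matching step concrete, here is a minimal Python sketch of
screening a few measured peptide masses against per-protein fragment lists
with a tolerance of a couple of Daltons. It is not the MOWSE scoring scheme
(which is not described in this part of the document); it simply counts how
many measured masses find a fragment within the tolerance. The protein
names and all masses in TOY_DATABASE are invented for illustration.

```python
from bisect import bisect_left


def count_hits(experimental, fragment_masses, tolerance=2.0):
    """Count measured masses that have a fragment within +/- tolerance Da."""
    sorted_masses = sorted(fragment_masses)
    hits = 0
    for mass in experimental:
        # First fragment that is not below the lower edge of the window.
        i = bisect_left(sorted_masses, mass - tolerance)
        if i < len(sorted_masses) and sorted_masses[i] <= mass + tolerance:
            hits += 1
    return hits


def rank_candidates(experimental, database, tolerance=2.0):
    """Return (hits, name) pairs, best match first, ties broken by name."""
    scored = [
        (count_hits(experimental, masses, tolerance), name)
        for name, masses in database.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored


# Invented toy data: fragment masses (Da) for three hypothetical proteins.
TOY_DATABASE = {
    "protein_A": [512.3, 1045.6, 1530.8, 2210.1],
    "protein_B": [498.2, 1046.9, 1788.4],
    "protein_C": [733.5, 905.0, 2208.0, 3011.7],
}

if __name__ == "__main__":
    measured = [1045.0, 1531.5, 2211.0]
    for hits, name in rank_candidates(measured, TOY_DATABASE):
        print(name, hits)
```

A plain hit count treats every match as equally informative. A real scoring
scheme would presumably weight matches, for example by how common a given
fragment mass is in the database, but that is outside this sketch.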
Ola Myklebost Email ola.myklebost at labmed.uio.no
Dept of Tumor Biology
Inst for Cancer Research Tel +47-2293-4299
The Norwegian Radium Hospital Fax +47-2252-2421
N-0310 OSLO, Norway