Mumps (the language)
okane at cs.uni.edu
Mon Jun 24 04:29:20 EST 2002
Home Page: http://www.cs.uni.edu/~okane
The Mumps language originated in the mid-1960s at Massachusetts General Hospital.
The acronym stands for "Massachusetts General Hospital Utility Multi-Programming System".
While it has been used in a number of areas, its primary application is to medicine.
Although most implementations are proprietary and consolidated in the hands of a small
number of vendors, an open source version of the language has been developed and is
distributed freely under the GNU GPL and LGPL licenses.
Mumps is potentially attractive for bioinformatics applications because:
- It supports a hierarchical data base facility. Mumps data sets are not only organized
according to traditional sequential and direct access models, but also as hierarchical trees
whose data nodes are addressed as many-level path descriptions in a manner that is easy for
a programmer to master in a relatively short time.
- The data base can also be viewed as string-indexed, many-dimensional matrices of effectively
- The underlying data base processor, Berkeley DB, can be configured for data bases up to
256 terabytes in size.
- Mumps has flexible and powerful string manipulation facilities. Its built-in string manipulation
operators and functions, which include the Perl Compatible Regular Expression Library, permit
complex string manipulation and pattern matching operations.
- This version of Mumps, unlike all others, is a compiler that translates Mumps code to C.
Mumps subroutines can be constructed so that they can be called by any other program that obeys
the C calling conventions. Each Mumps subroutine is fully functional and requires no additional
interpreters, main programs, language processors, etc., other than ordinary link libraries.
Similarly, Mumps programs and subroutines can call any other system facility that uses a C
calling structure. These features are unique to this version of Mumps and make it possible
to exploit Mumps' features in non-Mumps contexts.
- The data base can operate in standalone or client-server mode. In standalone mode, multiple
programs simultaneously access the same data base files. In client-server mode, Mumps client
programs or functions access local or remote Mumps data bases through TCP/IP or UDP protocols.
TCP/IP connections can optionally use OpenSSL encryption. These modes are selected by
compile-time switches and require no program modifications.
- Mumps programs can be used with the open source GTK-based Glade "drag and drop" GUI builder.
This permits rapid deployment of user-friendly GUIs (see references below for examples).
- Mumps routines can be used to easily construct CGI executable scripts for data base access. Mumps
programs can be called directly by the web server and have built-in facilities to parse the QUERY_STRING
web server environment variable to instantiate program variables and data (see references).
- Direct SQL commands can access PostgreSQL RDBMS data bases (can be modified for MySQL) with
the results archived to native tables (matrices) or trees.
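
To make the hierarchical, string-indexed data model described in the list above more concrete, here is a minimal sketch in Python rather than Mumps. It is only a toy in-memory analogue: the class GlobalTree and its methods are hypothetical names invented for this illustration and are not part of the Mumps compiler or its C interface. Nodes are addressed by multi-level paths of string subscripts, and next_subscript loosely mimics the way Mumps walks sibling subscripts in sorted order. Plain string ordering is assumed here, which may differ from Mumps collation for numeric subscripts.

```python
from bisect import bisect_right


class GlobalTree:
    """Toy stand-in for a Mumps-style tree-structured array (hypothetical, not the Mumps API)."""

    def __init__(self):
        # Each node is stored under its full path: a tuple of string subscripts.
        self._nodes = {}

    def set(self, *path_and_value):
        """Store a value; the last argument is the value, the rest is the path."""
        *path, value = path_and_value
        self._nodes[tuple(str(p) for p in path)] = value

    def get(self, *path, default=""):
        """Return the value stored at path, or default if no node exists there."""
        return self._nodes.get(tuple(str(p) for p in path), default)

    def children(self, *path):
        """Return the sorted, distinct subscripts found one level below path."""
        prefix = tuple(str(p) for p in path)
        n = len(prefix)
        return sorted({k[n] for k in self._nodes if len(k) > n and k[:n] == prefix})

    def next_subscript(self, *path):
        """Return the sibling subscript that sorts after the last element of path.

        Passing "" as the last element yields the first sibling; "" is returned
        when there are no more siblings.
        """
        *parent, current = (str(p) for p in path)
        siblings = self.children(*parent)
        i = bisect_right(siblings, current)
        return siblings[i] if i < len(siblings) else ""


# Example: sequence-match scores filed under query and hit identifiers (made-up data).
db = GlobalTree()
db.set("blast", "query1", "hit1", "score", 812)
db.set("blast", "query1", "hit2", "score", 640)
db.set("blast", "query2", "hit1", "score", 95)

# Walk every hit under query1 in subscript order.
hit = db.next_subscript("blast", "query1", "")
while hit != "":
    print(hit, db.get("blast", "query1", hit, "score"))
    hit = db.next_subscript("blast", "query1", hit)
```

The same structure can be read either as a tree (each subscript is one level of the path) or as a sparse many-dimensional matrix indexed by strings, which is the dual view the list describes. A real Mumps data base keeps these nodes on disk rather than in a dictionary.
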
Initial testing has been done using Mumps in connection with the NCBI BLAST software
(ftp.ncbi.nih.gov/blast/demo). In the test, data were moved directly from the "doblast"
example output routines to a Mumps tree-structured data base and subsequently accessed
without problems. There appear to be no compatibility issues involved with using Mumps
with the NCBI Toolkit. The prototype code is given in the references below. It
demonstrates, albeit somewhat trivially, an easy way to organize sequence matching data
We would be very interested in any suggestions on how we might extend this work to make
it more useful for bioinformatics applications, as well as suggestions for demonstration projects.
As noted, all the software is open source and licensed under the GNU GPL/LGPL. The main web page for this work, which
includes coding examples, manuals and so forth, is:
The direct link to the documentation is:
The link to the BLAST example is:
The source code is at:
The main development and testing vehicle is Linux.
Kevin C. O'Kane
Department of Computer Science
University of Northern Iowa
Cedar Falls, IA 50614-0507
okane at cs.uni.edu