Mumps (the language), hierarchical and multi-dimensional data bases
okane at cs.uni.edu
Mon Jun 24 04:28:45 EST 2002
The Mumps language originated in the mid-60's at the Massachusetts General Hospital.
The acronym stands for "Massachusetts General Hospital Utility Multi-Programming System".
It has been used in a number of areas, but its primary application is in medicine.
Although the proprietary implementations have consolidated into the hands of a small
number of companies, we have developed an open source version of the language which is
distributed freely under the GNU GPL and LGPL licenses.
Mumps is potentially attractive for bioinformatics applications because:
- It supports a hierarchical data base facility. Mumps data sets are not only organized
along traditional sequential and direct access methods, but also as hierarchical trees
whose data nodes are addressed as path descriptions in a manner which is easy for a programmer
to master in a relatively short time.
- The data base can also be viewed as string-indexed, multidimensional matrices of effectively
- The underlying data base processor, the Berkeley DB, can be configured for data bases up to
256 terabytes in size.
- It has flexible and powerful string manipulation facilities. Mumps built-in string manipulation
operators and functions, which include the Perl Compatible Regular Expression Library, permit
complex string manipulation and pattern matching operations.
- This version of Mumps, unlike all others, is a compiler that translates Mumps code to C.
Mumps subroutines can be constructed which can be called by any other program that obeys
the C calling conventions. Similarly, Mumps programs and subroutines can call any other
system facility that uses a C calling structure. This feature is unique to this version of
- The data base can operate in standalone or client-server mode. In standalone mode, multiple
programs can simultaneously access the same data base files. In client-server mode, Mumps
client routines can access local or remote Mumps data bases through TCP/IP or UDP connections.
TCP/IP connections have the option of using OpenSSL encryption. These are compile-time switch
options and require no specific program modifications to use.
- Mumps programs can be used with the Gtk-based Glade "drag and drop" GUI builder. This permits
rapid deployment of user-friendly GUIs (see references below for examples).
- Mumps routines can be used to easily construct CGI scripts for data base access. Mumps programs
can be called directly by the web server and have builtin facilities to parse the QUERY_STRING
environment variable to instantiate program variables and data (see references).
- There are builtin commands to access PostgreSQL RDBMS data bases (can be modified for MySQL).
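
To make the hierarchical, string-indexed view of the data base described in the list above more concrete, here is a minimal sketch in Python (not Mumps) of a store whose nodes are addressed by a path of string subscripts. The class name GlobalArray, its methods, and the sample gene entries are all hypothetical and chosen only for illustration; the sketch keeps everything in memory, uses plain string ordering rather than the Mumps collating sequence, and says nothing about how the actual Berkeley DB back end is organized.

```python
from bisect import bisect_left, insort


class GlobalArray:
    """Toy stand-in for a path-addressed, string-indexed tree (hypothetical API)."""

    def __init__(self):
        self._keys = []    # sorted list of subscript tuples
        self._values = {}  # subscript tuple -> stored value

    @staticmethod
    def _key(path):
        # Every subscript is treated as a string, as in a string-indexed matrix.
        return tuple(str(s) for s in path)

    def set(self, path, value):
        key = self._key(path)
        if key not in self._values:
            insort(self._keys, key)  # keep keys sorted for ordered traversal
        self._values[key] = value

    def get(self, path, default=""):
        return self._values.get(self._key(path), default)

    def children(self, path=()):
        """Return the distinct next-level subscripts below `path`, in sorted order."""
        prefix = self._key(path)
        n = len(prefix)
        found = []
        # All keys that extend `prefix` are contiguous in the sorted key list.
        for key in self._keys[bisect_left(self._keys, prefix):]:
            if key[:n] != prefix:
                break
            if len(key) > n and (not found or found[-1] != key[n]):
                found.append(key[n])
        return found


# Illustrative data only: each node is addressed by a path description.
db = GlobalArray()
db.set(("gene", "BRCA1", "chromosome"), "17")
db.set(("gene", "BRCA1", "organism"), "Homo sapiens")
db.set(("gene", "TP53", "chromosome"), "17")

print(db.get(("gene", "BRCA1", "organism")))
print(db.children(("gene",)))
```

The same structure can be read either as a tree (walk down from "gene" with children) or as a sparse multidimensional matrix indexed by strings (look up a full subscript tuple with get), which is the dual view the list above describes.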
We have done some initial testing of Mumps in connection with the NCBI BLAST
software. In the test, we moved data directly from the "doblast" example output
routines (ftp.ncbi.nih.gov/blast/demo) to a Mumps data base without problems. As
a result, there appear to be no compatibility issues. An example is given in the
We would be very interested in any suggestions regarding how we might extend this work to
make it more useful for bioinformatic applications.
All the software is open source under the GNU GPL/LGPL. The main web page for this work, which
includes coding examples, manuals and so forth, is:
The direct link to the documentation is:
The link to the BLAST example is:
The source code is at:
The main development vehicle is Linux.
Kevin C. O'Kane
Department of Computer Science
University of Northern Iowa
Cedar Falls, IA 50614-0507
(319) 273 7322 (Office + Voice Mail)
(319) 266 4131 (Iowa)
(508) 778 9485 (Massachusetts)
okane at cs.uni.edu