Plug-in modules for gap4 (long)
jkb at mrc-lmb.cam.ac.uk
Thu Dec 12 10:46:06 EST 1996
(I apologise in advance for the level of technical detail in this
Part of the current work on gap4 (and, later, other programs) is
in tidying up and documenting the scripting system. The final aim of
this is to allow people to integrate their scripts and programs
in a more seamless and robust manner than before. These scripts are
much more flexible than sending answers to questions posed by the
likes of the old gap program. Rather, a complete language (tcl) is used.
The notion of scripts can be taken one step further to allow
complete new components of the programs to be "plugged in". The end
user would simply see new options on the gap4 menus that operate in
exactly the same manner as the existing options. Adding new modules
would be as simple as adding a single line to your .gaprc file.
To the developer, the modules must be compiled (they can consist of
both compiled object files (eg C or Fortran) and Tcl) and linked in
particular ways, making use of the documented programming interfaces.
The benefits are many, of which the obvious ones are:
a) Ease of use for users.
b) Ability to use other groups' extensions without needing to recompile
or write messy scripts.
c) Mechanism allows commercial distribution of extensions, with no
need for sources.
d) In the long term, greater choice (eg of assembly engines) without
the need for detailed unix knowledge.
At its current stage of development the system does allow this
seamless plug-in on most systems, though it is somewhat tricky on
some (SunOS 4 being the worst culprit here). As a test, we've
successfully reintegrated the Cap2 and Fakii assembly engines as
The reason for this news/mail is to get feedback on the likely uses,
and hopefully suggestions for improvements. I'll explain briefly the
system as it stands.
Each module consists of an 'rc' file in the $STADENROOT/tables directory
containing commands to add items to the menus. This file can also
specify various tunable parameters (such as those found in gaprc) and
dependencies on other modules.
To make use of a module a "load_package module_name" command is then
added to (eg) the .gaprc file. In the simplest form, this adds
$STADENROOT/lib/module_name to the tcl search path and, if present,
dynamically loads the $STADENROOT/lib/libmodule_name.so library and
calls the module_name_init routine.
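
As a rough illustration of the steps just described, here is a minimal
sketch in Python (not the language gap4 uses, and not the actual
implementation). It only models the naming convention: which directory is
added to the search path, which library file is looked for, and which init
routine would be called. The function names, the "mymodule" module name, the
"/opt/staden" root and the lib_exists parameter are all hypothetical, chosen
for the example.

```python
import os
import posixpath


def resolve_package(staden_root, module_name):
    """Derive the three names the post associates with a module.

    This mirrors the convention described in the text only; the real
    load_package command may do more (or differ in detail).
    """
    lib_root = posixpath.join(staden_root, "lib")
    return {
        # Directory added to the tcl search path
        "tcl_dir": posixpath.join(lib_root, module_name),
        # Shared library that is loaded if it is present
        "shared_lib": posixpath.join(lib_root, "lib" + module_name + ".so"),
        # Init routine called after the library is loaded
        "init_symbol": module_name + "_init",
    }


def load_package(staden_root, module_name, search_path,
                 lib_exists=os.path.exists):
    """Model the simplest form of load_package.

    Appends the module directory to search_path (once), then returns
    (library path, init routine name) if the library exists, else None.
    lib_exists is a hypothetical hook so the check can be faked in tests.
    """
    plan = resolve_package(staden_root, module_name)
    if plan["tcl_dir"] not in search_path:
        search_path.append(plan["tcl_dir"])
    if lib_exists(plan["shared_lib"]):
        return plan["shared_lib"], plan["init_symbol"]
    return None
```

A Tcl-only module (no compiled library) would take the None branch: its
directory is still added to the search path, but nothing is dynamically
loaded.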
We've now come to a decision point in our development over how this
mechanism should be used for our programs. As I see things there
are three obvious strategies. I'll take gap4 as an example.
1) Carry on building gap4sh as currently (gap4sh is the extended wish
interpreter; gap4 itself is just a script that runs gap4sh -f
newgap_top.tcl). The current mechanism is that our own libraries
(io_lib, tk_utils, Misc, etc) are static archives, copied into the
gap4sh executable at compile time.
2) Build our own libraries (tk_utils etc) as dynamic libraries, and
then build gap4sh as before, except against these dynamic copies. This
has the problem that we need to build the dynamic libraries, which
varies slightly on each machine, but is perfectly doable.
3) Build our libraries as dynamic, and use a standard wish style shell
that loads all these libraries at run time as modules using the above
mentioned scheme. This works fine on alpha, solaris 2 and sgi systems
(I haven't tested linux or windows yet), but not on SunOS 4. I
think this is mainly due to a bug in the SunOS 4 implementation of
dlopen (or my understanding of it) that is potentially possible to
Does method 3 provide any improvements over method 2 for anyone?
Personally I cannot see many obvious advantages other than that it
serves as an example of how to create modules. One other advantage is
that method 3 allows for single process applications that combine the
features of one or more programs. An example of this would be to have
the nucleotide analysis routines available directly from within
gap4 without needing to recompile. Perhaps, a better method for this
would be program interaction/communication.
The problems I've noticed with method 3 would also be met by any third
parties writing their own modules to run on this many machines, but
things here are complicated further by the odd remnant of Fortran
code. So far I haven't noticed anything that makes this method
impossible on any of the machines I've tried.
Your comments and suggestions are most welcome,
James Bonfield (jkb at mrc-lmb.cam.ac.uk) Tel: 01223 402499 Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/