FASTLINK 2.3P available, runs in parallel

Alex Schaffer schaffer at cs.rice.edu
Fri Aug 4 09:37:11 EST 1995


The purpose of this note is to announce the availability of FASTLINK 2.3P.

FASTLINK is a faster version of the principal linkage analysis programs
in LINKAGE 5.1.

Thanks to Lucien Bachner, Carolyn Bucholtz, John Powell, Gyorgy Simon,
Jim Tomlin, and Garret Taylor for assistance with beta testing and
portability testing of the parallel code.

Thanks to Margaret Gelder Ehm, Carol Haynes, Patricia Kramer,
Toby Nygaard, Marcy Speer, Gerard Tromp, and Frank Visser for
bug reports and suggestions that helped in developing version 2.3P.

Thanks to Anita Destefano and Kimmo Kallio for assistance with portability
to VMS.

As with previous versions, FASTLINK 2.3P can be ftp-ed from
softlib.cs.rice.edu in the directory
         pub/fastlink

The main advance over FASTLINK 2.2 is that much of the code can now
run in parallel, which explains the P in the new version number. Most of
this message focuses on the parallel code, but some remarks about the
sequential code come first, so that those who want only sequential
code can skip the rest.

|*| Sequential Code
    ---------------

Version 2.3P has some improvements in the sequential code
and documentation, which are covered at the end of README.updates.
New features include speeding up runs involving multiple LINKMAP scripts
that move a marker across a fixed map.
The organization of the code files and Makefile has changed substantially, so
that the same files can be used to make both sequential and parallel 
executable files. 
There is a new auxiliary program called ofm ("optimize for maxhap") to
assist with automatic recompilation of the programs.

|*| Parallel Code, Introduction
    ---------------------------
The (sequential) FASTLINK package already provides considerable running
time improvements over the older programs for the LINKAGE package. 
Response from users about sequential FASTLINK has been extremely
enthusiastic, and yet it is abundantly clear that more speedup is necessary.
At this time, we believe that one realistic way to obtain substantially
more speedup on long runs is to use multiple processors in parallel. 
We continue to investigate further sequential speedups.

Two attempts to parallelize ILINK from FASTLINK are described in the papers:

3. Sandhya Dwarkadas, Alejandro A. Schaffer, Robert W. Cottingham Jr.,
   Alan L. Cox, Peter Keleher, and Willy Zwaenepoel, Parallelization of
   General Linkage Analysis Problems, Human Heredity 44(1994),
   pp. 127-141.

4. Sandeep K. Gupta, Alejandro A. Schaffer, Alan L. Cox, Sandhya
   Dwarkadas, and Willy Zwaenepoel, Integrating Parallelization
   Strategies for Linkage Analysis, Computers and Biomedical Research
   28(1995), pp. 116-139.

These two papers are available as paper3.ps and paper4.ps with the
distribution.  The version of parallel ILINK that we are distributing
is similar algorithmically to that described in the second paper.  We
were able to achieve speedups in the 5 to 7 range on a network of 8
DECStation5000/Ultrix processors on ILINK runs that take tens of
minutes sequentially. 
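
To put the speedup figures above in perspective, here is a small
arithmetic sketch in Python. It is not part of FASTLINK. The function
names are made up for illustration, and the 40-minute sequential run
time is an assumed value standing in for "tens of minutes". Only the
speedups of 5 and 7 and the 8 processors come from the text above.

```python
def parallel_efficiency(speedup, processors):
    """Speedup divided by processor count; 1.0 would be perfect scaling."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


def parallel_runtime_minutes(sequential_minutes, speedup):
    """Run time expected in parallel, given the sequential time and a speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return sequential_minutes / speedup


PROCESSORS = 8              # network size reported above
ASSUMED_SEQUENTIAL_MIN = 40  # hypothetical run time, for illustration only

for speedup in (5, 7):       # ends of the reported speedup range
    eff = parallel_efficiency(speedup, PROCESSORS)
    mins = parallel_runtime_minutes(ASSUMED_SEQUENTIAL_MIN, speedup)
    print(f"speedup {speedup}: efficiency {eff:.1%}, "
          f"{ASSUMED_SEQUENTIAL_MIN} min sequential -> {mins:.1f} min parallel")
```

Under these assumptions, a speedup of 5 to 7 on 8 processors corresponds
to a parallel efficiency between 62.5% and 87.5%.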

|*| Parallel FASTLINK, Operation
    ----------------------------

FASTLINK 2.3P can be run in parallel on two different types of platforms:
shared-memory multiprocessors and networks of UNIX workstations.

FASTLINK on shared-memory multiprocessors:

The shared-memory version uses the p4 macros, which are available by
anonymous ftp from Argonne National Laboratory. More detailed retrieval
and installation instructions can be found in the file README.p4, which
comes with FASTLINK.

FASTLINK on a network of workstations:

If you have access to a network of (uniprocessor) Unix workstations,
then you can run parallel FASTLINK using a runtime package called
TreadMarks. TreadMarks essentially provides the same execution
environment on a network of workstations as that available on
shared-memory multiprocessors.

TreadMarks is available for a small fee for universities and medical
schools, and at commercial rates for other institutions. All users
can get a 30-day free trial license. See README.TreadMarks for
more details on how to configure FASTLINK with TreadMarks.


TreadMarks licenses can be obtained by sending e-mail to 
treadmarks at ece.rice.edu. Please specify the nature of your organization
(commercial or university/medical school) and the machine architecture
and operating system you plan to use TreadMarks for. Once you return the
signed license and the license fee, a copy of TreadMarks will be sent
to you or be made available via ftp. A free 30-day demo copy can also
be obtained by sending e-mail to the same address.

We recognize that installing the parallel code is more
difficult than installing the sequential code because of the need to
configure both the FASTLINK code and the runtime library (either p4 or
TreadMarks). We will be pleased to work with you in getting the parallel
code up and running on your system.

|*| Parallel FASTLINK, References
    -----------------------------

The main references for sequential FASTLINK (1 and 2) and for
LINKAGE (5, 6, and 7) are:

 1. R. W. Cottingham Jr., R. M. Idury, and A. A. Schaffer, Faster Sequential 
 Genetic Linkage Computations, American Journal of Human Genetics, 53(1993),
 pp. 252-263.


 2. A. A. Schaffer, S. K. Gupta, K. Shriram, and R. W. Cottingham, Jr.,
 Avoiding Recomputation in Genetic Linkage Analysis, Human Heredity,
 44(1994), pp. 225-237.

 5. G. M. Lathrop, J.-M. Lalouel, C. Julier, and J. Ott, Strategies for
 Multilocus Analysis in Humans, PNAS 81(1984), pp. 3443-3446.

 6. G. M. Lathrop and J.-M. Lalouel, Easy Calculations of LOD Scores
 and Genetic Risks on Small Computers, American Journal of Human Genetics,
 36(1984), pp. 460-465.

 7. G. M. Lathrop, J.-M. Lalouel, and R. L. White, Construction of Human
 Genetic Linkage Maps: Likelihood Calculations for Multilocus Analysis,
 Genetic Epidemiology 3(1986), pp. 39-52.

One reference for p4 is:

 8. R. Butler and E. Lusk. Monitors, Messages and Clusters: The p4
 Parallel Programming System, Parallel Computing 20(1994), pp. 547-564.

One reference for TreadMarks is:

 9. P. Keleher, A. L. Cox, S. Dwarkadas, and W. Zwaenepoel,
 TreadMarks: Distributed Shared Memory on Standard Workstations
 and Operating Systems, Proceedings of the Winter 94 Usenix Conference,
 pp. 115-131, January 1994.

FASTLINK 2.3P represents the conjunction of 5 substantial research
efforts and software engineering projects. Therefore, if you use 
FASTLINK in parallel, we ask that you cite:

at least one of 5,6,7 to give credit to LINKAGE
at least one of 1,2 to give credit to sequential FASTLINK
at least one of 3,4 to give credit for the parallel algorithms and
either 8 (p4) or 9 (TreadMarks) to give credit for the runtime library
  that you use.


Sincerely,
Chris Hyams and Alejandro Schaffer and Alan Cox 
and Sandhya Dwarkadas and Willy Zwaenepoel
Rice University
cgh at cs.rice.edu
schaffer at cs.rice.edu
alc at cs.rice.edu
sandhya at cs.rice.edu
willy at cs.rice.edu



