bionet.molbio.gene-linkage FREQUENTLY ASKED QUESTIONS (part 1 of 3)

Conan the Librarian rootd at
Sun Nov 20 03:50:10 EST 1994


By Darrell Root
rootd at

FAQ admin information

   Where can I obtain the bionet.molbio.gene-linkage FAQ? 
   Who created the bionet.molbio.gene-linkage FAQ? 
   What other people contributed to this FAQ? 
   How can I help improve this FAQ? 
   What kind of information will never be contained in this FAQ? 

Information Resources

   What anonymous-ftp sites have programs/utilities useful for genetic
   linkage analysis? 
   I think I know the name of a program I want, but I don't know where I
   can find it 
   I have an ftp site with gene-linkage programs/utilities on it. How do I
   get registered with the archie servers? 
   What gopher sites have useful genetic-linkage information? 
   What books are helpful when learning about genetic linkage analysis? 
   What genetic-linkage databases are available on the internet? 
   What is WWW? 
   What is Mosaic? 
   What is lynx? 
   I can telnet to the internet. Can I access the web? 
   What www sites have useful genetic-linkage information? 
   What "linkage centers" make information and assistance available to
   What journals are useful for genetic-linkage analysis? 

Gene-linkage software overview

   What database management programs do people use for
   genetic-linkage data? 
   What programs are available for pedigree drawing? 
   Why are some programs used primarily for chromosome mapping,
   while others are used for disease-mapping? 
   What programs are used for chromosome mapping? 
   What programs are used for disease-gene mapping? 
   What programs are available to help detect errors in linkage data? 
   What is Cyrillic? 
   Programs to assist in the recoding of genetic markers 

Linkage package specific information

   How do you calculate MAXHAP? 
   When should you use binary coding instead of numeric allele coding? 
   What is the effect of having allele frequencies not add up to 1, e.g.,
   when some alleles are not present in a pedigree under study? 
   I use LINKAGE and/or FASTLINK. What references should I include
   in my papers? 
   A discussion of recoding alleles in linkage analysis 

Computer administration and optimization

   How can I increase the speed of the linkage/fastlink package on my
   I set up 300 megs of paging space on my workstation, but now I'm
   running out of hard-drive. Is there any way I can use my hard drive
   space more efficiently? 
   But I don't know how to do all this optimization, and my research
   assistant is spending all his/her time trying to figure it out. 
   How can I identify how much paging space is available on my

File format specification and conversion

   How do I convert between crimap and linkage formats? 
   How do I get my ceph data into crimap format? 

Educational resources for teaching genetics

   Genetics construction kit--fly genetics simulator 


   Where can I obtain the bionet.molbio.gene-linkage FAQ? [rootd;29may94]

   It is available by anonymous-ftp from: in

   The best way to view the faq is via the www, from 

   I also send the FAQ to news.answers, and to Dave Kristofferson, so it
   should be included in the "standard" FAQ archives. Of course, I won't
   be able to test that till after this goes out :-(

   Who created the bionet.molbio.gene-linkage FAQ? [rootd;19nov94]

   I am Darrell Root, and I'm editing this in my own time. Unfortunately,
   I don't have all that much free time, so this FAQ is sorta haphazard and
   has some obvious holes (for example, some of the "software packages
   for linkage analysis" answers point out ftp sites which are not included
   in the "ftp site list"). In addition, I haven't double-checked much of the
   information which I received from people (and I may have made a typo
   or two), so if something appears incorrect, you're probably right. 

   Many thanks to everyone who sent me tons of information after the
   FAQ revision 1. Unfortunately, that's when things started to get "busy"
   and I'm just now doing the update (SIX MONTHS LATER). In
   addition, I moved the faq www site from to Sorry about that. 

   Tim Trautmann (timt at adapted the FAQ for www/Mosaic
   use (before I learned html). He's responsible for all the wonderful
   hypertext/ftp links. Great work Tim! (I'm afraid my hurried edits to
   get this revision out have not been perfect, and the FAQ's formatting is
   a little messed up--this is entirely my fault due to my haste: timt's
   formatting was perfect...) 

   This FAQ is not perfect; in fact, it's not even pretty. During my 18
   months doing linkage analysis work, I searched the net trying to find
   stuff, and used up a bunch of time. This FAQ is sufficiently
   disorganized that it may take you half-a-day to sort through it, but I
   hope that will save you some time.

   On a personal note, I'm continuing my career as a system
   administrator, and am no longer doing genetic linkage analysis. If I
   have time, I'll incorporate corrections/additions that people email me
   (rootd at, but I'm not actively searching/editing the faq. In
   addition, someone who is doing linkage analysis would almost
   certainly do a better job (assuming they have the time :-). For this
   reason, I'm placing this FAQ in the public domain so anyone who
   wants to take over editing it can do so without restriction. If you have
   the time, and want to be a FAQ maintainer, send me some email. 

   My eternal thanks to those who sent me information. My repeated
   apologies for not updating the FAQ for six months. 

   What other people contributed to this FAQ? [rootd;21may94] 
      Matthias Wjst sent us tons of useful material 
      David Kikuchi pointed out the genbank gopher sites 
      Pierre Janssens forwarded me some usenet answers, and
      described Cyrillic. 
      Bennett Dyke provided information on his version of peddraw. 
      Michael Boehnke supplied a postal address to obtain simlink. 
      Don Bowden gave us a lead in finding a .gen->linkage
      Young B Choi posted a list of journals to the net. 
      Robert Stodola sent us info about the chlc. 
      Ellen Wijsman gave a nice answer to the allele frequency
      question on the net. 
      Jurg Ott sent us tons of corrections, clarifications, and new
      Peter Doris helped identify a problem with our ftp site. 
      David Adler told us about the idiograms at the University of
      Tim Littlejohn posted a gopher site with conference schedules. 
      John Attwood put his ceph2cri program on the net. 
      Rob Harper posted how people can use telnet to access the
      David Featherstone sent info about fastlink on SGI's, and on
      Dave Curtis posted about DOLINK, the automatic recoder. 
      Kim Worley posted a web site. 
      Mike Miller sent me some info on LABMAN and LINKMAN. 
      Tara Matise sent me 18 separate pieces of information! Thanks! 
      Eli Meir posted about the Genetics Construction Kit (fly
      genetics simulator). 

   I'm afraid some other people sent me stuff, some of which was
   included, and some of which was lost (been a hectic half-year). My
   apologies. Feel free to send me some nasty email (or a correction, or to
   claim credit for something). 

   How can I help improve this FAQ? [rootd;19may94]

   Think back to the old times. What do you understand now, that you
   didn't understand then? What lack of knowledge caused you to waste
   the most time? What information would have helped you become
   productive more quickly? Share your hard-earned lessons with others! 

   There are a couple areas where I'd like to specifically request
    1. Internet resources: there are tons of ftp/gopher/www sites out
      there. Nobody knows them all. Help me compile a complete
      list. Send me the site addresses and a brief description of what's
    2. File format conversion programs: I want programs to convert
      between the different file formats (crimap's .gen, ped.out,
      linkage, simlink, peddraw (mac), liped, etc.). I'd like to compile
      a "complete-set" of file conversion programs. I particularly
      want source for Santosh Mishra's mkcrigen (ped.out -> .gen)
    3. An ftp site for crimap, simlink, mkcrigen, and the crimap
      utilities package 
    4. Programs for manipulation, analysis, and comparison of .gen
    5. I'd like plenty of "linkage-101" and "crimap-101" questions.
      What did you waste most of your time on? 
    6. If somebody wants to formally specify some of the file formats,
      and give a small example (or two) for each, I'd appreciate it. 

   What type of information will never be contained in this FAQ?

   Conference schedules/information (too volatile for a FAQ, let the
   journals handle it...but there's a nice gopher site in our gopher section

   I sent you some information, and you either: didn't include it, or didn't
   give me credit. What can I do? [rootd;29may94] 

   Oops. My mistake. I tried to keep a list of everyone and their
   contribution, but didn't completely succeed (translation: I failed). My
   apologies. Send me email and I will make appropriate corrections... 


   What anonymous-ftp sites have programs/utilities useful for genetic
   linkage analysis? [rootd;29may94]

      keeps a UNIX version of LINKAGE
      (Lathrop/Lalouel/Julier/Ott). They also have the PC
      version, but it doesn't appear to have been updated since
      July-1991.

      has the PC and VMS versions of
      LINKAGE, and also other programs such as HOMOG, LIPED,
      SLINK, and some programs from Dr. Newton Morton (LDB,
      MAP-LODS, POINTER). In addition, all Linkage Newsletters
      are kept online.

      has FASTLINK, the optimized C versions of
      linkage(5.1) which continue to undergo massive improvements.

      has some stuff in
      /non-gdb-data/NIH-CEPH-data/CEPH-DATA/src, including
      possibly the CRI-MAP utility programs (by Todd Steinbrueck
      of Helen Donis-Keller's lab).

      has Multimap, a lisp-based expert
      system for automated construction of genetic linkage maps
      using the CRI-MAP program.

      has some stuff from Dan Weeks,
      including his APM programs and SLINK. Here's the info he
      sent me: 

      At, you'll find the following
      files in the pub directory after logging in via
      anonymous ftp:

      newapm.tar.Z contains the package of programs
      for the Affected Pedigree Member (APM) Method of
      Linkage Analysis.
      slink.tar.Z contains the SLINK package of
      programs for simulation of genetic data.
      cintmax.tar.Z contains a modified version
      of CILINK which permits the usage of
      different map functions in computing the
      simapm.tar.Z contains the SLINK-based
      simulation program for the APM package.
      This represents a hacked together package
      which only runs under a Unix system.  You
      will need FORTRAN, Pascal, and C compilers
      to use this package.

      has some useful IBM programs, including: 
         peddraw (a DOS pedigree drawing
         program--completely different from the B. Dyke
         Macintosh peddraw 4.x) 
         fastmap produces a quick approximation to multipoint
         lod scores 
         dolink A DOS genetic database/analysis-setup program 
         easistat A simple DOS statistics package 
         easigraf Draws graphs of lod scores

      has the above IBM programs, as well as the
      ceph2cri program from John Attwood. ceph2cri reads your
      ped.out file and creates a crimap .gen file for you.

      is the Cooperative Human Linkage Center's ftp
      site.

      is the home of GNU (the free software
      foundation) which produces free software (such as the gcc
      compiler, and the emacs editor).

      is the largest anonymous ftp-site on the
      planet. They have the whole GNU/free software foundation
      distribution, and tons of other stuff.

      has all the files for OMIM (online mendelian
      inheritance in man) and GDB (genome-data-base). Searching
      within the search program is much easier.

      has telnet, gopher, and mosaic clients for
      many different types of computers. Ever wonder where "ncsa
      telnet" was from? This is it.

      in /pub/users/cat/rootd is where I put the latest
      FAQ version, my linkage->peddraw sed/awk script, and any
      other stuff that program authors decide to let me put on my ftp
   NOTE: crimap and simlink are not currently available from
   anonymous ftp sites. 

   There are many more sites with useful stuff. Email information to
   rootd at and I will add them to this list. 

   I think I know the name of a program I want, but I don't know where I
   can find it. [rootd;21may94] 

   There is a database program called archie, which maintains a list of all
   files in registered anonymous-ftp sites. You can telnet to an archie
   server, and have it search the database. Each site is updated every 30
   days, so very recently posted programs might not be listed yet. 

   To use archie, you need to telnet to one of the archie server sites, which

   (thanks to O'Reilly's Internet book for this list) 

   Use the login name "archie" and nothing as your password. Here is a
   simple archie login and search: 

   bigbox% telnet
   login: archie
   password:       <--just hit return, unlike anonymous-ftp

   unl-archie> find linkmap
   # Search type: sub.
   # Your queue position: 2
   # Estimated time for completion: 00:24
   working... -

   Host    (
   Last updated 21:04  9 Apr 1994

   Location: /contrib/src/pa/m3-2.07/src/driver/boot-DS3100
   FILE    -rw-r--r--    4000 bytes  23:00  2 Jun 1992  M3LinkMap_i.c
   FILE    -rw-r--r--   14027 bytes  23:00  2 Jun 1992  M3LinkMap_m.c

   Location: /contrib/src/pa/m3-2.07/src/driver/linker/src
   FILE    -rw-r--r--    1307 bytes  00:00  4 Dec 1991  M3LinkMap.i3
   FILE    -rw-r--r--    3078 bytes  00:00  4 Dec 1991  M3LinkMap.m3


   Unfortunately, these linkmap programs have nothing to do with
   Lathrop and Ott's linkage package. Most gene-linkage programs are
   not on archie-registered ftp sites. 
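
   If you save an archie session to a file, a short script can pull the
   file names, sizes, and directories out of the results. The following is
   only a rough sketch in Python, not part of archie itself. It assumes the
   output looks like the sample session above ("Location:" lines followed
   by "FILE" lines); the names ArchieFile and parse_archie_listing are
   made up for this illustration.

```python
import re
from typing import List, NamedTuple


class ArchieFile(NamedTuple):
    """One FILE entry from an archie result (hypothetical helper type)."""
    location: str
    permissions: str
    size_bytes: int
    name: str


# Assumed layout, taken from the sample session in this FAQ:
#   Location: /some/directory
#   FILE    -rw-r--r--    4000 bytes  23:00  2 Jun 1992  M3LinkMap_i.c
LOCATION_LINE = re.compile(r"^\s*Location:\s*(?P<path>\S+)\s*$")
FILE_LINE = re.compile(
    r"^\s*FILE\s+(?P<perms>\S+)\s+(?P<size>\d+)\s+bytes\s+"
    r"\d{1,2}:\d{2}\s+\d{1,2}\s+\w{3}\s+\d{4}\s+(?P<name>\S+)\s*$"
)


def parse_archie_listing(text: str) -> List[ArchieFile]:
    """Collect FILE entries, remembering the most recent Location line."""
    files = []
    location = ""
    for line in text.splitlines():
        match = LOCATION_LINE.match(line)
        if match:
            location = match.group("path")
            continue
        match = FILE_LINE.match(line)
        if match:
            files.append(ArchieFile(location, match.group("perms"),
                                    int(match.group("size")),
                                    match.group("name")))
    return files


# The result lines from the sample session above.
SAMPLE = """\
   Location: /contrib/src/pa/m3-2.07/src/driver/boot-DS3100
   FILE    -rw-r--r--    4000 bytes  23:00  2 Jun 1992  M3LinkMap_i.c
   FILE    -rw-r--r--   14027 bytes  23:00  2 Jun 1992  M3LinkMap_m.c

   Location: /contrib/src/pa/m3-2.07/src/driver/linker/src
   FILE    -rw-r--r--    1307 bytes  00:00  4 Dec 1991  M3LinkMap.i3
   FILE    -rw-r--r--    3078 bytes  00:00  4 Dec 1991  M3LinkMap.m3
"""

if __name__ == "__main__":
    for entry in parse_archie_listing(SAMPLE):
        print(f"{entry.location}/{entry.name} ({entry.size_bytes} bytes)")
```

   Lines that match neither pattern (queue position, "working...", host
   headers) are skipped, so the whole session log can be fed in as-is.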

   I have an ftp site with gene-linkage programs/utilities on it. How do I
   get registered with the archie servers? [rootd;15may94] 

   Send email to archie-admin at with the domain-name of
   the ftp site and the email address of the administrator. If you are the
   administrator of the ftp-site, identify yourself as such. 

   What gopher sites have useful genetic-linkage information?
   [rootd;21may94]

      has background information on the human
      genome project, and archives of the "Human Genome News"
      newsletter.

      is also a gopher site which can access
      genbank. It also has a link to the genethon gopher site.

      is the genethon gopher site.

      is the National Institutes of Health gopher. It can
      access genbank, as well as other stuff.

      is also a gopher site which can access genbank.

      has all information released by the cooperative
      human linkage center.

      70 (I think that's a port
      number) has human and mouse standard idiograms. The
      idiograms are useful for making illustrations for gene mapping,
      i.e. physical, and for constructing abnormal chromosome
      illustrations, like translocations, deletions, etc. The PostScript
      versions produce high quality output - can be sent to lino for
      publication figures. The PostScript idiograms can be
      manipulated band-by-band with illustration software such as
      Adobe Illustrator, Aldus FreeHand, Canvas, Altsys Virtuoso,
      etc.

      has information on conferences,
      and other stuff in: 

      -->  5.  Computational Molecular Biology- programs, documents, help/
      -->  14. Upcoming-Conferences/

   What books are helpful when learning about genetic linkage analysis?

   Jurg Ott's Analysis of Human Genetic Linkage is THE work in this
   area. It is available from Johns Hopkins University Press ($47.50). 

   J.D. Terwilliger & J. Ott, "Handbook of Human Genetic Linkage,"
   Johns Hopkins University Press, 1994, $60. It grew out of the handouts
   for the linkage courses and provides detailed instructions on how to use
   the LINKAGE programs (and some others) on a PC. 

   Guide to Human Genome Computing, edited by Martin J. Bishop, and
   published by Academic Press (1994). It is very internet-oriented. The
   first chapter talks about ftp sites, etc. and Chapter 3 is dedicated to
   linkage analysis ($40). 

   E.A. Thompson: "Pedigree Analysis in Human Genetics", Johns
   Hopkins University Press, Baltimore and London, 1986 ($35). 

   K.E. Davies (editor): "Human Genetic Diseases - A Practical
   Approach". IRL Press, Oxford England and Washington, D.C., 1986
   ($25, softbound; $40, hardbound). 

   Muin J Khoury, Terri H Beaty, Bernice H Cohen. Fundamentals of
   Genetic Epidemiology. Oxford University Press 1993, Monographs in
   epidemiology and biostatistics, Volume 19. "A good introductory book
   with 339 pages (att:several mistakes)" 

   Please send me other suggestions. 

   What genetic-linkage databases are available on the internet?

   medline is a database for searching for articles in journals. If your site
   is a member of NorthWestNet, you can get to medline using telnet. Just
   telnet to and go into the library databases. It can
   even email you the output if you wish! Many libraries and many
   internet service providers have medline services online. Some
   interfaces are better than others (we don't even bother using the one at
   OHSU--it's too painful...) Your local library can probably supply you
   with information. 

   [cgochiku;2Aug94] posted this:

   For those of you out there with Macs who use MEDLINE and would
   like a way to put those text files of downloaded references into a
   database, check out medline-hc.sit in the Stanford archives. It is a
   hypercard stack I wrote that allows fast importing of references,
   including the abstracts. The file is at

   Victor McKusick wrote a book: Mendelian Inheritance in Man. It is
   continuously updated online at Johns-Hopkins University (making it
   online-MIM or OMIM). Combined with the Genome-Data-Base, it
   is available via ftp at You need to get an account. Send
   email to help at for information. After you get an account, the
   telnet address is The GDB www address is,
   which has a useful but restricted version of GDB available. 

   Here's an old workshop announcement that might be useful:


                 Organised by The Biocomputing Centre at DKFZ 
                       Heidelberg 13-14th October 1994

   The Integrated Genomic Database (IGD) is an international project to
   develop an information management system for human genome researchers
   which interconnects existing molecular biology databases and analysis
   tools. 

   IGD is designed as a network system based on a client/server architecture.
   With regard to the origin and scope of data, the system can be subdivided
   into three levels: 1) resource databases which contribute data 2) target
   database which manage the integrated data 3) front-end clients which
   manage data to the user.

   Users need to install the IGD front-end on a local workstation for
   with the IGD system. The most important parts of the front-end are the
   database manager and the interfaces to communication and analysis. Users
   query the IGD servers and download the resulting data into their local
   database, where it can be manipulated and analysed. Private data and
   analysis results may also be deposited into the local database.

   Registration of the workshop:  12th October, 18.00-20.00

   For further information and details of accommodation please contact:

           Mrs. Anke Retzmann
           Dept. of Molecular Biophysics
           Im Neuenheimer Feld 280
           69120 Heidelberg
           Tel.: +49-6221-422372
           Fax.: +49-6221-422333
           E-mail: a.retzmann at

   What is WWW? [rootd;16may94] 

   WWW stands for world-wide-web. People set up www servers
   (similar to anonymous ftp servers) that you can browse through. The
   webspinners (people who set up web sites) include "links" to other
   related sites. All you have to do is click a mouse-button on the link,
   and you will immediately go to the other site. The CEPH www site,
   for example, has a link to the genethon www site. This makes it very
   easy for you to get related information. My favorite www site has the
   before-repair and after-repair Hubble telescope pictures side-by-side.

   What is Mosaic? [rootd;16may94] 

   Written by NCSA (the National Center for Supercomputing
   Applications) this program lets you look through www sites. It can
   spawn viewers to look at graphical data, output sound data on your
   computer's speaker (if your computer has a speaker), save your
   "favorite" www sites between sessions, and access automated
   www-search-engines (which search the www for you--similar to

   What is Lynx? [rootd;19nov94] 

   Lynx is another world-wide-web browser (like Mosaic). Lynx,
