Wondering about software to treat the Double-Digest Problem

Dan Jacobson x-8453 danj at WELCHGATE.WELCH.JHU.EDU
Thu May 14 18:15:55 EST 1992


Mark Reboul writes:

>Greetings! Can somebody out there in Bionetland tell me if publicly 
>accessible software exists for handling Double- or Multiple-Digest 
>Problems?
>
>I am aware that these are difficult computational problems, pretty 
>much intractable, for a number of reasons. However, I have a user 
>facing fairly "small" instances of DDP/MDP, and he could be aided by 
>some automated form of attack, accepting that suggested solutions 
>might come with no guarantee of correctness or optimality.
>
>We would also be interested in hearing about public-domain software 
>which could assist the planning of successive digests for purposes 
>of restriction mapping, that is, a tool which could compute and 
>somehow highlight remaining ambiguities in restriction site order.
>
>Note: I am not a subscriber to this group so please reply to me 
>directly. Thanks in advance for any help!

Hmm, out of principle I tend not to respond to people who say "I don't
read this group (and therefore never CONTRIBUTE to this group) but
please do some work for me and send me information on ....."  I find
this to be a bit rude - if you're asking for information the LEAST
you can do is to follow the group for a few days to get your response!
IMHO we all learn from the questions and ANSWERS of others.  (Steps off
soapbox)

However this is information which may be of general utility so I am posting
a reply and sending Reboul his own personal copy :-)
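
For fairly "small" instances of the Double-Digest Problem, plain exhaustive
search may already be practical.  The Python sketch below is not taken from
any of the programs documented in this post.  It assumes a linear molecule
and exactly measured fragment lengths (no experimental error), and the
function names are invented for illustration.  It tries every ordering of
the two single digests and keeps those whose combined cut sites reproduce
the double digest.

```python
from itertools import accumulate, permutations


def cut_sites(fragments):
    """Interior cut positions implied by an ordered run of fragment lengths."""
    return set(accumulate(fragments[:-1]))


def double_digest(frags_a, frags_b, frags_ab):
    """Return every (order_a, order_b) pair consistent with the double digest.

    Assumes a linear molecule and exact lengths; mirror-image maps are
    reported as separate solutions.
    """
    total = sum(frags_a)
    if sum(frags_b) != total or sum(frags_ab) != total:
        raise ValueError("all three digests must sum to the same length")
    target = sorted(frags_ab)
    solutions = []
    # set() removes duplicate orderings when fragment lengths repeat
    for order_a in set(permutations(frags_a)):
        sites_a = cut_sites(order_a)
        for order_b in set(permutations(frags_b)):
            bounds = [0] + sorted(sites_a | cut_sites(order_b)) + [total]
            pieces = sorted(hi - lo for lo, hi in zip(bounds, bounds[1:]))
            if pieces == target:
                solutions.append((order_a, order_b))
    return solutions


if __name__ == "__main__":
    for order_a, order_b in double_digest([3, 5], [2, 6], [1, 2, 5]):
        print("A:", order_a, " B:", order_b)
```

The running time grows factorially with the number of fragments, so this
is only a starting point for very small maps; several orderings may fit
the same data, which is exactly the ambiguity mentioned in the question.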

I have included the documentation for three programs which you may
find useful for at least some of your requested tasks.  The comap doc
has been truncated severely to make it a reasonable size for posting.

Kudos go to the authors: Roy Smith (Renzyme), Stefan Rensing (Marker)
and Kay Hoffman (Comap).  Kudos also go to the ftp site maintainers:
Don Gilbert, Dan Davison and Rob Harper.  While I'm throwing out Kudos - 
major kudos go to David Kristofferson for keeping these newsgroups as
living, breathing entities!

Best of luck,

Dan Jacobson

danj at welchgate.welch.jhu.edu

==============================================================================

RENZYME


	Renzyme is a restriction mapping program based on a fast finite
state machine pattern matcher.  In general, the rate at which it finds
restriction sites is only limited by how fast it can read the sequence file
in and print out the results.  See README.RENZYME for more details.
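
To give a feel for what a finite state machine pattern matcher does, here
is a small Python sketch using the textbook Aho-Corasick construction.  It
is not Renzyme's code and may well differ from the algorithm described in
the CABIOS paper.  It assumes unambiguous recognition sequences (A, C, G
and T only), scans one strand only, and uses function names invented for
illustration.  The point is that the sequence is read once, one base at a
time, no matter how many enzymes are searched for.

```python
from collections import deque


def build_automaton(sites):
    """Build Aho-Corasick tables for a dict {enzyme_name: recognition_seq}."""
    goto = [{}]   # goto[state][base] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # (name, length) pairs recognised on entering a state
    for name, seq in sites.items():
        state = 0
        for base in seq:
            if base not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][base] = len(goto) - 1
            state = goto[state][base]
        out[state].append((name, len(seq)))
    # Breadth-first pass: fill in failure links and inherited outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for base, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and base not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(base, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_sites(sequence, sites):
    """Return (1-based start, enzyme name) for every site found, in scan order."""
    goto, fail, out = build_automaton(sites)
    hits = []
    state = 0
    for pos, base in enumerate(sequence.upper()):
        while state and base not in goto[state]:
            state = fail[state]
        state = goto[state].get(base, 0)
        for name, length in out[state]:
            hits.append((pos - length + 2, name))
    return hits


if __name__ == "__main__":
    enzymes = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}
    print(find_sites("AAGAATTCGGATCC", enzymes))
```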

	RenzData and RenzRefs are Rich Roberts' official list of restriction
enzymes.  These files get updated periodically, whenever RR mails out new
versions of his list.  The file rob2renz is a Unix awk/sed script which
will take the RenzData file and turn it into the format renzyme expects.
Just do "sh rob2renz < RenzData > renztab".

	To get renzyme running, you need to get renzyme.tar.Z and
seqlib.tar.Z, both of which are tar files compressed using the popular
Lempel/Ziv-based compress program, and need to be transferred in binary
mode.  You can find seqlib.tar.Z in the same directory which contains
renzyme.tar.Z.
 
	Make a clean directory, untar them both there, and you should be all
set.  Seqlib.tar.Z contains some top-level configuration and instruction
files you will need to get started.  I hope this is all self-explanatory.
Please drop me a note when you get this just so I know who has it and so I
can maintain a mailing list of interested people.  Happy renzyming!

	Keep in mind the following license agreement.  By accepting these
files, you agree to obey the following straightforward rules:

	You may make no commercial use of this software without first
receiving explicit written permission from the Public Health Research
Institute.  If you are a not-for-profit, academic organization, using this
program for research purposes, you may redistribute this program to other
similar organizations, but not otherwise, and only if you redistribute all
the original files, including this note and the Copyright file (you may
include additions you have made, but make sure people get the originals as
well) and do not charge for the redistribution.  Note that all the source
code is copyrighted.  If you have picked this up via ftp or gotten it from
somebody other than me, you should let me know that you have it via either
electronic or paper mail, to the address below.

	You should cite my paper, ("A finite state machine algorithm for
finding restriction sites and other pattern matching applications", CABIOS,
Vol 4, no. 4, 1988) in any published report on research conducted using this
program.

Host fly.bio.indiana.edu

    Location: /molbio
      DIRECTORY drwxr-xr-x        512  May 14 1991  renzyme
    Location: /molbio/renzyme
           FILE -rw-r--r--       3754  May 14 1991  renzyme.readme
           FILE -rw-r--r--     205312  May 14 1991  renzyme.tar_z

Host goober.phri.nyu.edu

    Location: /pub/seq
           FILE -rw-r--r--       1891  Aug 10 1989  README.RENZYME
           FILE -rw-r--r--     204835  Aug  1 1989  renzyme.tar.Z


=============================================================================
=============================================================================


DOS only


MARKER V1.0 - DOCUMENTATION


Copyright 1991 by Stefan Rensing.


General description :



MARKER is a program for creating DNA length standards and restriction
fragment patterns. You may either use it to create new length standards
(from DNA you already have) or to check the cleavage sites / digestion
pattern of any sequence.

There are some other programs which do some or all of the things that MARKER
can do, but they are much less comfortable to use. You'll see.

MARKER V1.0 is FREEWARE. You are not allowed to remove the copyright
information. I'll try to make updates every 3 to 6 months.







Using the program :

(Short description)



  - Copy the required files into the same directory
    (required files are: MARKER.EXE and ENZYME.DAT)
  - Start program from that directory by typing marker <RETURN>
  - Change to the directory where your file is saved
    (using the first five menu options)
  - ENTER filename
  - LOAD file (be sure that the filespec is correct first)
  - CHOOSE restriction enzyme(s)
  - DIGEST
  - MAKE (and print) restriction map
  - SHOW restriction fragments pattern





! Before printing, be sure that your printer is ready.

! If you would like to use the option "Show restriction fragments pattern"
! you must have a graphics card installed with a resolution of at least
! 640x348 pixels and the correct TURBO 6.0 graphics driver (.BGI) copied
! into the start directory.

! If you would like to print the pattern, make a hardcopy (for example with
! the DOS command GRAPHICS.COM, see handbook).



Restrictions :



- No line of a file may be longer than 80 characters.
- The whole file must not consist of more than 510 lines.
- The sequence must not be longer than 40000 bases.
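
Before loading a file into MARKER it may save time to check it against
these three limits.  The following Python sketch is only an illustration:
the function name is invented, and the sequence is passed in separately
because how it is extracted depends on the filespec.

```python
MAX_LINE_LENGTH = 80   # characters per line
MAX_LINES = 510        # lines per file
MAX_BASES = 40000      # bases per sequence


def check_marker_limits(lines, sequence):
    """Return a list of problems; an empty list means all limits are met."""
    problems = []
    for number, line in enumerate(lines, start=1):
        # Line terminators are not counted towards the 80 characters here.
        if len(line.rstrip("\r\n")) > MAX_LINE_LENGTH:
            problems.append(
                f"line {number} is longer than {MAX_LINE_LENGTH} characters")
    if len(lines) > MAX_LINES:
        problems.append(f"file has {len(lines)} lines (limit {MAX_LINES})")
    if len(sequence) > MAX_BASES:
        problems.append(
            f"sequence has {len(sequence)} bases (limit {MAX_BASES})")
    return problems


if __name__ == "__main__":
    demo_lines = ["ACGTACGT\n", "GGATCC\n"]
    demo_sequence = "".join(line.strip() for line in demo_lines)
    print(check_marker_limits(demo_lines, demo_sequence) or "file is within limits")
```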





File specifications :



MARKER V1.0 supports 5 different file specifications (filespec) :


ASCII  : A simple "text only" file which contains only the bases of the
         sequence. Blanks and returns are allowed but nothing else.

MARKER : Same as ASCII, but the file may contain ID information
         (see below). The sequence will be read between SQ (or SEQ) and EOF.

EMBL   : The format which is used in the EMBL.NUC Database. The program
         will take the sequence between SQ (or SEQ) and // (or END).

PCGENE : The format which is used by PCGENE (TM of IntelliGenetics Corp.).
         The program will take the sequence between SQ and //.

RESMAP : A special form of data; the file contains a restriction map
         instead of a sequence.

Format :                           e.g.
         ID identification;        ID T7;
         LE length of sequence     LE 39936
         EN enzymename;            EN BstNI;
         restriction site(s)       2366
                                   8188

Be sure to put semicolons in the right places.
You may use up to 5 restriction enzymes for the map.
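
A RESMAP file of this kind is easy to read by machine.  The Python sketch
below is based only on the short example above, so several things are
assumptions: that each enzyme starts with its own EN line followed by one
site per line, that the molecule is linear, and that a site number is the
position at which the molecule is cut.  The function names are invented
for illustration and have nothing to do with MARKER's own code.

```python
def parse_resmap(text):
    """Parse RESMAP-style text into (identification, length, {enzyme: sites})."""
    ident, length, sites, current = None, None, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("ID "):
            ident = line[3:].rstrip(";").strip()
        elif line.startswith("LE "):
            length = int(line[3:].rstrip(";"))
        elif line.startswith("EN "):
            current = line[3:].rstrip(";").strip()
            sites[current] = []
        elif current is not None:
            sites[current].append(int(line.rstrip(";")))
        else:
            raise ValueError("site listed before any EN line: " + line)
    return ident, length, sites


def fragment_lengths(length, positions):
    """Fragment lengths of a linear molecule cut at the given positions."""
    bounds = [0] + sorted(positions) + [length]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "ID T7;\nLE 39936\nEN BstNI;\n2366\n8188\n"
    ident, length, sites = parse_resmap(example)
    for enzyme, positions in sites.items():
        print(ident, enzyme, fragment_lengths(length, positions))
```

With the two example sites this gives fragments of 2366, 5822 and 31748
bases, which together add up to the stated length of 39936.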



Generally, the program will take the word after ID as the identification
(remark) of the sequence, except in the case of ASCII.

Each filespec has its own search pattern for listing directories, but this
will only work if you name the files correctly. However, you can load a
file even if it has a "wrong" name (except in the case of RESMAP).




The ENZYME.DAT file :


This file contains the information about the restriction enzymes.
The file ENZYME.DAT must exist in the directory from which you start the
program. If you would like to change the file, feel free (use any editor).

To describe multiple recognition sites, the IUPAC ambiguity code has been
used, except for Pu (= Purine, A or G) and Py (= Pyrimidine, T or C).



FORMAT :

Not more than 60 enzymes, in alphabetical order; name and site not more than
10 characters each.

The cleavage site must be marked with a slash (/), backslash (\) or vertical
line (|) in the restriction site. Meaning :  / = 3'-extension,
\ = 5'-extension, | = blunt ends.

The 5 lines following the name and site contain the different recognition
possibilities (if there is more than one; if not, simply write "none"), e.g. :

HindII
GTPy|PuAC
GTTGAC
GTTAAC
GTCGAC
GTCAAC
none

(you can only use A, C, G and T in the recognition lines).
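
When adding an enzyme to ENZYME.DAT by hand, the recognition lines can be
generated from the site instead of being typed out.  The Python sketch
below is an illustration only: the function names are invented, the
single-letter codes are assumed to follow the standard IUPAC table, and it
is not known whether MARKER itself accepts every one of them.

```python
from itertools import product

# Pu and Py as described above; the other codes are standard IUPAC.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Pu": "GA", "Py": "TC",
    "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
CUT_MARKS = {"/": "3'-extension", "\\": "5'-extension", "|": "blunt ends"}


def parse_site(site):
    """Split a site such as 'GTPy|PuAC' into symbols, cut offset and end type."""
    symbols, cut, end_type = [], None, None
    i = 0
    while i < len(site):
        if site[i] in CUT_MARKS:
            cut, end_type = len(symbols), CUT_MARKS[site[i]]
            i += 1
        elif site[i:i + 2] in ("Pu", "Py"):
            symbols.append(site[i:i + 2])
            i += 2
        elif site[i] in CODES:
            symbols.append(site[i])
            i += 1
        else:
            raise ValueError(f"unexpected character {site[i]!r} in {site!r}")
    return symbols, cut, end_type


def expand_site(site):
    """List every concrete A/C/G/T sequence matched by an ambiguous site."""
    symbols, _, _ = parse_site(site)
    return ["".join(bases) for bases in product(*(CODES[s] for s in symbols))]


if __name__ == "__main__":
    print(parse_site("GTPy|PuAC"))
    for sequence in expand_site("GTPy|PuAC"):
        print(sequence)
```

For the HindII site this produces the same four recognition lines as in
the example entry above.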



Using RESMAP-files :


After loading a restriction map choose "MAKE restriction map" and then "SHOW
restriction fragments pattern".

In the case of filespec RESMAP the options "CHOOSE restriction enzymes",
"DIGEST" and "Show current sequence" are not available/necessary.



Examples :

There are two files in the package which may serve as examples :

T7.MAR    : The whole sequence of bacteriophage T7 (39936 bp) as a MARKER file.
T7HPA.RES : A RESMAP file of T7 DNA digested with Hpa I.





Host ftp.bio.indiana.edu

Location: /molbio/ibmpc
        FILE       90476 marker.uue

===============================================================================
===============================================================================


Comap



Computer aided mapping of small DNA-fragments



V 1.0    October 1991



A. What is COMAP



COMAP is a public-domain program for helping with the construction of
restriction maps of small DNA fragments from digestion data. Comap is
graphically orientated, the user normally doesn't work with numerical
restriction fragment sizes but co


