
arnold at BSCR.UGA.EDU arnold at BSCR.UGA.EDU
Sun Aug 1 14:42:48 EST 1993


TO: Those Interested
FROM: Jonathan Arnold, ARNOLD at BSCF.UGA.EDU
SUBJECT: PCAP

                        DISTRIBUTION INFORMATION ON
                   Probe Choice & Analysis Package (PCAP)

                                 ver 1.5

                         (c) 1990 The Univ of GA &
                            A. Jamie Cuticchia

PROGRAM DESCRIPTION:

        Programs are now available to assist in choosing synthetic
oligonucleotide probes for contig mapping:

        Cuticchia, A.J., J. Arnold, and W.E. Timberlake (1993).
        PCAP: probe choice and analysis package - set of
        programs to aid in choosing synthetic oligomers for
        contig mapping. CABIOS 9: 201-203

        PCAP is a suite of programs that: (i) convert GenBank files into
        a format usable by the package; (ii) calculate trinucleotide
        and tetranucleotide frequencies in the available genomic
        sequence for a particular species; (iii) present the user with upper
        and lower bounds on the frequencies of hybridization sites for
        oligonucleotide probes of length 8-12; (iv) allow the user to place
        constraints on site frequency and G+C content of probes and provide
        a list of short probe sequences that fit these criteria. Predictions
        of a hybridization site's frequency of occurrence are based
        on fitting a third-order Markov chain model to the genomic
        sequence data presented to the program.
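
        The Markov chain prediction described above can be illustrated
        with a short Python sketch. It is not taken from the PCAP sources
        (which are FORTRAN-77). The function names, the use of raw counts,
        and the conversion of a site probability to an expected spacing
        (1/probability) are assumptions made for illustration only.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_spacing(oligo, tri, tet):
    """Expected spacing in bp between sites of oligo (hypothetical helper).

    tri and tet are Counters of observed trimer and tetramer counts.
    Under a third-order Markov chain, each base depends only on the
    three bases before it.
    """
    # Probability of the first three bases of the oligomer.
    p = tri[oligo[:3]] / sum(tri.values())
    # Multiply in P(base | previous three bases) = count(tetramer) / count(trimer).
    for i in range(3, len(oligo)):
        context = oligo[i - 3:i]
        if p == 0 or tri[context] == 0:
            return float("inf")  # site never expected to occur
        p *= tet[oligo[i - 3:i + 1]] / tri[context]
    return 1.0 / p if p > 0 else float("inf")


# Toy sequence, invented for this example.
genome = "ACGT" * 50
tri = count_kmers(genome, 3)
tet = count_kmers(genome, 4)
spacing = expected_spacing("ACGTACGT", tri, tet)
```

        Only trimer and tetramer counts are needed, whatever the probe
        length, which is consistent with TRI.DAT and TET.DAT being the
        only files passed on to the later programs.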

        Any published use of these programs should cite the reference above.



PROGRAMS:

        Programs are written in FORTRAN-77, and PCAP.FOR utilizes VMS screen
        management utilities under VT100 terminal emulation.

        PCAP.COM   -  VAX/VMS command file to compile and link PCAP programs.
        PCAP.FOR   -  manages menuing system for VT100 terminal emulation.
        FORMAT.FOR -  formats GenBank data for use by PCAP.
        MARKOV.FOR -  fits third-order Markov chain model to formatted sequence.
        HILO.FOR   -  estimates High (Low) spacings (in bp) between 8mers-12mers.
        PICK.FOR   -  picks oligomers with desired spacing (in bp).
        LEAVE.FOR  -  closes down program suite, when requested.

PROGRAM INPUT LIMITATIONS:

FORMAT.FOR (GenBank-->Markov menu selection when you 'RUN PCAP'):

        The program is dimensioned to handle no more than
        1,000,000 bp of genomic sequence.

        Filenames (with directory path, if specified) must
        be no longer than 20 characters.

        The sequence file must be in the format of
        a GenBank entry from Intelligenetics.

MARKOV.FOR (Fit Markov Chain menu selection when you 'RUN PCAP'):

        The program MARKOV.FOR is dimensioned to handle no more than
        1,500,000 bp of genomic sequence.

        Filenames (with directory path, if specified) must be
        no longer than 50 characters.

HILO.FOR   (Hi-Lo Values menu selection when you 'RUN PCAP'):

        The program accepts as input two files named TRI.DAT
        and TET.DAT (both are generated by MARKOV.FOR). 
        The format is simple in that TET.DAT,
        for example, lists tetramer frequencies from TTTT to
        GGGG in T-C-A-G-BLANK order. The user is not
        asked for any input.

PICK.FOR   (Choose Oligos menu selection when you 'RUN PCAP'):

        The program accepts a nonzero integer between 0
        and 2147483647 as the minimum spacing in bp between
        occurrences of a specific oligomer. This spacing is selected
        by the user on the basis of the Hi-Lo values
        obtained by running HILO.FOR. (Try 79500 for the example
        files).

        The program accepts a nonzero integer between 0
        and 2147483647 as the maximum spacing in bp between
        occurrences of a specific oligomer. This spacing is likewise
        selected by the user on the basis of the Hi-Lo values
        obtained by running HILO.FOR. (Try 80500 for the example
        files).

        The program accepts an integer representing the minimum
        G+C content of an oligomer in bp. (Try 7).

        The program accepts an integer representing the maximum
        G+C content of an oligomer in bp. (Try 7 again).
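
        The four constraints above amount to a simple filter. The
        following Python sketch is an illustration only, not the PICK.FOR
        code: the function names are hypothetical, the bounds are assumed
        to be inclusive, and the spacing values in the example are
        invented.

```python
def gc_count(oligo):
    """Number of G or C bases in the oligomer (G+C content in bp)."""
    return sum(base in "GC" for base in oligo)


def pick_oligos(spacings, min_spacing, max_spacing, min_gc, max_gc):
    """Keep oligomers whose spacing and G+C content fall within the bounds.

    spacings maps each oligomer to its expected spacing in bp.
    Returns (oligomer, spacing, G+C count) tuples; bounds are inclusive
    (an assumption of this sketch).
    """
    picked = []
    for oligo, spacing in spacings.items():
        gc = gc_count(oligo)
        if min_spacing <= spacing <= max_spacing and min_gc <= gc <= max_gc:
            picked.append((oligo, spacing, gc))
    return picked


# Invented spacings, used only to exercise the filter.
example = {"GCGCGCGA": 80000.0, "ATATATAT": 80000.0, "GCGCGCGT": 120000.0}
chosen = pick_oligos(example, 79500, 80500, 7, 7)
```

        Setting the minimum and maximum G+C content to the same value,
        as in the suggested inputs, restricts the list to probes with
        exactly that many G or C bases.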


        The program uses as input two files named TRI.DAT
        and TET.DAT (both are generated by MARKOV.FOR). 
        The format is simple in that TET.DAT,
        for example, lists tetramer frequencies from TTTT to
        GGGG in T-C-A-G-BLANK order.

LEAVE.FOR (Choose exit menu selection when you 'RUN PCAP'):

        none

OUTPUT:

FORMAT.FOR (GenBank-->Markov menu selection when you 'RUN PCAP'):

        A formatted genomic sequence file with 80 bp per line
        (except for the last line of the file), with a
        single space delimiting different GenBank accessions.
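
        As a rough illustration of this output layout, the Python sketch
        below pulls the sequence out of each GenBank entry (the lines
        between ORIGIN and //) and rewraps it. It is not the FORMAT.FOR
        code. The function names are hypothetical, and the upper-case
        output and the counting of the delimiting space toward the line
        width are assumptions.

```python
def genbank_sequences(text):
    """Yield the sequence of each entry in GenBank flat-file text."""
    in_origin = False
    bases = []
    for line in text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
            bases = []
        elif line.startswith("//"):
            if in_origin:
                yield "".join(bases)
            in_origin = False
        elif in_origin:
            # Drop the position numbers and the spaces between blocks.
            bases.append("".join(ch for ch in line if ch.isalpha()).upper())


def format_for_markov(text, width=80):
    """Join entries with a single space and wrap to `width` characters."""
    joined = " ".join(genbank_sequences(text))
    return [joined[i:i + width] for i in range(0, len(joined), width)]


# Two tiny made-up entries, just to show the call.
sample = (
    "LOCUS X\nORIGIN\n        1 acgtacgtac gt\n//\n"
    "LOCUS Y\nORIGIN\n        1 ttt\n//\n"
)
lines = format_for_markov(sample, width=10)
```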

MARKOV.FOR (Fit Markov Chain menu selection when you 'RUN PCAP'):

        On the screen: the length of the genomic sequence and the
        frequencies of monomers and dimers.

        A file TRI.DAT with frequencies of trimers from
        TTT to GGG listed in a T-C-A-G-Blank order.

        A file TET.DAT with frequencies of tetramers from
        TTTT to GGGG listed in a T-C-A-G-Blank order.
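
        The T-C-A-G ordering can be reproduced with a few lines of
        Python. This is a sketch of the ordering only, not of the actual
        TRI.DAT or TET.DAT file layout, which is not specified here. The
        function names are hypothetical, and windows containing a blank
        (the accession delimiter) are simply left out, which is an
        assumption.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"  # the ordering used throughout this document


def kmers_in_tcag_order(k):
    """All k-mers over T, C, A, G, from TT..T to GG..G."""
    return ["".join(p) for p in product(BASES, repeat=k)]


def frequency_table(seq, k):
    """(k-mer, count) pairs in T-C-A-G order for overlapping windows of seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows with other characters (e.g. a blank) never match a key below.
    return [(kmer, counts[kmer]) for kmer in kmers_in_tcag_order(k)]


trimers = kmers_in_tcag_order(3)      # starts at TTT, ends at GGG
table = frequency_table("TTTTC", 3)   # toy sequence for illustration
```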

        The files MONO.DAT, DI.DAT, PENT.DAT, and HEX.DAT
        are also generated, with monomer, dimer, pentamer,
        and hexamer frequencies listed in T-C-A-G-Blank order,
        but they are not used by the rest of the package.


HILO.FOR   (Hi-Lo Values menu selection when you 'RUN PCAP'):

        The Hi (Lo) spacings in bp on the screen for 8mers-12mers.

PICK.FOR   (Choose Oligos menu selection when you 'RUN PCAP'):

        The numbers of 8mers through 12mers selected by the
        program, displayed on the terminal screen.

        The program creates output files 8MER.DAT,...,12MER.DAT
        listing the oligonucleotides satisfying the selected
        spacing constraint as well as their expected spacing
        in the genome and G+C content.

LEAVE.FOR (Choose exit menu selection when you 'RUN PCAP'):

        Indicates that the user is leaving the program.
 


OBTAINING THE SOFTWARE: 

        The software is distributed only by EMAIL over the
        Internet. Please send an EMAIL request to:

                    ARNOLD at BSCF.UGA.EDU

        if you wish copies of the programs. I will EMAIL you:

        1) this documentation file, PCAP.DOC

        2) the FORTRAN programs (PCAP.FOR, FORMAT.FOR, MARKOV.FOR, HILO.FOR,
        PICK.FOR, LEAVE.FOR)

        3) test input files (ASPERGILLUS.SEQ, TRI.DAT, TET.DAT) 

        4) example output files (8MER.DAT, 9MER.DAT). The other -MER.DAT
        files were empty, like 8MER.DAT

        5) a command file, PCAP.COM, to build the package.

RUNNING THE SOFTWARE:

        This last file PCAP.COM is what you would use to build the package.
        On a VAX/VMS system you issue the following command at the
        dollar prompt:

$              @PCAP

        To run the package, type the following at the dollar prompt:

$              RUN PCAP

        The sequence of menu choices is likely to be:

                     GenBank-->Markov         
                     Fit Markov Chain
                     Hi-Lo Values
                     Oligos
                     exit

USING THE SOFTWARE WITHOUT THE PROGRAMS: 

        The programs also have been
        incorporated into a DNA sequence analysis package (Arnold et al., 1986),
        and can be accessed directly on the Biological Sequence/Structure
        Computational Facility (BS/SCF). Contact Dr. Weise for a guest account 
        at:
                    WEISE at BSCF.UGA.EDU

OBTAINING FURTHER DOCUMENTATION: 

        The best source of documentation
        is the paper by Cuticchia et al. (1993). A reprint can be
        obtained by writing:

                    Dr. Jonathan Arnold
                    Genetics Department
                    University of Georgia
                    Athens, GA 30602
        or by emailing:
                    ARNOLD at BSCF.UGA.EDU or
                    ARNOLD at BSCR.UGA.EDU

SOFTWARE SUPPORT IN THE USE OF THE PROGRAMS: 

        If you have questions about
        the programs, please contact Dr. A. Jamie Cuticchia currently located
        at Johns Hopkins University:

                    JAMIE at WELCHGATE.WELCH.JHU.EDU

HARDWARE LIMITATIONS:

     The programs have been run with minor modification on various VAXstations.



  . - - - - - - - - - - - Jonathan Arnold - - - - - - - - - - - - - - - .
  |                       Dept. of Genetics,                            |
  |                       University of Georgia                         |
  |                       Athens, Georgia 30602                         |
  | Phone: (706) 542-1449                                               |
  | messages:    (706) 542-8000                                         |
  | FAX:         (706) 542-3910                                   


