some utilities

cgr at poplar1.cfr.washington.edu
Thu Jul 22 19:09:42 EST 1993


i have written a couple of programs that help me move data from mapmaker
to a relational database, from whence i will be able to move it to acedb.
these may be re-inventing the wheel, but when i asked around a few weeks
ago no-one knew of programs like these.  (please let me know if you have
a better mousetrap!)

they are now available for anonymous ftp from poplar1.cfr.washington.edu
in the directory /pub/acedb in the file mm2db.tar.gz.  this file is in
gnu's gzip format.  this file unzips/untars into a directory called 
mm2db.  look at the file README.mm2db for information and instructions.
sources and solaris 2.1 binaries are included.

i hope these are helpful.

herewith is the README.mm2db file:

                 MAPMAKER - TO - DATABASE CONVERSION PROGRAMS

The associated programs, scanmap and scanseq, help translate data files
generated from MapMaker 3 into a format that can be understood by a relational
database management system.

The data output by these programs can be added to an existing database (which
is what is done here), or used to create a new database of genetic map
information.  One could then, for example, combine the MapMaker data with an
existing database and generate a report of the combined data for inclusion in
an acedb database.

These programs were developed on a Sun SPARCstation LX running under Solaris
2.1 (Unix System V Release 4), using GNU's gcc compiler.  As such, the
executables initially distributed run only on SPARCstations under Solaris 2.1.
The source code is included so that the programs can be compiled for other
machines.  The output file names will have to be changed to work under MS-DOS.

There are a couple of things that I may change in the future:
	- include a date-generated field in all output records
	- put in more error-recovery code
Please let me know if you have other things to add, or make local
changes--others may want the same things!

cgr	July 1993

carl g. riches
college of forest resources	    internet: cgr at poplar1.cfr.washington.edu
university of washington ar-10	    voice:    206-543-2764
seattle, wa 98195		    fax:      206-543-3254


SCANMAP

usage:	scanmap <mapmaker map file>

scanmap is a program that reads through a MapMaker .map file looking for
information about mapped loci and chromosomes.  It writes out three files: a
file of locus data, a file of chromosome information, and a file of map data
(one record for each chromosome-locus pair).

The format of the locus data file is (one record per locus): 
	- the locus name
	- MapMaker's internal id number for the locus
	- MapMaker's code about whether or not the locus is mapped
	- the chromosome on which it is mapped

The format of the chromosome data file is (one record per chromosome): 
	- the chromosome name
	- MapMaker's internal id number for the chromosome
	- the number of loci mapped on the chromosome

The format of the map data file is (one record per chromosome-locus pair): 
	- the chromosome name
	- MapMaker's internal id number for the chromosome
	- the locus name
	- MapMaker's internal id number for the locus
	- the raw distance to the next locus as calculated by MapMaker
	- the distance (in cM) to the next locus using the Haldane method
	- the distance (in cM) to the previous locus using the Haldane method
	- the distance to the next locus (in cM) using the Kosambi method
	- the distance to the previous locus (cM) using the Kosambi method
	- the distance from the first mapped locus on this chromosome (cM) 
	  using the Haldane method
	- and the distance from the first mapped locus on this chromosome (cM) 
	  using the Kosambi method.

Remember to link in your math library (for example, -lm with gcc) when
compiling scanmap.
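
The Haldane and Kosambi distances in the map data file are standard mapping
functions of the recombination fraction.  The following is a minimal Python
sketch of those two functions and of the "distance from the first mapped
locus" columns.  It is not the scanmap source.  It assumes that the raw
value is a recombination fraction r between 0 and 0.5, which this README
does not state, and the function names and sample fractions are made up for
illustration.

```python
import math


def haldane_cm(r):
    """Haldane map distance in cM for recombination fraction r (0 <= r < 0.5)."""
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in cM for recombination fraction r (0 <= r < 0.5)."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def cumulative_positions(distances_to_next):
    """Position of each locus measured from the first locus on the chromosome.

    distances_to_next holds the distance from each locus to the next one.
    The last locus has no next locus, so the input is one item shorter
    than the returned list of positions.
    """
    positions = [0.0]
    for d in distances_to_next:
        positions.append(positions[-1] + d)
    return positions


# Hypothetical recombination fractions between adjacent loci on one chromosome.
fractions = [0.10, 0.20, 0.05]
haldane = [haldane_cm(r) for r in fractions]
kosambi = [kosambi_cm(r) for r in fractions]

print(cumulative_positions(haldane))
print(cumulative_positions(kosambi))
```

Both functions need a natural logarithm, which is the kind of call that
requires the math library in a C program.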

SCANSEQ

usage:	scanseq <mapmaker data file>

scanseq reads through MapMaker's .data file for the sequences at each locus. 
It then writes out a file of this data, one record per line: the locus name, 
and the sequence.

GENEMAP.CMD

usage:	(depends on your dbms)

genemap.cmd is a set of SQL commands that can be used as a model for creating a
relational database to hold the data output by scanseq and scanmap.  This code
was generated by R:Base 3.1c running under MS-DOS.  It may need to be changed
for other SQL database management systems.  (Of course, major modifications
will be necessary for non-SQL dbms's :)
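
For readers without R:Base, the idea can be sketched with Python's built-in
sqlite3 module.  This is not genemap.cmd.  The table names, column names,
tab delimiter, and sample records below are all assumptions made for
illustration; only the list of fields in each record comes from the
descriptions above.  The map data table is left out to keep the sketch short.

```python
import sqlite3

# Hypothetical schema mirroring the record layouts described in this README.
SCHEMA = """
CREATE TABLE locus (
    locus_name  TEXT PRIMARY KEY,
    locus_id    INTEGER,
    mapped_code TEXT,
    chrom_name  TEXT
);
CREATE TABLE chromosome (
    chrom_name TEXT PRIMARY KEY,
    chrom_id   INTEGER,
    n_loci     INTEGER
);
CREATE TABLE locus_seq (
    locus_name TEXT PRIMARY KEY,
    seq        TEXT
);
"""


def load(conn, table, lines):
    """Insert records into one table, one record per line.

    Fields are assumed to be tab-separated; the real output files may differ.
    The table name is supplied by this script, never by the data.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        marks = ",".join("?" * len(fields))
        conn.execute(f"INSERT INTO {table} VALUES ({marks})", fields)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample records standing in for scanmap and scanseq output.
load(conn, "chromosome", ["chrom1\t1\t1\n"])
load(conn, "locus", ["L001\t1\t1\tchrom1\n", "L002\t2\t0\t\n"])
load(conn, "locus_seq", ["L001\tABHABH\n", "L002\tHHABAA\n"])

# Combine the locus and sequence data for loci that are on a chromosome.
rows = conn.execute(
    "SELECT l.locus_name, l.chrom_name, s.seq "
    "FROM locus l JOIN locus_seq s ON s.locus_name = l.locus_name "
    "WHERE l.chrom_name <> '' ORDER BY l.locus_name"
).fetchall()
print(rows)
```

A join like the last query is one way to produce a combined report from the
separate output files.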




