info fragment assembly

Jared Roach roach at u.washington.edu
Wed Nov 23 02:15:57 EST 1994


	I recently attended the DIMACS Workshop on Combinatorial
Methods for DNA Mapping and Sequencing.  Many of the protagonists
in the field of DNA fragment assembly were there.  In fact, this
year's "open problem" is a challenge to find the best fragment assembly
program.  Information on this challenge can be found at

ftp   dimacs.rutgers.edu    /pub/challenge4

I also recommend contacting
Gene Myers   gene at cs.arizona.edu
R. Idury  (Dept of Math and/or Mol Biol; U Southern Cal; email?)
Rebecca Parsons	rebecca at cs.ucf.edu

	These people are all working on assembly programs and
can point you in the right direction.  Many others are working
in this area, too.

	A few comments on fragment assembly.  Greedy algorithms
seem to work pretty well on real biological data.  The best method
for repeat resolution is probably the way it is done in practice:
Take advantage of the fact that repeats are not 100% identical.  This
implies that incorrectly joined fragments can be identified by the
presence of inconsistencies across at least three sequences at two or
more different sites.  If a program is to be useful, it must be
user-friendly. In general, this means allowing a user to override
decisions and to edit intermediate data structures. 
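
	To make the greedy idea concrete, here is a minimal sketch in
Python.  It is not any particular program's method: it assumes error-free
reads, exact suffix/prefix overlaps, and a single strand, and the names
overlap, greedy_assemble, and min_len are made up for illustration.  Real
data would need approximate matching and reverse complements.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that exactly equals a prefix of b.

    Returns 0 if no overlap of at least min_len characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap.

    Assumption: exact matches only, one strand, no sequencing errors.
    Returns the list of contigs left when no overlap >= min_len remains.
    """
    # Drop exact duplicates, then fragments fully contained in another one.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join; remaining fragments are contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


contigs = greedy_assemble(["ATGCGTAC", "GTACGTTA", "GTTAGC"])
print(contigs)
```

Because the largest overlap is always taken first, near-identical repeats
can be joined wrongly, which is why the consistency check described above
and a way for the user to override a join both matter.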

	Also, there is currently a rise in the number of sequencing
projects that use pairwise end sequencing (the double-barrel shotgun),
of which I am a strong proponent.  I expect that competitive assembly
programs in the near future will be those that can take advantage of
the end-pairing information, or at least allow the user to input it.  Hershel
Safer (hersh at cric.com) of Genome Therapeutics has such a program already.
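
	As a rough illustration of how pairing information might be used
(this is my own sketch, not a description of the program mentioned above),
the Python below checks whether two end reads from the same clone landed
at a plausible distance and in opposite orientations on a contig.  The
Placement record, the function names, and the insert-size bounds are all
hypothetical.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    """Where an assembler placed one read (hypothetical record layout)."""
    contig: str
    start: int      # leftmost coordinate of the read on the contig
    length: int
    forward: bool   # True if the read lies on the contig's forward strand


def mate_pair_ok(a, b, min_insert, max_insert):
    """True if two mates on the same contig face each other at a
    distance compatible with the expected clone insert size."""
    if a.forward == b.forward:
        return False  # mates must lie on opposite strands
    fwd, rev = (a, b) if a.forward else (b, a)
    span = rev.start + rev.length - fwd.start
    return min_insert <= span <= max_insert


def inconsistent_pairs(placements, pairs, min_insert, max_insert):
    """List mate pairs whose placement contradicts the pairing data.

    Pairs with an unplaced read, or with mates on different contigs,
    are skipped: they say nothing about a join within one contig.
    """
    bad = []
    for x, y in pairs:
        if x not in placements or y not in placements:
            continue
        a, b = placements[x], placements[y]
        if a.contig != b.contig:
            continue
        if not mate_pair_ok(a, b, min_insert, max_insert):
            bad.append((x, y))
    return bad


placements = {
    "r1f": Placement("c1", 0, 100, True),
    "r1r": Placement("c1", 1900, 100, False),   # span 2000: fine
    "r2f": Placement("c1", 500, 100, True),
    "r2r": Placement("c1", 5000, 100, False),   # span 4600: too far
    "r3f": Placement("c1", 200, 100, True),
    "r3r": Placement("c1", 2100, 100, True),    # same strand
}
pairs = [("r1f", "r1r"), ("r2f", "r2r"), ("r3f", "r3r")]
print(inconsistent_pairs(placements, pairs, 1500, 2500))
```

Pairs flagged this way are candidates for a wrong join, for example one
made through a repeat, and are the sort of thing a user should be able
to inspect and override.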

Jared Roach
Molecular Biotechnology
University of Washington
roach at u.washington.edu


