Scripted Directed Assembly Problem-
major at genome.wi.mit.edu
Fri Mar 29 17:33:12 EST 2002
Hello Staden Group-
I'm having a problem getting a VERY large directed assembly to build.
We currently use Staden.2000.0.
I have 73,574 reads which comprise 27 contigs in an assembly. When I
run the directed assembly graphically from gap4, the gapDB is built, but
painfully slowly (I've never let it run to completion on this large data
set). When I use a modified assemble4
script (http://www-genome.wi.mit.edu/personal/major/assemble4), I get:
Processing number 7995: G59P61559FC1.T0
Fri 29 Mar 14:27:21 2002 SYSMSG : No such file or directory 
Fri 29 Mar 14:27:21 2002 ERROR : invalid type 
Fri 29 Mar 14:27:21 2002 COMMENT: reading record 0
Fri 29 Mar 14:27:21 2002 FILE : gap-io.c:171
Gap4 has found an unrecoverable error - These are usually bugs.
Please email all bug reports to staden-package at mrc-lmb.cam.ac.uk.
/home/strontium/major/.lsbatch/1017427958.375492: 12171 Memory fault -
*Note* when run on assemblies with < 8000 reads, this builds a valid
gapDB with no problems.
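For anyone trying to reproduce this, here is a minimal Python sketch for pulling the last read processed out of a captured log, so the failure point can be compared across runs. It is not part of gap4 or the assemble4 script: the function name last_processed is hypothetical, and it assumes every progress line has the form "Processing number N: NAME" as in the output above.

```python
import re

# Assumed progress-line format, taken from the log excerpt above:
#   "Processing number 7995: G59P61559FC1.T0"
PROGRESS = re.compile(r"^Processing number (\d+): (\S+)")

def last_processed(log_lines):
    """Return (number, read_name) for the last progress line, or None."""
    last = None
    for line in log_lines:
        m = PROGRESS.match(line.strip())
        if m:
            last = (int(m.group(1)), m.group(2))
    return last

# Sample input copied from the failing run.
sample = [
    "Processing number 7995: G59P61559FC1.T0",
    "Fri 29 Mar 14:27:21 2002 SYSMSG : No such file or directory",
    "Fri 29 Mar 14:27:21 2002 ERROR : invalid type",
]
print(last_processed(sample))
```

Running this over the logs from several attempts would show whether the crash always lands on the same read or only near the same count.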
When running the directed assembly via the gap4 GUI, I start gap4 with
-maxseq 2100000 -maxdb 100000, then create a new DB and start the
directed assembly. I've let it work up to read number 20,000 before
quitting the program (very, very slow). Via the script, I can never get
it past read 8,000. Isn't 8,000 what maxdb defaults to? How do I
set this to a larger number when using scripts to build gapDBs?
I've tried opening up gap4 with the appropriate -maxdb/-maxseq values, saving
an empty gapDB, then having the script use that DB as a starting
database, but I still get the failure at read 8,000.
I've also tried to modify line 12 of the assemble4 script to use the
-maxdb and -maxseq flags, but this just causes the script to open to a
This should be a minor fix, but I've spent a few days failing to get
this working due to what seems to be a simple preference problem...