Large phrap assemblies

Rachel Meredith Kadelk-Garcia rgarcia at fas.harvard.edu
Thu Feb 17 15:39:13 EST 2000


In article <88ggb8$pgt$1 at mercury.hgmp.mrc.ac.uk>, Paul Shinn wrote:
>    I can't seem to find the proper forum for this question so I'll ask here.
>If you have someone in your lab who can answer the question, please pass 
>it on.
>    We have 100-120kb sequencing projects that can exceed 5000 
>sequences.  When this happens, phrap runs out of memory and we end up 
>finding alternative, more tedious ways to assemble the data.  Our fastest 
>machine, a Sparc Ultra10 (450MHz, 384MB RAM), doesn't even cut the 
>mustard.  There is a departmental machine with 1GB of RAM that 
>sometimes can't do it either.  

This strikes me as odd.  Have you tried running "unlimit stacksize" and
"unlimit datasize" (csh/tcsh) before starting phrap?  What does top
report while phrap is running?  
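
For anyone launching phrap from a script rather than an interactive csh
session, the same idea can be sketched in Python: raise the soft data and
stack limits to their hard ceilings, then start the assembler as a child
process so it inherits them.  This is only a sketch.  The helper names are
made up for illustration, the resource module is Unix-only, and the phrap
command line in the usage note is an assumption about a typical setup, not
something taken from the original question.

```python
import resource
import subprocess

# csh limit names mapped to the corresponding POSIX resource constants.
LIMITS = {
    "datasize": resource.RLIMIT_DATA,
    "stacksize": resource.RLIMIT_STACK,
}


def raise_soft_limits():
    """Raise each soft limit to its hard limit, roughly what csh 'unlimit' does.

    Returns a dict of {name: (soft, hard)} as seen after the attempt.
    """
    report = {}
    for name, res in LIMITS.items():
        _soft, hard = resource.getrlimit(res)
        try:
            resource.setrlimit(res, (hard, hard))
        except (ValueError, OSError):
            # Some systems refuse certain values; keep the existing limit.
            pass
        report[name] = resource.getrlimit(res)
    return report


def run_with_raised_limits(cmd):
    """Raise the limits in this process, then run cmd, which inherits them."""
    for name, (soft, hard) in raise_soft_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
    return subprocess.run(cmd).returncode
```

Something like run_with_raised_limits(["phrap", "reads.fasta"]) would then
start the assembly; reads.fasta is a placeholder file name.  Watching the
process in top while it runs shows whether memory really is growing toward
the limit.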

We have had occasional problems with mid-size datasets like yours when
low-complexity regions caused too many matches to be nucleated; I *think*
we solved that by increasing phrap's -minmatch parameter, but I'm not
certain.
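
To see why a longer minimum word length helps, here is a toy illustration
in Python.  It is not phrap's actual algorithm, just a count of how many
exact shared words of a given length two reads have, using two made-up
reads that share a CA repeat but have unrelated flanks.  Each shared word
stands in for a potential seed match.

```python
from collections import defaultdict


def count_seed_hits(read_a, read_b, word_len):
    """Count position pairs (i, j) where both reads share the same exact word."""
    index = defaultdict(list)
    for i in range(len(read_a) - word_len + 1):
        index[read_a[i:i + word_len]].append(i)
    hits = 0
    for j in range(len(read_b) - word_len + 1):
        # .get avoids inserting empty entries into the index.
        hits += len(index.get(read_b[j:j + word_len], ()))
    return hits


# Hypothetical reads: different flanking sequence, same low-complexity repeat.
repeat = "CA" * 30
read_a = "GATTACAGGC" + repeat + "TTGACCGTAA"
read_b = "CCGTTAGGAT" + repeat + "AGGTCATTGC"

for word_len in (10, 14, 20, 30):
    print(word_len, count_seed_hits(read_a, read_b, word_len))
```

The count can only go down as the word length goes up, since any match of
length k+1 is also a match of length k at the same positions.  With real
data the repeat would be shared by many reads at once, so the number of
candidate pairs grows much faster than in this two-read example.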

(When you say "exceed 5000" -- your datasets never get bigger than 64000
reads, right?  If they did, you'd need to recompile phrap with MANYREADS
defined.)
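
A quick way to check which side of that limit a dataset is on is to count
the FASTA header lines in the input file.  A minimal Python sketch,
assuming the reads are in a single FASTA file; the 64000 threshold is the
figure quoted above, and the function names are illustrative.

```python
# Threshold quoted in this post; assumed here, not checked against phrap's source.
READ_LIMIT = 64000


def count_fasta_reads(path):
    """Count records in a FASTA file by counting lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def needs_manyreads(path, limit=READ_LIMIT):
    """True if the file holds more reads than the given limit."""
    return count_fasta_reads(path) > limit
```

Calling needs_manyreads("reads.fasta") on a 5000-read project should come
back False; reads.fasta is again a placeholder name.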

						Rachel
---

More information about the Autoseq mailing list