[Staden] Benchmarking reassembly/Loading fasta files
N.E.Whiteford at soton.ac.uk
Thu Jul 13 12:04:36 EST 2006
As part of my PhD project I'm working on a tool to benchmark reassembly
algorithms. To do this I'm planning on doing the following:
1. Take a sequence file and break it into reads of a specified
length, adding errors during this process.
2. Reassemble these simulated reads with the reassembly programs
available in GAP4.
3. Align contigs of a useful size to the original sequence, and note
those that align within a given edit distance.
4. Calculate the percentage of the sequence that is covered by contigs.
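Steps 1 and 4 could be sketched roughly as below. This is only a minimal
illustration under assumptions: reads are sampled uniformly at random,
errors are substitutions only, and contig placements are given as
half-open (start, end) intervals on the original sequence. The function
names simulate_reads and percent_covered are hypothetical and not part
of GAP4 or any Staden tool.

```python
import random


def simulate_reads(sequence, read_length, n_reads, error_rate, seed=0):
    """Sample fixed-length reads and apply substitution errors.

    Assumption: uniform sampling and substitution-only errors over ACGT.
    Returns a list of (start_position, read_string) tuples.
    """
    rng = random.Random(seed)
    alphabet = "ACGT"
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(sequence) - read_length + 1)
        bases = list(sequence[start:start + read_length])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Replace the base with a different one from the alphabet.
                bases[i] = rng.choice([b for b in alphabet if b != base])
        reads.append((start, "".join(bases)))
    return reads


def percent_covered(sequence_length, intervals):
    """Percentage of positions covered by at least one [start, end) interval."""
    covered = 0
    current_end = 0
    for start, end in sorted(intervals):
        start = max(start, current_end)   # skip the part already counted
        end = min(end, sequence_length)   # clip to the sequence
        if end > start:
            covered += end - start
            current_end = end
    return 100.0 * covered / sequence_length
```

For example, two contigs placed at (0, 50) and (25, 75) on a sequence of
length 100 overlap by 25 bases, so percent_covered counts 75 positions.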
I have just completed the alignment with edit distance tool and am now
beginning the process of benchmarking reassembly algorithms. Does
anybody have any thoughts or suggestions? I should say that my main
interest is short read reassembly.
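To make the "align within a given edit distance" step concrete, here is
a minimal sketch of one common formulation, not necessarily the one my
tool uses: the contig is aligned against any substring of the original
sequence, so gaps before and after the contig in the reference are free.
The name best_edit_distance is hypothetical.

```python
def best_edit_distance(contig, reference):
    """Minimum edit distance between contig and any substring of reference.

    Standard dynamic programming with a free start and free end in the
    reference (unit cost for substitution, insertion and deletion).
    """
    # Row for the empty contig prefix: it matches anywhere at cost 0.
    previous = [0] * (len(reference) + 1)
    for i, c in enumerate(contig, start=1):
        # Aligning i contig bases against nothing costs i.
        current = [i] + [0] * len(reference)
        for j, r in enumerate(reference, start=1):
            current[j] = min(
                previous[j - 1] + (c != r),  # match or substitution
                previous[j] + 1,             # contig base unmatched
                current[j - 1] + 1,          # reference base unmatched
            )
        previous = current
    # The alignment may end at any reference position.
    return min(previous)
```

A contig would then be counted as aligned when the returned value is at
or below the chosen threshold. This runs in time proportional to
len(contig) * len(reference), which may be too slow for long sequences.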
Secondly, I'm having a problem with GAP4. It only seems to load
19 sequences from my fasta file. My fasta file looks like this:
However if I include any more than 19 sequences in my fasta file I
get the following error:
/home/new/A1.fasta (UNK) 'init: Unknown file type'
Is this a bug? Or am I doing something wrong?
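One way to rule out a malformed file, independently of GAP4, is to count
the records and look for obvious formatting problems. The sketch below
is only a generic sanity check; summarize_fasta is a hypothetical helper,
and the problems it reports are guesses at common FASTA pitfalls, not
known causes of the error above.

```python
def summarize_fasta(lines):
    """Return (record_count, problems) for an iterable of FASTA lines."""
    records = 0
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if text.startswith(">"):
            records += 1
            if len(text) == 1:
                problems.append(f"line {number}: empty header")
        elif text and records == 0:
            problems.append(f"line {number}: sequence data before first header")
        if line.endswith("\r\n"):
            problems.append(f"line {number}: Windows line ending")
    return records, problems
```

When reading from disk, the file would need to be opened with
open(path, newline="") so that Python does not translate the line
endings before the check sees them.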
Many Thanks for Reading,