EST clustering and assembling
Phillip San Miguel
pmiguel at purdue.edu
Tue Jan 29 08:26:10 EST 2002
Andrea Hansen wrote:
> Hi,
>
> is there anybody with experience in clustering and assembling of
> EST data? What do you think about CAP3?
> How many sequences can I assemble with this program?
> [...]
I use phrap and CAP3. The problem with phrap (www.phrap.org) is that
it will typically create nonsense contigs where every read in the
contig has 75% of its bases disagreeing with the consensus. I
typically use the following parameters:
    phredPhrap -shatter_greedy -penalty -9 -minscore 50
and they help somewhat, but I generally still end up with at least a
few nonsense contigs. So I use the .fasta.screen and .fasta.screen.qual
files generated by phredPhrap to do a CAP3 assembly. I've assembled
thousands of ESTs this way; tens of thousands should work too. CAP3 is
slower than phrap: it might take 24 hours to run with 7000 ESTs,
whereas phrap will only take a couple of hours. This is on a Sun E450
server with 4 GB of RAM.
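
A minimal sketch of the CAP3 step, assuming the executable is installed
as "cap3" on the PATH and that CAP3 picks up base qualities from a file
named <input>.qual, which is why the .fasta.screen / .fasta.screen.qual
pair can be used directly. The file name ests.fasta.screen and both
helper functions are hypothetical, for illustration only.

```python
import shutil
import subprocess
from pathlib import Path


def build_cap3_command(screened_fasta, cap3_exe="cap3"):
    """Return the argv list for a CAP3 run on a screened FASTA file.

    Assumption: CAP3 reads qualities from <input>.qual, so the
    .fasta.screen / .fasta.screen.qual pair is checked up front.
    """
    fasta = Path(screened_fasta)
    qual = Path(str(fasta) + ".qual")
    if not fasta.is_file():
        raise FileNotFoundError(f"missing sequence file: {fasta}")
    if not qual.is_file():
        raise FileNotFoundError(f"missing quality file: {qual}")
    return [cap3_exe, str(fasta)]


def run_cap3(screened_fasta, cap3_exe="cap3"):
    """Run CAP3 and save its stdout report next to the input file."""
    cmd = build_cap3_command(screened_fasta, cap3_exe)
    if shutil.which(cap3_exe) is None:
        raise RuntimeError(f"{cap3_exe} not found on PATH")
    # Hypothetical report name; CAP3 writes its own output files separately.
    report = Path(str(screened_fasta) + ".cap3.out")
    with report.open("w") as out:
        subprocess.run(cmd, stdout=out, check=True)
    return report


if __name__ == "__main__":
    # Hypothetical input produced by an earlier phredPhrap run.
    print(run_cap3("ests.fasta.screen"))
```

Checking for the quality file before starting matters here because a
CAP3 run on several thousand ESTs can take many hours, and a misnamed
.qual file would otherwise only show up in the finished assembly.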
Phillip SanMiguel
Purdue Genomics Core Facility