Hello all,
thanks, Don, for setting up this list.
I'm just finishing my PhD in Lausanne, Switzerland, working with some EST data from ants (the closest high-quality genomes are those of the Nasonia wasp and the honeybee).
One issue with 454 data is homopolymer errors: a run such as AAAAAAA may be called as AAAAAA or AAAAAAAA by the 454 basecaller. Within a coding sequence, such an error causes a frameshift and thus a bad protein model. It should be possible to correct this kind of error (and insertions/deletions in EST data in general) by using alignments obtained from blastx against a database of good proteins.
Have any list members done this?
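To make the idea concrete, here is a minimal Python sketch of the kind of correction I have in mind. It is only a toy under several assumptions: the function names are hypothetical, the EST is assumed to start in frame 0, at most one homopolymer error is fixed, and candidates are scored by simple position-wise identity to a reference protein rather than by a real blastx alignment.

```python
from itertools import groupby, product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate frame 0 of a nucleotide string; trailing partial codon is dropped."""
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X")
                   for i in range(0, len(nt) - 2, 3))


def identity(protein, reference):
    """Crude stand-in for an alignment score: count of matching positions."""
    return sum(a == b for a, b in zip(protein, reference))


def homopolymer_runs(nt, min_len=3):
    """Yield (start, length) for every run of one base at least min_len long."""
    pos = 0
    for _base, group in groupby(nt):
        length = len(list(group))
        if length >= min_len:
            yield pos, length
        pos += length


def correct_homopolymer(est, reference, min_len=3):
    """Try lengthening or shortening each homopolymer run by one base and
    keep the variant whose translation best matches the reference protein.
    The input is returned unchanged unless a variant scores strictly higher."""
    best_seq, best_score = est, identity(translate(est), reference)
    for start, _length in homopolymer_runs(est, min_len):
        base = est[start]
        longer = est[:start] + base + est[start:]    # undo a suspected undercall
        shorter = est[:start] + est[start + 1:]      # undo a suspected overcall
        for candidate in (longer, shorter):
            score = identity(translate(candidate), reference)
            if score > best_score:
                best_seq, best_score = candidate, score
    return best_seq


# Made-up example: the true sequence has six A's, the read has only five.
read = "ATGGCTAAAAACTGGATTGC"
reference_protein = "MAKKLDC"
fixed = correct_homopolymer(read, reference_protein)
print(translate(read))   # MAKNWI  (frameshifted after the run)
print(translate(fixed))  # MAKKLDC
```

A real implementation would presumably take the frameshift positions from the blastx alignment itself instead of trying every run, and would have to cope with all six frames and several errors per read.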
Kind regards,
Yannick
--------------------------------------------
yannick . wurm @ unil . ch
Ant Genomics, Ecology & Evolution @ Lausanne
http://www.unil.ch/dee/page28685_fr.html