Translation of protein sequence to nucleotide sequence

Ewan Birney birney at
Tue Aug 25 17:54:33 EST 1998

On 11 Aug 1998, Michele Clamp wrote:

> mcbaet at MCBSGS1.IMCB.NUS.EDU.SG (Anthony Ting) writes:
> > Hi there,
> > 
> >     Is anyone aware of a program that translate a protein sequence into
> > a nucleotide sequence given a specific codon usage table?
> > 
> The only thing I can think of is GCG's backtranslate which will give
> you the most and least probable DNA sequence.
> On a similar note and while I'm thinking of it does anyone out there
> know of something that will compare 2 protein sequences at the DNA
> level and give you the most likely DNA alignment.  I'd find this
> useful for finding possible undetected frame shift errors in multiple
> alignments.
> Thinking about it this is trickier than I first imagined and not just 
> a quick perl hack. 
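
For the original question (protein to nucleotide given a codon usage table), a minimal Python sketch of the "most and least probable DNA sequence" idea might look like the following. It is not GCG's backtranslate: the `CODON_USAGE` table, its frequencies, and the `back_translate` name are all made up for illustration, and a real run would need a full table for the organism of interest.

```python
# Hypothetical codon usage table: amino acid -> {codon: relative frequency}.
# The numbers are illustrative placeholders, not real usage data.
CODON_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.7, "AAG": 0.3},
    "F": {"TTT": 0.4, "TTC": 0.6},
    "*": {"TAA": 0.5, "TAG": 0.2, "TGA": 0.3},
}


def back_translate(protein, usage, pick=max):
    """Back-translate a protein string one residue at a time.

    pick=max takes the most frequent codon for each residue,
    pick=min takes the least frequent one.
    """
    codons = []
    for aa in protein.upper():
        if aa not in usage:
            raise KeyError(f"no codons listed for residue {aa!r}")
        table = usage[aa]
        # max/min over the codon names, ranked by their frequency
        codons.append(pick(table, key=table.get))
    return "".join(codons)


most_probable = back_translate("MKF*", CODON_USAGE, max)
least_probable = back_translate("MKF*", CODON_USAGE, min)
```

Picking codons independently per residue gives the single most (or least) likely sequence under the table, but it ignores any context between neighbouring codons.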

Hmmm. I should have responded to this post first!

To detect frame shifts in multiple alignments I would be tempted
to do the

make hmm-> search against underlying dna sequence with frameshift tolerant

BUT I guess you want something done with only the protein sequences
known, not the dna stuff... and so...

well, you can write the algorithm. Probably wouldn't take that long 
(especially using something like... dynamite for the dynamic programming

Are there going to be many cases where you know the protein and don't
know the dna sequence? Maybe... Maybe I should just knock it up in

(an annoyance is that you would want to drop the sequence you are testing
out of the MA before you built the HMM, making it a bit more icky)
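
The drop-one-out step can be sketched in Python as below. This covers only the bookkeeping, not the HMM building or the frameshift-tolerant search. The `leave_one_out` name and the dict-of-gapped-strings layout are assumptions made for the example, with "-" taken as the gap character.

```python
def leave_one_out(alignment):
    """Yield (name, ungapped_sequence, reduced_alignment) per sequence.

    alignment: dict mapping sequence name -> gapped string, all the
    same length, with "-" as the gap character (an assumed layout).
    """
    for name, row in alignment.items():
        rest = {n: s for n, s in alignment.items() if n != name}
        # Columns that are all gaps once this sequence is removed carry
        # no information for the model, so drop them as well.
        keep = [
            i for i in range(len(row))
            if any(s[i] != "-" for s in rest.values())
        ]
        reduced = {n: "".join(s[i] for i in keep) for n, s in rest.items()}
        yield name, row.replace("-", ""), reduced


ma = {"a": "MK-F", "b": "M-LF", "c": "MK-F"}
folds = {name: (seq, reduced) for name, seq, reduced in leave_one_out(ma)}
```

Each reduced alignment would then be handed to whatever builds the HMM, and the held-out sequence (or its dna) searched against the result.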

Ewan Birney
<birney at>
