I'm not much of a molecular biologist, so here goes a naive question.
Apparently, when trying to clone something, one may be uncertain which
reading frame to use when interpreting a partial sequence; also,
sometimes one would like to know whether the sequence lies in an intron.
The question is: has anyone tried to use codon frequencies to infer
which reading frame is the right one, and whether the sequence is a
coding region at all?
If so, roughly how many base pairs are required to get a
statistically significant sample?
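
To make the question concrete, here is a rough Python sketch of the kind
of test I have in mind. It is only an illustration under my own
assumptions, not an established method: codon frequencies are estimated
from known in-frame coding sequences, and each of the three forward
frames of a fragment is scored by a log-likelihood ratio against uniform
codon usage (1/64 per codon). The function names (codon_frequencies,
frame_scores, best_frame) and the pseudocount are made up for this
example, and it says nothing about how long the fragment must be for the
winning frame to be significant.

```python
import math
from collections import Counter
from itertools import product

# All 64 possible codons over the DNA alphabet.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def codon_frequencies(coding_seqs, pseudocount=1.0):
    """Estimate codon frequencies from known in-frame coding sequences.

    The pseudocount (an assumption of this sketch) keeps unseen codons
    from getting probability zero.
    """
    counts = Counter()
    for seq in coding_seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if set(codon) <= set("ACGT"):  # skip ambiguous bases such as N
                counts[codon] += 1
    total = sum(counts[c] for c in CODONS) + pseudocount * len(CODONS)
    return {c: (counts[c] + pseudocount) / total for c in CODONS}


def frame_scores(seq, freq):
    """Return one score per forward frame (offsets 0, 1, 2).

    Each score is the summed log-likelihood ratio of the codons read in
    that frame, comparing the estimated usage with uniform usage (1/64).
    """
    seq = seq.upper()
    scores = []
    for offset in range(3):
        score = 0.0
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in freq:  # codons with ambiguous bases are ignored
                score += math.log(freq[codon] * 64)
        scores.append(score)
    return scores


def best_frame(seq, freq):
    """Return the offset (0, 1 or 2) of the highest-scoring frame."""
    scores = frame_scores(seq, freq)
    return max(range(3), key=scores.__getitem__)
```

If all three scores come out near or below zero, that might be taken as
a hint that the fragment is non-coding (for example, an intron), but I
do not know how reliable that would be in practice. The reverse strand
would need the same treatment on the reverse complement.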
Thanks,
Tony Zador
zador at yale.edu