Genetic analysis program

tom chappell t.chappell at
Thu Jul 23 08:02:25 EST 1998

> Aaron J Beverley (ajbever at wrote:
> : I am considering writing a program which takes an arbitrary length of
> : DNA or RNA and will return possible genes from these genomic sequences.
> : I hope it will be able to determine the most probable ORF's, introns,
> : exons and possibly determine firstly a primary protein sequence then
> : secondary sequence.
> : I'm not sure if this sort of program could be useful for anyone, or if I
> : am being a bit ambitious, this is why I am asking for opinions or other
> : ideas as to areas in which I can apply constraint programming in
> : analysing genomic sequences.
> : I have completed a degree in biochemistry so have a fair idea about
> : genetic structure and function and I am now half way through a degree in
> : computing. This program I hope to develop is part of my advanced studies
> : but if the program is useful and works I would like to set it up as a
> : fully supported program.
> : Aaron.

Also look at for Borodovsky's Bayesian
statistics stuff.

The problem you are describing is actually very complex once one gets
beyond simple eukaryotic organisms like yeast.

I've been helping my wife with a gene that she has been working on, which
spans about 60-70 kilobases of human genomic sequence and has multiple
splice choices at a number of positions. I have found nothing available
that can predict the actual cDNA sequences derived from this genomic
region. The cDNAs (from tissue-specific mRNAs ranging in size from 4.5
to 6 kilobases) have exons as small as 30 bases, and the introns are as
large as 20 kilobases.
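
To make the difficulty concrete, here is a minimal Python sketch of the
naive approach: scan the three forward reading frames for ATG-to-stop
open reading frames. The function name find_orfs and the min_codons
cutoff are my own illustrative assumptions, not taken from any existing
tool. A scan like this knows nothing about splice sites, so it is only
meaningful on intron-free sequence, and an exon of 30 bases (at most 10
codons) falls below any sensible length cutoff.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    An ORF here is an ATG followed in-frame by a stop codon.
    start/end are 0-based, end is exclusive and includes the stop codon.
    min_codons is a hypothetical cutoff counted from ATG up to,
    but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    # A made-up 30-base "exon": found only if the cutoff is lowered.
    tiny_exon = "ATG" + "GCT" * 8 + "TAA"
    print(find_orfs(tiny_exon))                # default cutoff
    print(find_orfs(tiny_exon, min_codons=5))  # relaxed cutoff
```

Lowering min_codons enough to catch a 30-base exon would also let through
a great many short chance ORFs in 60-70 kilobases of genomic sequence, so
a real gene finder needs splice-site and coding-statistics models on top
of this kind of scan.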
