Genetic analysis program

tom chappell t.chappell at ucl.ac.uk
Thu Jul 23 08:02:25 EST 1998


> Aaron J Beverley (ajbever at powerup.com.au) wrote:
> : I am considering writing a program which takes an arbitrary length of
> : DNA or RNA and will return possible genes from these genomic sequences.
> : I hope it will be able to determine the most probable ORF's, introns,
> : exons and possibly determine firstly a primary protein sequence then
> : secondary sequence.
> : I'm not sure if this sort of program could be useful for anyone, or if I
> : am being a bit ambitious, this is why I am asking for opinions or other
> : ideas as to areas in which I can apply constraint programming in
> : analysing genomic sequences.
> : I have completed a degree in biochemistry so have a fair idea about
> : genetic structure and function and I am now half way through a degree in
> : computing. This program I hope to develop is part of my advanced studies
> : but if the program is useful and works I would like to set it up as a
> : fully supported program.
> 
> : Aaron.
> 

Also look at http://exon.gatech.edu/ for Borodovsky's Bayesian
statistics stuff.

The problem you are describing is actually very complex once one gets beyond
simple eukaryotic organisms like yeast.
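
To illustrate the easy end of the problem, here is a minimal Python sketch of a naive ORF scan. It assumes an intron-free DNA sequence and looks at one strand only; the function name find_orfs and the min_codons parameter are made up for this example. It reports every ATG that runs to the first in-frame stop codon in each of the three forward frames.

```python
# Naive single-strand ORF scan: ATG followed by the first in-frame stop.
# Assumption: the input is intron-free DNA; the reverse strand is ignored.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) pairs, 0-based, end exclusive.

    The stop codon is included in the reported span, and min_codons
    counts both the start codon and the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)


if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGGG"))
```

A scan like this is only reasonable where genes are mostly uninterrupted. It knows nothing about splice sites, so once a coding region is broken up by introns it will either miss the gene or report fragments.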

I've been helping my wife with a gene that she has been working on which
spans about 60-70 kilobases of human genomic sequence and has multiple
splice choices at a number of positions. I have found nothing available
that can predict the actual cDNA sequences derived from this genomic
region. The cDNAs (from tissue-specific mRNAs ranging in size from 4.5
to 6 kilobases) have exons as small as 30 bases, and the introns are as
large as 20 kilobases.



