New_BCM_Gene_finder_for_mult_genes

Victor Solovyev solovyev at sanger.ac.uk
Thu Oct 16 18:05:41 EST 1997


A new test version of BCM Gene_finder, which searches for multiple genes,
  promoters and polyA sites, is available from:
  
  
http://defrag.bcm.tmc.edu:9503/gene-finder/gf.html
----------------------------------------------------
(LPT WWW main page: http://defrag.bcm.tmc.edu:9503/lpt.html)

  GeneFinder will soon be available from the Sanger Centre WWW site
  
  Comments to my new address: solovyev at sanger.ac.uk


Short Description (abstract for the Atlanta conference, November 1997)



  A new version of GeneFinder for analysis of genomic DNA with multiple genes


    Genome sequencing projects generate long genomic sequences that often
contain several genes. The Baylor College of Medicine GeneFinder programs have
been developed further to predict multiple genes. We calculated new linear
discriminant functions to predict 5'-, internal and 3'-exons for 4 different
G+C compositional sequence groups. We introduce a list of rules a) for
competition between overlapping exons and b) for joining neighboring exons. The
dynamic programming algorithm was changed from a search of an acyclic graph of
compatible exons to a linear-time algorithm that uses knowledge of the maximal
path over preceding exons for each of the 3 possible open reading frames (in one
DNA strand direction). The new program, FGENES, reaches about 93% accuracy
(Sn=92% and Sp=94%) at the nucleotide level and 84% accuracy at the exact exon
prediction level (Sn=84.7% and Sp=84%) on the Burset/Guigo set of 570 genes.
Gene-level accuracy was 57%, which is higher than that observed for existing
programs that do not use sequence homology information. Promoter and poly-A site
prediction is embedded in the program, which helps to recognize multiple genes
located in one sequence. Prediction of TATA-box-containing and TATA-less
promoters will be discussed.
*) Current address:  Sanger Centre, Hinxton, Cambridge CB10 1SA, UK
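
Illustration of the dynamic programming step described in the abstract. This is
only a minimal sketch under stated assumptions, not the FGENES implementation:
the Exon record, its in_phase/out_phase fields, the scores and the function
name best_exon_chain are all hypothetical. It keeps, for each of the 3 reading
frame phases, the best-scoring chain of preceding exons, so that each candidate
exon is extended in constant time once the candidates are sorted. Exon types
(5'-, internal, 3'-), promoters, poly-A sites and the second strand are left
out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Exon:
    """Hypothetical candidate exon (not the FGENES data structure)."""
    start: int      # first base, inclusive
    end: int        # last base, inclusive
    score: float    # e.g. a discriminant-function score (assumed)
    in_phase: int   # codon bases carried in from the previous exon (0..2)
    out_phase: int  # codon bases carried out to the next exon (0..2)


def best_exon_chain(exons: List[Exon]) -> Tuple[float, List[Exon]]:
    """Return the best-scoring chain of non-overlapping, phase-compatible exons."""
    n = len(exons)
    by_start = sorted(range(n), key=lambda i: exons[i].start)
    by_end = sorted(range(n), key=lambda i: exons[i].end)

    chain_score = [float("-inf")] * n
    prev: List[Optional[int]] = [None] * n

    # best[p] = (score, index of last exon) of the best chain seen so far
    # that ends with out_phase p. Phase 0 starts at 0.0: a chain may begin
    # with any exon whose in_phase is 0.
    best: List[Tuple[float, Optional[int]]] = [
        (0.0, None), (float("-inf"), None), (float("-inf"), None)
    ]

    k = 0  # pointer into by_end
    for i in by_start:
        e = exons[i]
        # Commit every exon that ends strictly before this one starts.
        while k < n and exons[by_end[k]].end < e.start:
            j = by_end[k]
            p = exons[j].out_phase
            if chain_score[j] > best[p][0]:
                best[p] = (chain_score[j], j)
            k += 1
        base, pred = best[e.in_phase]
        chain_score[i] = base + e.score
        prev[i] = pred

    # A complete chain must end in phase 0 (assumption of this sketch).
    last: Optional[int] = None
    total = 0.0
    for i in range(n):
        if exons[i].out_phase == 0 and chain_score[i] > total:
            total, last = chain_score[i], i

    chain: List[Exon] = []
    while last is not None:
        chain.append(exons[last])
        last = prev[last]
    chain.reverse()
    return total, chain
```

After the two sorts, the sweep visits each candidate exon once and looks up
only 3 stored values per exon, which is what makes it linear in the number of
candidates, in contrast to examining every compatible pair of exons in a graph.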



