New FGENES compared with GENSCAN gene prediction at the CGG group

Victor Solovyev solovyev at sanger.ac.uk
Mon Jul 20 09:56:29 EST 1998


	FGENES and comparisons with GENSCAN
The new version of FGENES, a program for multiple gene prediction, is available
on the Computational Genomic Group WEB server at http://genomic.sanger.ac.uk/
	(http://genomic.sanger.ac.uk/gf/gf.html)

The server also presents performance data for Fgenes compared with GENSCAN
on several data sets:


Results of prediction on a nonredundant dataset of 660 Human sequences
(Salamov, Solovyev, 1998). This set and the set of multiple genes will be
available by ftp soon.
    Total number of exons: 3088, nucleotides: 5727044

                              Results averaged over all genes:

                                     GENSCAN:

     Sne - 69.8  Spe - 71.1  Sn_n - 92.2  Sp_n - 89.9  C - 0.89
     no prediction cases - 17
     Init: Observed -  422  Predicted -  370  Correct -  232  55%
     Intr: Observed - 2006  Predicted - 2296  Correct - 1699  85%
     Term: Observed -  422  Predicted -  416  Correct -  289  68%
     Sngl: Observed -  238  Predicted -  170  Correct -  144  61%

                                       Fgenes:

     Sne - 69.2  Spe - 68.4  Sn_n - 86.8  Sp_n - 89.2  C - 0.86
     no prediction cases - 21
     Init: Observed -  422  Predicted -  516  Correct -  276  65%
     Intr: Observed - 2006  Predicted - 2142  Correct - 1596  80%
     Term: Observed -  422  Predicted -  510  Correct -  327  77%
     Sngl: Observed -  238  Predicted -  152  Correct -  128  54%
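
The percentage at the end of each exon line appears to be Correct divided by
Observed, rounded to a whole number. The following is a minimal sketch that
recomputes that column from the counts above; the function name and the
dictionary layout are illustrative assumptions, not part of either program.

```python
def percent_correct(correct, observed):
    """Return Correct/Observed as a whole-number percentage.

    Returns None when no exons of that type were observed.
    """
    if observed == 0:
        return None
    return round(100.0 * correct / observed)


# (observed, predicted, correct) per exon type, copied from the
# 660-sequence tables above.
genscan_660 = {
    "Init": (422, 370, 232),
    "Intr": (2006, 2296, 1699),
    "Term": (422, 416, 289),
    "Sngl": (238, 170, 144),
}
fgenes_660 = {
    "Init": (422, 516, 276),
    "Intr": (2006, 2142, 1596),
    "Term": (422, 510, 327),
    "Sngl": (238, 152, 128),
}

for program, table in (("GENSCAN", genscan_660), ("Fgenes", fgenes_660)):
    for exon_type, (observed, predicted, correct) in table.items():
        pct = percent_correct(correct, observed)
        print(f"{program} {exon_type}: {correct}/{observed} = {pct}%")
```

Note that the Sne and Spe values are reported as averages over genes, so they
cannot be recovered from these pooled counts.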


Predictions for the longest human gene in the tested sets:

   LOCUS HSPEX 242825 bp  Genomic organization of the human PEX gene
   mutated in X-linked dominant hypophosphatemic rickets

GENSCAN and Fgenes each predicted 11 genes for this sequence on both strands

GENSCAN predicted 17 correct exons of 22, with correlation coefficient C=0.55

      Correct exons - 17 Correct+Partially predicted exons - 17 Sn - 79% Sp - 38%

Fgenes predicted 17 correct exons of 22 and the other 7 partially, with
correlation coefficient C=0.74

      Correct exons - 17 Correct+Partially predicted exons - 21 Sn - 90% Sp - 70%


 Remark: 1) Exons predicted by both programs are significantly more reliable than
 those predicted by either one alone; 2) the gene prediction problem is not solved yet
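
One simple way to act on the first remark is to keep only the exons that both
programs report with identical coordinates. The sketch below assumes each
prediction has already been parsed into (start, end, strand) records; the
Exon type, the function name, and the coordinates are made up for
illustration and do not reflect either program's output format.

```python
from typing import List, NamedTuple


class Exon(NamedTuple):
    start: int
    end: int
    strand: str  # "+" or "-"


def consensus_exons(a: List[Exon], b: List[Exon]) -> List[Exon]:
    """Return exons with identical start, end and strand in both lists,
    sorted by position."""
    return sorted(set(a) & set(b))


# Hypothetical predictions; the second exon differs by its start coordinate.
genscan_exons = [Exon(100, 250, "+"), Exon(400, 520, "+"), Exon(900, 1010, "-")]
fgenes_exons = [Exon(100, 250, "+"), Exon(410, 520, "+"), Exon(900, 1010, "-")]

shared = consensus_exons(genscan_exons, fgenes_exons)
for exon in shared:
    print(exon.start, exon.end, exon.strand)
```

Exact matching is the strictest choice; exons that only overlap, such as the
second pair here, are dropped from the consensus.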





Results of prediction on a nonredundant dataset of 570 Vertebrate
sequences (Burset & Guigo, 1996)

 Total number of exons: 2663, nucleotides: 2892140 Human sequences

                            Results are averaged over all genes:

                                       GENSCAN:

     Sne - 77.7  Spe - 80.8  Sn_n - 93.1  Sp_n - 92.8  C - 0.92
     no prediction cases - 8
     Init: Observed -  570  Predicted -  449  Correct -  369  65%
     Intr: Observed - 1523  Predicted - 1688  Correct - 1366  90%
     Term: Observed -  570  Predicted -  487  Correct -  431  76%

                                        Fgenes:

     Sne - 79.4  Spe - 77.4  Sn_n - 90.6  Sp_n - 92.0  C - 0.91
     no prediction cases - 1
     Init: Observed -  570  Predicted -  568  Correct -  418  73%
     Intr: Observed - 1523  Predicted - 1642  Correct - 1319  87%
     Term: Observed -  570  Predicted -  553  Correct -  440  77%
     Sngl: Observed -    0  Predicted -    7  Correct -    0
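
The measures themselves are not defined in this message. Assuming Sn_n, Sp_n
and C are the nucleotide-level sensitivity, specificity and correlation
coefficient as commonly used in gene-finder evaluation (for example in
Burset & Guigo, 1996), they could be computed for one sequence roughly as
sketched below. The function name and the counts are hypothetical, and "Sp"
here is the fraction of predicted coding nucleotides that are truly coding.

```python
import math


def nucleotide_accuracy(tp, tn, fp, fn):
    """Return (Sn, Sp, CC) from per-nucleotide counts.

    tp: coding nucleotides predicted as coding
    tn: noncoding nucleotides predicted as noncoding
    fp: noncoding nucleotides predicted as coding
    fn: coding nucleotides predicted as noncoding
    Undefined ratios are returned as NaN.
    """
    sn = tp / (tp + fn) if (tp + fn) else math.nan
    sp = tp / (tp + fp) if (tp + fp) else math.nan
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    cc = (tp * tn - fn * fp) / denom if denom else math.nan
    return sn, sp, cc


# Made-up counts for a 10 kb sequence, for illustration only.
sn, sp, cc = nucleotide_accuracy(tp=900, tn=8800, fp=200, fn=100)
print(f"Sn_n = {sn:.3f}  Sp_n = {sp:.3f}  C = {cc:.3f}")
```

The tables report these values averaged over all genes, so a per-sequence
result would be computed this way first and then averaged.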

-- 
Victor Solovyev
The Sanger Centre, Hinxton, Cambridge CB10 1SA, UK
Email: solovyev at sanger.ac.uk  http://genomic.sanger.ac.uk
Phone: 44-1223-494799  FAX:   44-1223-494919



