New FGENES compared with GENSCAN: gene prediction at the CGG group

Victor Solovyev solovyev at sanger.ac.uk
Wed Jul 22 23:43:51 EST 1998


	FGENES and comparisons with GENSCAN
The new version of FGENES, a program for multiple gene prediction, is
available on the Computational Genomic Group web server at
http://genomic.sanger.ac.uk/ (http://genomic.sanger.ac.uk/gf/gf.html).

Performance data for Fgenes compared with GENSCAN are presented there
for several data sets:

Results of prediction on a nonredundant dataset of 660 human sequences
(Salamov, Solovyev, 1998). This set and a set of multiple-gene sequences
will be available by ftp soon.
    Total number of exons: 3088, nucleotides: 5727044

                              Results averaged over all genes:

                                     Genscan:

     Sne- 69.8 Spe- 71.1 Sn_n- 92.2 Sp_n- 89.9 C- 0.89
     no prediction cases - 17
     Init: Observed - 422  Predicted - 370  Correct - 232  55%
     Intr: Observed - 2006 Predicted - 2296 Correct - 1699 85%
     Term: Observed - 422  Predicted - 416  Correct - 289  68%
     Sngl: Observed - 238  Predicted - 170  Correct - 144  61%

                                     Fgenes:

     Sne- 69.2 Spe- 68.4 Sn_n- 86.8 Sp_n- 89.2 C- 0.86
     no prediction cases - 21
     Init: Observed - 422  Predicted - 516  Correct - 276  65%
     Intr: Observed - 2006 Predicted - 2142 Correct - 1596 80%
     Term: Observed - 422  Predicted - 510  Correct - 327  77%
     Sngl: Observed - 238  Predicted - 152  Correct - 128  54%
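
The percentage at the end of each exon-class row appears to be Correct
divided by Observed (for example, 276/422 = 65% for Fgenes initial exons).
The short Python sketch below recomputes that figure from the Fgenes counts
above and also reports Correct divided by Predicted. The function name and
the second ratio are illustrative additions, not part of the original
report.

```python
# Counts copied from the Fgenes rows above: (observed, predicted, correct).
FGENES_660 = {
    "Init": (422, 516, 276),
    "Intr": (2006, 2142, 1596),
    "Term": (422, 510, 327),
    "Sngl": (238, 152, 128),
}


def exon_class_rates(counts):
    """Return {exon class: (pct of observed exons found,
                            pct of predicted exons that are correct)}."""
    rates = {}
    for exon_class, (observed, predicted, correct) in counts.items():
        # Guard against empty classes (e.g. a dataset with no single-exon genes).
        pct_observed = 100.0 * correct / observed if observed else float("nan")
        pct_predicted = 100.0 * correct / predicted if predicted else float("nan")
        rates[exon_class] = (pct_observed, pct_predicted)
    return rates


if __name__ == "__main__":
    for name, (found, right) in exon_class_rates(FGENES_660).items():
        print(f"{name}: {found:.0f}% of observed, {right:.0f}% of predicted")
```

The two ratios answer different questions: the first says how many real
exons of a class were found, the second how many predictions of that class
can be trusted.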


Predictions for the longest human gene in the tested sets:

   LOCUS HSPEX 242825 bp Genomic organization of the human PEX gene
   mutated in X-linked dominant hypophosphatemic rickets

Genscan and Fgenes each predicted 11 genes for this sequence across both
strands.

Genscan predicted 17 correct exons of 22 with correlation coefficient
C=0.55

     Correct exons - 17  Correct+Partially predicted exons - 17
     Sn- 79%  Sp- 38%

Fgenes predicted 17 correct exons of 22 and the other 7 partially, with
correlation coefficient C=0.74

     Correct exons - 17  Correct+Partially predicted exons - 21
     Sn- 90%  Sp- 70%
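
For readers unfamiliar with the measures quoted here, the sketch below
shows how Sn, Sp and the correlation coefficient C are commonly computed at
the nucleotide level in gene-prediction benchmarks. It assumes the standard
definitions used in evaluations such as Burset & Guigo (1996); the post
does not state the exact formulas behind its numbers, and the function name
and example counts are hypothetical.

```python
import math


def accuracy_measures(tp, tn, fp, fn):
    """Nucleotide-level accuracy from a 2x2 confusion table.

    tp: coding nucleotides predicted as coding
    tn: noncoding nucleotides predicted as noncoding
    fp: noncoding nucleotides predicted as coding
    fn: coding nucleotides predicted as noncoding
    """
    sn = tp / (tp + fn)  # sensitivity: share of true coding bases found
    sp = tp / (tp + fp)  # specificity (as used in gene finding): share of
                         # predicted coding bases that are truly coding
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    # C is undefined when any marginal total is zero.
    cc = (tp * tn - fn * fp) / denom if denom else float("nan")
    return sn, sp, cc


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    sn, sp, cc = accuracy_measures(tp=90, tn=880, fp=20, fn=10)
    print(f"Sn={sn:.2f} Sp={sp:.2f} C={cc:.2f}")
```

Because C combines all four cells of the table, it penalizes both missed
coding regions and overprediction, which is why two programs with the same
number of correct exons can still differ in C.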


 Remarks: 1) Exons predicted by both programs are significantly more
 reliable than those predicted by either one alone; 2) the gene
 prediction problem is not solved yet.
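
One simple way to act on the first remark is to keep only the exons that
both programs report with identical coordinates. The sketch below is a
minimal illustration under that assumption; the coordinates are invented,
the function name is hypothetical, and neither program is claimed to emit
this data structure.

```python
def consensus_exons(predictions_a, predictions_b):
    """Return the exons present in both prediction lists.

    Each exon is a (start, end, strand) tuple; only exact coordinate
    matches on the same strand count as agreement.
    """
    return sorted(set(predictions_a) & set(predictions_b))


if __name__ == "__main__":
    # Invented coordinates, for illustration only.
    genscan_exons = [(1200, 1350, "+"), (2100, 2290, "+"), (5000, 5120, "-")]
    fgenes_exons = [(1200, 1350, "+"), (2100, 2250, "+"), (5000, 5120, "-")]
    for start, end, strand in consensus_exons(genscan_exons, fgenes_exons):
        print(start, end, strand)
```

Exact matching is deliberately strict: an exon with one shifted boundary,
like the second one in the example, is dropped rather than treated as
partial agreement.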





Results of prediction on a nonredundant dataset of 570 vertebrate
sequences (Burset & Guigo, 1996)

 Total number of exons: 2663, nucleotides: 2892140 Human sequences

                            Results are averaged over all genes:

                                       Genscan:

     Sne- 77.7 Spe- 80.8 Sn_n- 93.1 Sp_n- 92.8 C- 0.92
     no prediction cases - 8
     Init: Observed - 570  Predicted - 449  Correct - 369  65%
     Intr: Observed - 1523 Predicted - 1688 Correct - 1366 90%
     Term: Observed - 570  Predicted - 487  Correct - 431  76%

                                        Fgenes:

     Sne- 79.4 Spe- 77.4 Sn_n- 90.6 Sp_n- 92.0 C- 0.91
     no prediction cases - 1
     Init: Observed - 570  Predicted - 568  Correct - 418  73%
     Intr: Observed - 1523 Predicted - 1642 Correct - 1319 87%
     Term: Observed - 570  Predicted - 553  Correct - 440  77%
     Sngl: Observed - 0    Predicted - 7    Correct - 0

-- 
Victor Solovyev
The Sanger Centre, Hinxton, Cambridge CB10 1SA, UK
Email: solovyev at sanger.ac.uk  http://genomic.sanger.ac.uk
Phone: 44-1223-494799  FAX:   44-1223-494919



