New annotation of Drosophila genes: Infogene

Victor Solovyev solovyev at
Tue May 2 06:14:54 EST 2000

 Gene predictions for the Drosophila genome are available at:
            on the CGG genomics web server:

 The predictions were made by the Fgenesh program (Salamov, Solovyev, 1999).

Interestingly, ab initio predictions by Fgenesh produced 2 thousand more genes
and 8277 more exons, on the large scaffolds alone, than were annotated by the
collaboration of Celera scientists and academic coauthors. Because no gene
prediction approach is perfect, it will be useful to analyze all the different
predictions to identify new genes.

 You can ftp the data for your further analysis!
  Chromosomes X, 2, 3, 4

  predicted genes and similarity data in INFOGENE format 
  predicted proteins in fasta format 
  predicted exon sequences in fasta format 
  predicted exon amino acid sequences in fasta format 
Because exon prediction is highly accurate, while the assignment of exons to a
particular gene is significantly less accurate, the exon sequences themselves
are valuable for experimental gene verification and other projects.
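
As a starting point for working with the fasta files listed above, here is a
minimal Python sketch that reads fasta records into a dictionary. It is not
part of the Infogene distribution: the function name read_fasta and the example
headers and sequences are invented for illustration, and nothing is assumed
about the real header layout of the predicted exon files.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect fasta records as {first word of header: sequence}."""
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical records, not taken from the real prediction files.
example = """>exon_1 invented description
ATGGCC
TTAA
>exon_2
ATGAAATGA
""".splitlines()

records = read_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

For a real file, the same function can be given an open file handle instead of
the example lines.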
  A visual representation of the predicted genes, as well as ALL KNOWN GENES,
can be seen in the gene-centred database INFOGENE through a Java viewer.
    This database includes genes that are often constructed from many GenBank
entries. Divisions with separate collections for model organisms include:

Human genes data *New
Other Primates genes data
Mus musculus genes data
Other Rodentia genes data
Other Mammalia genes data *New
Danio rerio genes data *New
Fugu rubripes genes data
Other Vertebrata genes data
Drosophila melanogaster genes data
Caenorhabditis elegans genes data
Other Invertebrata genes data
Saccharomyces cerevisiae genes data *New
Schizosaccharomyces pombe genes data
Arabidopsis thaliana genes data
Oryza sativa genes data
Zea mays genes data
Other eukaryotes from GB *.pln genes data

Annotation of Drosophila melanogaster 2.9 Mb ADH region

Included is an automatic annotation of the Drosophila melanogaster ADH 2.9 Mb
genomic region using FGENES and FGENESH: Fgenes predictions, Fgenesh
predictions, CGG1 (a summary prediction using both of the above) and std3 (a
manual annotation based on experimental data, with some computational data, by
Ashburner et al. (1999)).
  This example shows a problem with genomic annotation: 90% of actual coding
sequences are predicted accurately, but exons are often combined very
differently from the real genes.
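
The difference between exon-level and gene-level accuracy can be illustrated
with a small Python sketch. It is only an illustration, not the procedure used
for the ADH comparison: the coordinates, gene names and the two helper
functions are invented, and an exon counts as correct only when both of its
boundaries match exactly.

```python
from typing import Dict, List, Tuple

Exon = Tuple[int, int]          # (start, end), invented coordinates
GeneSet = Dict[str, List[Exon]]


def exon_sensitivity(annotated: GeneSet, predicted: GeneSet) -> float:
    """Fraction of annotated exons that appear in any predicted gene."""
    ann = {e for exons in annotated.values() for e in exons}
    pred = {e for exons in predicted.values() for e in exons}
    return len(ann & pred) / len(ann) if ann else 0.0


def gene_sensitivity(annotated: GeneSet, predicted: GeneSet) -> float:
    """Fraction of annotated genes whose full exon set is predicted as one gene."""
    structures = {frozenset(exons) for exons in predicted.values()}
    hits = sum(1 for exons in annotated.values() if frozenset(exons) in structures)
    return hits / len(annotated) if annotated else 0.0


# Toy data: all exons are found, but the predictor merges parts of two genes.
annotated = {
    "geneA": [(100, 200), (300, 400)],
    "geneB": [(600, 700), (800, 900)],
}
predicted = {
    "pred1": [(100, 200), (300, 400), (600, 700)],
    "pred2": [(800, 900)],
}

exon_score = exon_sensitivity(annotated, predicted)
gene_score = gene_sensitivity(annotated, predicted)
print("exon level:", exon_score)
print("gene level:", gene_score)
```

In this toy case every annotated exon is recovered, yet no annotated gene is
reproduced, because one prediction joins exons that belong to two genes.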

- You can save an Infogene record using the Action menu and the Obtain Infogene
  locus option (with or without sequence).
- Context search is implemented: select the search fields (among the many
  specific lines of the Infogene database) and type your word in the lower
  left corner.
  For example, you can find all genes which have the start of transcription
  annotated in GenBank: select Context in the Option menu,
  select only the TSP field in SearchFields, put * in the search window and
  press Enter.

To see all information about a gene in the locus:
    Put the mouse pointer on the gene block
   in the upper window, then press and hold the right mouse button
   (Shift + right click will show this information permanently).
The LocusInfo button shows the locus header, which indicates how many GenBank
entries are used for the gene description.

