New annotation of Drosophila genes : Infogene
solovyev at sanger.ac.uk
Tue May 2 06:14:54 EST 2000
Gene predictions for the Drosophila genome, produced by the Fgenesh program
(Salamov, Solovyev, 1999), are available on the CGG genomics Web server:
http://genomic.sanger.ac.uk/
Interestingly, the ab initio predictions by Fgenesh yielded 2 thousand more genes
and 8277 more exons, on the large scaffolds alone, than were annotated by the
collaboration of Celera scientists and their academic coauthors.
Because no gene prediction approach is perfect, it will be useful to
analyze all the different predictions to identify new genes.
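One simple way to compare prediction sets is to list the exons from one source that overlap no exon from another. The sketch below assumes each exon has already been reduced to a (scaffold, strand, start, end) tuple with 1-based inclusive coordinates. That layout, the function name and the coordinates are illustrative assumptions, not the INFOGENE format or real data.

```python
from collections import defaultdict


def novel_exons(candidate, reference):
    """Return exons in `candidate` that overlap no exon in `reference`.

    Each exon is a (scaffold, strand, start, end) tuple with 1-based,
    inclusive coordinates (an assumed layout, not the INFOGENE format).
    """
    # Group reference exons by scaffold and strand for quick lookup.
    by_key = defaultdict(list)
    for scaffold, strand, start, end in reference:
        by_key[(scaffold, strand)].append((start, end))

    novel = []
    for scaffold, strand, start, end in candidate:
        others = by_key.get((scaffold, strand), [])
        # Two closed intervals overlap when each starts before the other ends.
        if not any(start <= r_end and r_start <= end for r_start, r_end in others):
            novel.append((scaffold, strand, start, end))
    return novel


# Made-up coordinates, for illustration only.
fgenesh = [("scaf1", "+", 100, 250), ("scaf1", "+", 400, 520), ("scaf2", "-", 10, 90)]
annotated = [("scaf1", "+", 120, 250), ("scaf2", "+", 10, 90)]

candidates = novel_exons(fgenesh, annotated)
for exon in candidates:
    print(exon)
```

Exons reported this way are only candidates for new genes. Whether they are real still has to be checked experimentally or against similarity data.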
The data can be downloaded by ftp for further analysis.
Chromosomes X, 2, 3, 4:
predicted genes and similarity data in INFOGENE format
predicted proteins in fasta format
predicted exon sequences in fasta format
predicted exon amino acid sequences in fasta format
Because exon prediction is highly accurate, while the assignment of exons to a
particular gene is significantly less accurate, the exon sequences themselves
are valuable for experimental gene verification or other projects.
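Since the predicted proteins and exons are distributed in fasta format, a short reader is enough to start working with them. The sketch below is a minimal parser that assumes plain, uncompressed FASTA text. The record names and sequences are made up, and a real file would be opened with `open(path)` in place of the in-memory example.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up records, for illustration only.
example = io.StringIO(">exon_1 hypothetical\nMKT\nLLV\n>exon_2 hypothetical\nMSA\n")
records = list(read_fasta(example))

# Sequence length per record, keyed by the first word of the header.
lengths = {header.split()[0]: len(seq) for header, seq in records}
print(lengths)
```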
A visual representation of the predicted genes, as well as of ALL KNOWN GENES,
can be seen in the gene-centred database INFOGENE through a Java viewer.
This database includes genes that are often constructed from many GenBank entries.
Divisions with separate collections for model organisms include:
Human genes data * New
Other Primates genes data
Mus musculus genes data
Other Rodentia genes data
Other Mammalia genes data * New
Danio rerio genes data * New
Fugu rubripes genes data
Other Vertebrata genes data
Drosophila melanogaster genes data
Caenorhabditis elegans genes data
Other Invertebrata genes data
Saccharomyces cerevisiae genes data * New
Schizosaccharomyces pombe genes data
Arabidopsis thaliana genes data
Oryza sativa genes data
Zea mays genes data
Other eukaryotes from GB *.pln genes data
Annotation of the Drosophila melanogaster 2.9 Mb ADH region
Included is an automatic annotation of the Drosophila melanogaster ADH 2.9 Mb
genomic region using FGENES and FGENESH: Fgenes predictions, Fgenesh predictions,
CGG1 (a summary prediction using both of the above) and std3 (a manual annotation
based on experimental data, some of it computational, by Ashburner et al. (1999)).
This example shows the problems with genomic annotation: 90% of actual coding
sequences are predicted accurately, but the exons are often combined into genes
very differently from the real genes.
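The gap between accurate exons and wrongly assembled genes can be illustrated with two simple measures: the fraction of true exons that are predicted exactly, and the fraction of true genes whose full exon set matches a predicted gene. The sketch below uses made-up coordinates and assumed definitions. It is not the evaluation used for the ADH region, and the 90% figure above is not derived from it.

```python
def exon_and_gene_sensitivity(true_genes, predicted_genes):
    """Return (exon-level, gene-level) sensitivity.

    Each gene is a list of (start, end) exon tuples. An exon counts as
    found if the same (start, end) appears in any predicted gene. A gene
    counts as found only if some predicted gene has exactly its exon set.
    """
    if not true_genes:
        raise ValueError("true_genes must not be empty")

    true_exons = {exon for gene in true_genes for exon in gene}
    pred_exons = {exon for gene in predicted_genes for exon in gene}
    exon_sn = len(true_exons & pred_exons) / len(true_exons)

    pred_sets = {frozenset(gene) for gene in predicted_genes}
    gene_sn = sum(frozenset(gene) in pred_sets for gene in true_genes) / len(true_genes)
    return exon_sn, gene_sn


# Made-up example: two real genes predicted as one merged gene.
true_genes = [[(100, 200), (300, 400)], [(600, 700), (800, 900)]]
predicted_genes = [[(100, 200), (300, 400), (600, 700), (800, 900)]]

exon_sn, gene_sn = exon_and_gene_sensitivity(true_genes, predicted_genes)
print(exon_sn, gene_sn)
```

In this toy case every exon is found, yet no gene is reconstructed correctly, because the two real genes were merged into a single prediction.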
- You can save an Infogene record using the Action menu and the Obtain Infogene
locus option (with or without sequence)
- Context search is implemented: select the Search fields (among the many specific
lines of the Infogene database) and type your word in the lower left corner.
For example, you can find all genes that have the start of transcription
annotated in GenBank: select Context in the Option menu,
select only the TSP field in SearchFilds, put * in the search window and press Enter.
To see all information about a gene in the locus:
Put the mouse pointer on the gene block
in the upper window, then press and hold the right mouse button
(Shift key + right mouse button click will show this information
permanently).
The LocusInfo button shows the header of the locus, which indicates how many
GenBank entries are used for the gene description.