Annotated 20000 Promoters in Human Genome
webmaster
webmaster at softberry.com
Wed Jul 11 16:50:48 EST 2001
Annotated 20000 Promoters in Human Genome in Genome Explorer
Promoters predicted in the December draft of the human genome are
presented in Genome Explorer, along with known and predicted genes:
http://softberry.com/genomd/chrvis.html
A right mouse click on a promoter in Genome Explorer reveals the
promoter sequence, presented in two blocks for TATA+ promoters and in
one block for TATA-less promoters. The first block of a TATA+ promoter
is the TATA-box, and the second is the stretch from the predicted
transcription start site (TSS) to the known 5'-end of the mRNA or the
translation start site.
Promoters were predicted by the Softberry promoter prediction program
TSSW in regions up to 3000 bp from known starts of coding regions (ATG
codon) or known mapped 5'-mRNA ends. We found that limiting the
promoter search to such regions drastically reduces false positive
predictions. We also apply very stringent thresholds to the prediction
of TATA-less promoters to minimize false positives.
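The restriction of the search region described above can be sketched
as follows. This is only an illustration, not TSSW code: the function
name, the 1-based inclusive coordinates, the strand handling, and the
assumption that the window lies upstream of the anchor and excludes
the anchor itself are all ours.

```python
from typing import Optional, Tuple


def promoter_search_window(anchor: int,
                           strand: str,
                           max_distance: int = 3000,
                           chrom_length: Optional[int] = None) -> Tuple[int, int]:
    """Return the (start, end) region to scan for a promoter.

    anchor       -- 1-based position of the ATG codon or mapped 5'-mRNA end
    strand       -- "+" or "-" (assumed convention)
    max_distance -- size of the window in bp (3000 in the announcement)
    chrom_length -- optional sequence length used to clip the window

    Coordinates are 1-based and inclusive. The window is assumed to lie
    upstream of the anchor and not to include the anchor position.
    """
    if strand == "+":
        # Upstream means smaller coordinates on the forward strand.
        start = max(1, anchor - max_distance)
        end = anchor - 1
    elif strand == "-":
        # Upstream means larger coordinates on the reverse strand.
        start = anchor + 1
        end = anchor + max_distance
        if chrom_length is not None:
            end = min(end, chrom_length)
    else:
        raise ValueError("strand must be '+' or '-'")
    return start, end
```

For an anchor at position 10000 on the forward strand this gives the
window 7000..9999, which is 3000 bp long; windows are clipped at the
ends of the sequence.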
Our promoter prediction software predicts about 50% of promoters with
a small average deviation from the true start site. Such accuracy
makes experimental work with the predicted promoter candidates
possible.
For 20 experimentally verified promoters on Chromosome 22, TSSW
predicted 15, placing 12 of them within the (-150,+150) region around
the true TSS and 6 (30% of all promoters) within the (-8,+2) region
around the true TSS.
These results are significantly better than those obtained with the
PromoterInspector program (Scherf M., Klingenhoff A., Frech K. et al.
(2001) First pass annotation of promoters on human chromosome 22.
Genome Res., 11, 333-340), where only 50% of promoters from the same
sample were found, with deviations from the true TSS ranging from 200
to 1000 bp.
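The Chromosome 22 figures quoted above can be turned into percentages
of the 20 verified promoters with a few lines of arithmetic. The
sketch below uses only the counts stated in this message; the variable
and function names are ours.

```python
def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, expressed as a percentage."""
    return 100.0 * count / total


# Counts taken from the announcement (Chromosome 22 test set).
verified_promoters = 20   # experimentally verified promoters
predicted_by_tssw = 15    # promoters for which TSSW made a prediction
within_150 = 12           # predictions within (-150,+150) of the true TSS
within_8_2 = 6            # predictions within (-8,+2) of the true TSS

summary = {
    "predicted": percent(predicted_by_tssw, verified_promoters),
    "within (-150,+150)": percent(within_150, verified_promoters),
    "within (-8,+2)": percent(within_8_2, verified_promoters),
}

for label, value in summary.items():
    print(f"{label}: {value:.0f}% of verified promoters")
```

This reproduces the 30% quoted for the (-8,+2) window and shows that
the 15 predicted promoters correspond to 75% of the sample, against
the 50% reported for PromoterInspector.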
Overall, we predicted 17632 TATA+ promoters and 2383 TATA-less
promoters in the human genome draft. For Chromosome 22, we predicted
350 TATA+ promoters and 85 TATA-less promoters.
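As a quick check on how these counts relate to the "20000 promoters"
in the subject line, the totals and the TATA-less share can be
computed directly. The sketch uses only the numbers given above; the
dictionary layout and names are ours.

```python
# Promoter counts from the announcement: (TATA+, TATA-less).
counts = {
    "genome draft": (17632, 2383),
    "chromosome 22": (350, 85),
}

totals = {}
tata_less_share = {}
for region, (tata_plus, tata_less) in counts.items():
    total = tata_plus + tata_less
    totals[region] = total
    # Percentage of predicted promoters that are TATA-less, one decimal.
    tata_less_share[region] = round(100.0 * tata_less / total, 1)
    print(f"{region}: {total} promoters, "
          f"{tata_less_share[region]}% TATA-less")
```

The genome-wide total is 20015 predicted promoters, and Chromosome 22
accounts for 435 of them.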
New Fgenesh++ gene predictions for the December draft of the human
genome are presented by Softberry Inc. (www.softberry.com) at
http://genome.cse.ucsc.edu/goldenPath/decTracks.html
and will soon be presented in Softberry Genome Explorer with some
expression data
(http://softberry.com/genomb/chrvis.html).
The 44409 genes include 5883 genes corresponding to RefSeq mRNAs, 3592
genes corresponding to GenBank mRNAs, 2047 known genes and 302
pseudogenes.
Methods of prediction are described in:
Solovyev V.V. (2001) Statistical approaches in eukaryotic gene
prediction. In: Handbook of Statistical Genetics (eds. Balding D. et
al.), John Wiley & Sons, Ltd., p. 83-127.
---
=====
Moderated
bionet.genome.gene-structure