Annotated 20000 Promoters in Human Genome in Genome Explorer

victor at softberry.com
Thu Dec 13 17:29:51 EST 2001


This is a reply to completely wrong comments:
 Comments:
1. Going for promoters that can be mapped from 5'-COMPLETE cDNAs is the
undisputed best way to do it.
2. it is not able to detect new genes

  We predict promoters in combination with gene prediction, which
generates known as well as new genes, and we apply TSSW to the 5' regions
of this 40000-gene set from the Human genome draft, not to 5'-complete
cDNAs. So this combination is able to detect new genes.
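
As a rough illustration of the step described above, the 5' region
searched for each predicted gene can be derived from its coordinates and
strand. This is only a sketch: the function name, the 1-based inclusive
coordinates, and the 3000 bp flank (the figure given in the original
announcement quoted below) are assumptions for illustration, and the code
does not call TSSW itself.

```python
def upstream_region(cds_start, cds_end, strand, flank=3000, seq_len=None):
    """Return (start, end) of the region 5' of a predicted gene.

    Coordinates are 1-based and inclusive (an assumption of this sketch);
    cds_start < cds_end are positions on the forward strand.
    """
    if strand == "+":
        # Upstream lies at smaller coordinates, ending just before the ATG.
        start = max(1, cds_start - flank)
        end = cds_start - 1
    elif strand == "-":
        # On the reverse strand the ATG sits at cds_end, so upstream
        # lies at larger forward-strand coordinates.
        start = cds_end + 1
        end = cds_end + flank
        if seq_len is not None:
            end = min(end, seq_len)
    else:
        raise ValueError("strand must be '+' or '-'")
    return start, end


# Hypothetical predicted gene spanning 5000..9000 on each strand.
print(upstream_region(5000, 9000, "+"))
print(upstream_region(5000, 9000, "-"))
```

A promoter predictor would then be run only on the returned interval
rather than on the whole sequence.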

Regards, Victor

==========================================================================
Subject: Re: Annotated 20000 Promoters in Human Genome in Genome Explorer 
From: scherf at gsf.de (Dr. Scherf)
Organization: http://groups.google.com/ 
Date: 26 Jul 2001 09:43:18 -0700 
Newsgroups: bionet.genome.chromosomes, 
--------------------------------------------------------------------------------
References: <200107111750.AA269877430 at mail.softberry.com> 
--------------------------------------------------------------------------------


At Thursday, July 12, 2001, softberry.com introduced their promoter
prediction approach in this newsgroup. The message contained a
comparison to our promoter prediction tool PromoterInspector but they
did not elucidate the differences between the two approaches. Therefore
we would like to add some comments to their original message:

> Promoters were predicted by Softberry promoter prediction program TSSW
> in regions up to 3000 from known starts of coding regions (ATG codon) or
> known mapped 5'-mRNA ends.
> We found that limiting promoter search to such regions drastically
> reduces false positive predictions.
> Also, we have very strong thresholds for prediction of TATA-less
> promoters to minimize false positive predictions.
>
> Our promoter prediction software accurately predicts about 50%
> promoters accurately with a small average deviation from true start
> site. Such accuracy makes possible experimental work with found promoter
> candidates.
>
> For 20 experimentally verified promoters on Chromosome 22, TSSW
> predicted 15, placed 12 of them within (-150,+150) region from true TSS
> and 6 (30% of all promoters) - within -8,+2 region from true TSS.
>
> These results are significantly better than those obtained with
> PromoterInspector program (Scherf M., Klingenhoff A., Frech K. et al.
> (2001) First Pass Annotation of promoters of human chromosome 22.
> Genome Res., 11,333-340), where only 50% promoters from the same sample
> were found, with deviations from true TSS ranging from 200 to 1000 bp.
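
The counts in the quoted message can be turned into rates with a few
lines of arithmetic. The sketch below only restates the numbers given
there (20 verified promoters, 15 predicted, 12 within 150 bp, 6 within
-8,+2); the variable and function names are illustrative.

```python
verified = 20     # experimentally verified promoters on chromosome 22
predicted = 15    # reported as predicted by TSSW
within_150 = 12   # predictions within (-150,+150) of the true TSS
within_8_2 = 6    # predictions within (-8,+2) of the true TSS


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


rates = {
    "predicted": percent(predicted, verified),
    "within_150": percent(within_150, verified),
    "within_8_2": percent(within_8_2, verified),
}
for name, value in rates.items():
    print(f"{name}: {value:.0f}% of {verified}")
```

The last rate reproduces the 30% stated in the quote. These figures
describe hits among known promoters only and say nothing about the number
of false positive predictions.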

Comments:

1. Going for promoters that can be mapped from 5'-COMPLETE cDNAs is the
undisputed best way to do it.

However, the approach described above has some intrinsic problems:

2. it is not able to detect new genes
3. promoters of genes will be missed or falsely predicted because cDNAs
are often 5' incomplete.
4. if they consider 3000 bp upstream of the ATG codon, false
predictions will be made in the case of genes with a non-coding first exon.

Just to give one example out of many others:

The human RCC1 gene (D00591) was found to have 14 exons, 8 of which
(starting from the seventh one) code for the RCC1 protein (Furuno N. et al.,
1991, MEDLINE: 92120669). The first 6 exons (about 23,000 base pairs in
length including the introns) are non-coding. See also
http://genomatix.gsf.de/accounts/Help_gems/full_length_cDNA.html 
We analysed 3000 bp upstream from the ATG of the genomic sequence of
RCC1 with TSSW, which predicted one promoter in that region. Since the
first exon lies more than 20,000 bp further upstream, this prediction
cannot correspond to the known transcription start.
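
The arithmetic behind this example can be sketched as follows. The
coordinates are hypothetical: the TSS is placed at position 1 and the ATG
23,000 bp downstream, using the approximate length of the non-coding
exons given above as a lower bound on the distance. The function name is
an assumption for illustration.

```python
def tss_in_window(tss, atg, window=3000):
    """True if the TSS lies within `window` bp upstream of the ATG
    (forward-strand coordinates assumed)."""
    distance = atg - tss
    return 0 < distance <= window


tss = 1              # hypothetical coordinate of the true TSS
atg = tss + 23000    # approximate lower bound from the exon lengths above
print(tss_in_window(tss, atg))
```

Under these assumptions the true TSS falls outside the 3000 bp search
window, so any promoter reported inside the window lies far from the
start of the first exon.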

5. softberry.com did not comment on false positive predictions in their
message.

6. PromoterInspector was designed to detect promoter regions in
anonymous sequences - without any annotation. It is neither meant to
find, nor does it attempt to find, the exact and entire promoter (which
contains the TSS). What we have in mind with PromoterInspector is to
search for promoters across the whole genome and pin them down
approximately to a region, which other tools can then analyse in detail.
Therefore, given our unique prediction quality, a combination of a tool
that searches for the TSS with PromoterInspector might overcome the
problems mentioned above.

Dr. Matthias Scherf

