Transcription Factor Databases

Dan S. Prestridge danp at BEAGLE.COLORADO.EDU
Mon Feb 24 11:44:33 EST 1992

>I'm very interested in being able to use this database and would greatly
>appreciate it if anyone who has had experience with it would share their
>knowledge with us.  The TIBS article described many
>improvements in the overall concept of the database but, unlike the NAR
>paper, there was no advice on how to get practical use out of anything
>other than the SITES table (basically, just feed it to your favorite
>search program, like GCG Find).  In fact, most of the really exciting
>applications seem to be mostly pie-in-the-sky work-in-progress.  How do
>the rest of you working in transcription regulation feel about this?  The TFD
>concept is the only suggestion I've encountered for dealing with the mountain
>of data that seems to have accumulated.

There are at present two computer-based transcription factor
databases: Ghosh's TFD, as you have mentioned, and another
developed by Edgar Wingender (Adv. Mol. Gen. 4, 95-108, 1991).
They are similar in size and content.  The current advantages of
Ghosh's TFD are that it is easily available, the sites
information comes in a format that can be used by the GCG
programs, and it is used by my own program, SIGNAL SCAN (CABIOS
7: 203-206, 1991).  The current application of transcription
factor databases in locating functional transcriptional elements
or promoter sequences is limited.  Searches of any DNA sequence
using the databases reveal a very high number of false positive
sites (Wingender et al., Adv. Mol. Gen. 4, 95-108, 1991;
Prestridge and Burks, 1992, in prep.).  Although approximately
50% more transcription factor sites are found in promoter
sequences than in non-promoter sequences, the predictions of
transcription factor binding sites cannot be used to directly
discriminate promoter from non-promoter sequences (Prestridge
and Burks, 1992, in prep.).  Work is now underway to attempt to
use patterns of these elements to recognize promoter sequences.
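
To make the false-positive problem concrete, here is a minimal
sketch of the kind of site search described above.  It is not
SIGNAL SCAN and does not read the TFD SITES format; the three
consensus strings are commonly quoted textbook examples chosen for
illustration, and the function names (scan, expected_chance_hits)
are made up for this sketch.  The chance-hit estimate assumes
random sequence with equal base frequencies, which real DNA is not.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Illustrative consensus sites only; NOT copied from the TFD SITES table.
SITES = {"TATA box": "TATAAA", "CCAAT box": "CCAAT", "GC box": "GGGCGG"}


def reverse_complement(seq):
    """Reverse complement of a sequence or IUPAC consensus string."""
    return seq.translate(COMPLEMENT)[::-1]


def site_regex(consensus):
    """Compile a consensus into a regex; the lookahead keeps overlapping hits."""
    return re.compile("(?=(" + "".join(IUPAC[c] for c in consensus) + "))")


def scan(sequence, sites=SITES):
    """Return sorted (position, strand, name) hits.

    Positions are 1-based and always counted along the + strand.
    """
    sequence = sequence.upper()
    hits = []
    for name, consensus in sites.items():
        for strand, pattern in (("+", consensus),
                                ("-", reverse_complement(consensus))):
            for match in site_regex(pattern).finditer(sequence):
                hits.append((match.start() + 1, strand, name))
    return sorted(hits)


def expected_chance_hits(length, consensus):
    """Expected matches on both strands in random, equal-frequency sequence.

    Palindromic sites are counted twice, so treat this as a rough estimate.
    """
    p = 1.0
    for code in consensus:
        allowed_bases = len(IUPAC[code].strip("[]"))
        p *= allowed_bases / 4
    windows = max(length - len(consensus) + 1, 0)
    return 2 * windows * p


if __name__ == "__main__":
    print(scan("GGTATAAAGG"))
    print(expected_chance_hits(1000, "TATAAA"))
```

Under that crude model a single 6-base site is expected about
once in every two kilobases searched on both strands, so a table
holding many short sites will produce matches almost anywhere,
whether or not they are functional.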

The best current known uses of the transcription factor
databases are:

1) You have a sequence that binds a protein and you do not
know what protein it may be.  In this case several investigators
have reported to me success in using these databases to find
candidate proteins.

2) You have a sequence that has some known regulatory properties
and want to find specific sites that might be involved with that

3) You have a known promoter sequence and want to find
possible regulatory sites that you may not be aware of.

The use of transcription factor databases alone will probably
never be able to distinguish functional transcriptional elements
from non-functional sites that mimic them, because transcription
factors need to interact with other factors at other sites
(Johnson and McKnight, Annu. Rev. Biochem., 58: 799-839) to
define their functionality.  Making that distinction will require
the development of computer programs to recognize functional
patterns of these elements.
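
As a toy illustration of what looking for a pattern of elements
could mean, the sketch below keeps only pairs of predicted sites
in which one named site lies a short distance upstream of another.
The rule, the site names, the 100-base window and the function
name paired_sites are all hypothetical, chosen for the example;
this is not a published or tested promoter recognition method.

```python
from typing import List, Tuple

Hit = Tuple[int, str]  # (1-based position on the sequence, site name)


def paired_sites(hits: List[Hit], upstream: str, downstream: str,
                 max_gap: int) -> List[Tuple[int, int]]:
    """Return (upstream_pos, downstream_pos) pairs within max_gap bases.

    A pair qualifies only if the downstream site starts after the
    upstream site and no more than max_gap bases away from it.
    """
    ups = sorted(pos for pos, name in hits if name == upstream)
    downs = sorted(pos for pos, name in hits if name == downstream)
    return [(u, d) for u in ups for d in downs if 0 < d - u <= max_gap]


if __name__ == "__main__":
    # Hypothetical hit list, as a site search program might report it.
    example_hits = [(10, "CCAAT box"), (60, "TATA box"),
                    (500, "TATA box"), (700, "CCAAT box")]
    print(paired_sites(example_hits, "CCAAT box", "TATA box", 100))
```

Of the four isolated hits in the example, only the pair at
positions 10 and 60 survives the filter.  Whether any such rule
separates functional from non-functional sites is exactly the
open question.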

I hope this helps.  If anyone wishes to get a copy of SIGNAL SCAN (which
includes the current Ghosh TFD) please contact me.  Dr. Edgar Wingender
can be contacted at ewi at (he has a PC
version of his database available).



