DNA-binding protein recognition sequence database?

Dan Jacobson danj at welchgate.welch.jhu.edu
Sat Jan 30 10:52:58 EST 1993


In article <nic-280193154335 at pap.medgen.uu.se> nic at bio.embnet.se (Niclas Jareborg) writes:
>I'm in need of a database of consensus sequence motifs recognized by
>DNA-binding proteins like transcription factors, topoisomerases, or other
>proteins that interact with DNA. 
>Is there such a database?
>
>All help appreciated!
>Nic
>

I answered a similar question on another bionet group yesterday
so I'll include a modified version of that answer here.

There is a Eukaryotic Promoter Database (EPD) and a Transcription Factor
Database (TFD), both of which you can access in a variety of ways.  If you
would like to search either database for keywords or phrases, you can
do so by gopher. (If you don't know what gopher is, write me a note
and I'll send you all the information you need to get it; it's
free and on the net.)

To search the EPD sequence database for keywords, point
your gopher at merlot.welch.jhu.edu and go to the following
directory:

-->  12. Search Databases at Welchlab (Cloning Vectors, Euk. Promoters, NRL../

and in that directory read the About-these-searches file and then select

      -->  4.  EPD - Eukaryotic Promoter Database <?>

Now search for whatever keywords you'd like.  For example,
to retrieve all the entries on the promoters for heatshock proteins,
search for

heatshock

To retrieve only the promoters for heatshock proteins in Drosophila,
search for

heatshock and drosophila


and so on....

To search the TFD database for keywords, point
your gopher at merlot.welch.jhu.edu and go to the following
directory:

 -->  4.  Genbank, PIR, Swiss_PROT and other Database Searches/

and then select

       -->  18. Search TFD <?>

and search for whatever you'd like.
  
If you'd like to retrieve an entire database and use it
for a fasta search, go back to the top directory and select
the following directory:

 -->  2.  FTP Sites For Biology/

and then

      -->  22. NCBI Repository FTP Archive /

and then for the EPD

           -->  5.  EPD/

and then

                -->  4.  db/

then just select the sequence database:

 -->  5.  epd33.seq.

and it will bring it to your system.

It's already in FASTA format, so you can search your promoters
against it directly.
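
Once you have a local copy, something like the gopher keyword search
above ("heatshock and drosophila") can be approximated by scanning the
FASTA header lines yourself.  The following is only a rough sketch in
Python: the function names are made up for illustration, the sample
records are invented, and I am assuming that the words you want to match
actually appear on the ">" header lines of the file, which you should
check against the real epd33.seq.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None and line.strip():
            # Sequence data may be wrapped over several lines.
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def matches_all(header: str, keywords: Iterable[str]) -> bool:
    """True if every keyword occurs in the header (case-insensitive AND)."""
    text = header.lower()
    return all(word.lower() in text for word in keywords)


def search(lines: Iterable[str], keywords: Iterable[str]) -> List[Tuple[str, str]]:
    """Return the records whose header contains all of the keywords."""
    words = list(keywords)
    return [(h, s) for h, s in parse_fasta(lines) if matches_all(h, words)]


# Invented records, only to show the shape of the input.
SAMPLE = [
    ">EXAMPLE1 Drosophila heatshock protein promoter (made-up record)",
    "ACGTACGT",
    "TTGA",
    ">EXAMPLE2 mouse heatshock protein promoter (made-up record)",
    "GGCC",
]

hits = search(SAMPLE, ["heatshock", "drosophila"])
```

For a real file you would pass an open file object instead of SAMPLE,
for example search(open("epd33.seq"), ["heatshock", "drosophila"]).
This is plain substring matching on the headers, not the index that the
gopher server uses, so the results may differ.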

Or for the TFD use the same directory (22. NCBI Repository FTP Archive /)

and then select

      -->  14. TFD/

and so on till you have the files that you'd like.

Alternatively, you can retrieve the EPD and TFD databases by
anonymous ftp from ncbi.nlm.nih.gov, in the /repository/EPD/db/
and /repository/TFD directories.
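
If you would rather script the anonymous ftp transfer, here is a minimal
sketch using Python's standard ftplib module.  The host and directories
are the ones given above and the file name epd33.seq is the one shown in
the gopher menu; the function names are my own, and I have not listed
the TFD file names here, so use the listing function to see what is in
that directory.

```python
from ftplib import FTP

# Host and directories as given in the text above.
HOST = "ncbi.nlm.nih.gov"
EPD_DIR = "/repository/EPD/db"
TFD_DIR = "/repository/TFD"


def retr_command(filename: str) -> str:
    """Build the FTP RETR command for a file in the current directory."""
    return "RETR " + filename


def list_dir(host: str, directory: str) -> list:
    """Log in anonymously and return the file names in a directory."""
    with FTP(host) as ftp:
        ftp.login()          # no arguments means anonymous login
        ftp.cwd(directory)
        return ftp.nlst()


def fetch(host: str, directory: str, filename: str, local_path: str = "") -> str:
    """Download one file by anonymous ftp and return the local path."""
    local_path = local_path or filename
    with FTP(host) as ftp:
        ftp.login()
        ftp.cwd(directory)
        with open(local_path, "wb") as out:
            # Binary transfer copies the file byte for byte.
            ftp.retrbinary(retr_command(filename), out.write)
    return local_path
```

With this, fetch(HOST, EPD_DIR, "epd33.seq") would bring the EPD
sequence file to your current directory, and list_dir(HOST, TFD_DIR)
would show which TFD files are available, assuming the server still
lays out the repository this way.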


Best of luck,

Dan Jacobson
                      
danj at welchgate.welch.jhu.edu

Johns Hopkins University
                        


