PDB - GCG or GENBANK accession number tables
Dan Jacobson
danj at welchdev.welch.jhu.edu
Wed Jul 28 17:07:08 EST 1993
In article <28JUL199312344982 at aardvark.ucs.uoknor.edu> bfrank at aardvark.ucs.uoknor.edu (FRANK,BART) writes:
>Can someone suggest a fast method to obtain the amino acid and/or
>nucleotide sequences of particular proteins in pdb? Is there a
>table listing for the pdb numbers and accession numbers for GCG or
>Genbank/EMBL files?
>
For protein sequences you can search NRL_3D (a Protein Sequence-Structure
Database) via gopher. The entire documentation set of each entry is
searchable, so you can search by PDB accession number, title,
keyword ....
Point your gopher client at merlot.welch.jhu.edu and select:
13. Search Databases at Hopkins (Vectors, Promoters, NRL-3D, EST, OMIM ../
--> 10. Sequence Databases (Vectors, EPD, EST, NRL_3D, Kabat, Genbank)/
--> 7. NRL_3D Protein Sequence-Structure Database <?>
Now search for a PDB accession number or a topic of interest - for example
try:
kinase
and you'll see -
--> 1. 2CPKE c-AMP-dependent protein kinase (cAPK) (catalytic.
2. 2CPKI cAMP-dependent protein kinase inhibitor, chain I -.
3. 6ENL Enolase (2-phospho-D-glycerate hydrolase) Complex.
4. 1AK3A Nucleoside-triphosphate--adenylate kinase isoenzyme.
5. 1AK3B Nucleoside-triphosphate--adenylate kinase isoenzyme.
6. 1APK Protein kinase I (domain A) - Bovine #EC-number.
7. 1BPK Protein kinase I (domain B) - Bovine #EC-number.
8. 1CPKE Protein kinase, chain E - Mouse #EC-number 2.7.1.37.
9. 1CPKI cAMP-dependent protein kinase inhibitor, chain I -.
10. 2APK Protein kinase II (domain A) - Bovine #EC-number.
11. 2BPK Protein kinase II (domain B) - Bovine #EC-number.
12. 3ADK Adenylate kinase - Pig #EC-number 2.7.4.3.
13. 3ENL Enolase (2-phospho-D-glycerate hydrolase) (apo) -.
14. 3PGK Phosphoglycerate kinase complex with atp, Magnesium.
15. 4ENL Enolase (2-phospho-D-glycerate hydrolase) (holo) -.
16. 5ENL Enolase (2-phospho-D-glycerate hydrolase) Complex.
17. 7ENL Enolase (2-phospho-D-glycerate hydrolase) Complex.
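If you would rather script the search than use an interactive client, the gopher protocol (RFC 1436) is simple enough to drive by hand: open a TCP connection (port 70 by default), send the selector of the search item, a TAB, the query, and CRLF, then read back tab-separated menu lines. What follows is only a rough sketch in Python. The selector string of the NRL_3D search item is not given here, so any selector you pass in is an assumption you would have to read off the menu yourself, and the function names are made up for illustration.

```python
import socket


def build_search_request(selector, query):
    """Build a gopher type-7 (search) request: selector, TAB, query, CRLF."""
    return f"{selector}\t{query}\r\n".encode("ascii")


def parse_menu_line(line):
    """Split one gopher menu line into its fields.

    A menu line is: type character + display string, TAB, selector,
    TAB, host, TAB, port. Returns None for lines that do not fit.
    """
    line = line.rstrip("\r\n")
    if not line or line == ".":
        return None
    fields = line[1:].split("\t")
    if len(fields) < 4 or not fields[3].strip().isdigit():
        return None
    return {
        "type": line[0],
        "display": fields[0],
        "selector": fields[1],
        "host": fields[2],
        "port": int(fields[3]),
    }


def gopher_search(host, selector, query, port=70, timeout=30):
    """Send a search request and return the parsed menu items.

    The selector must be the real one advertised by the server's menu;
    nothing in this sketch knows what it is.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_search_request(selector, query))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    text = b"".join(chunks).decode("latin-1")
    items = (parse_menu_line(raw) for raw in text.splitlines())
    return [item for item in items if item is not None]
```

Each returned item of type "0" is a text document (here, one NRL_3D entry) that can be fetched by sending its own selector followed by CRLF to the listed host and port.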
An entry looks as follows:
---------------
ENTRY 2CPKE #Type Protein
TITLE c-AMP-dependent protein kinase (cAPK) (catalytic
subunit), chain E - Mus musculus (recombinant
mouse) #EC-number 2.7.1.37
DATE 19-Feb-1993 #Sequence 19-Feb-1993 #Text 31-Mar-1993
PLACEMENT 0.0 0.0 0.0 0.0 0.0
COMMENT PDB code: 2CPK
SOURCE Mus musculus #Common-name house mouse
COMMENT Note: "alpha" isoenzyme expressed in (escherichia
coli)
REFERENCE
#Authors Knighton D.R., Zheng J., Ten Eyck L.F., Ashford
V.A., Xuong N.H., Taylor S.S., Sowadski J.M.
#Citation coordinates deposited in Brookhaven National
Laboratory's Protein Data Bank
REFERENCE
#Authors Knighton D.R., Zheng J., Ten Eyck L.F., Ashford
V.A., Xuong N.H., Taylor S.S., Sowadski J.M.
#Journal Science (1991) 253:407
#Title Crystal structure of the catalytic subunit of cyclic
adenosine monophosphate-Dependent protein kinase.
REFERENCE
#Authors Knighton D.R., Zheng J., Ten Eyck L.F., Xuong N.H.,
Taylor S.S., Sowadski J.M.
#Journal Science (1991) 253:414
#Title Structure of a peptide inhibitor bound to the
catalytic subunit of cyclic adenosine
monophosphate-Dependent protein kinase.
REFERENCE
#Authors Slice L.W., Taylor S.S.
#Journal J. Biol. Chem. (1989) 264:20940
#Title Expression of the catalytic subunit of
cAMP-dependent protein kinase in escherichia coli.
COMMENT Resolution: 2.7 angstroms
COMMENT R-value: 0.18
COMMENT Determination: X-ray diffraction
KEYWORDS Transferase(phosphotransferase)
FEATURE
2-17 #Region helix (right hand alpha)\
26-28 #Region helix (right hand 3-10) (not
noted in ref 1)\
62-67 #Region helix (right hand alpha)\
71-83 #Region helix (right hand alpha)\
114-121 #Region helix (right hand alpha)\
126-145 #Region helix (right hand alpha)\
155-157 #Region helix (right hand 3-10) (not
noted in ref 1)\
188-190 #Region helix (right hand 3-10) (not
noted in ref 1)\
193-196 #Region helix (right hand alpha) (not
noted in ref 1)\
204-219 #Region helix (right hand alpha)\
229-238 #Region helix (right hand alpha)\
249-258 #Region helix (right hand alpha)\
263-265 #Region helix (right hand 3-10) (not
noted in ref 1)\
275-278 #Region helix (right hand alpha)\
281-285 #Region helix (right hand 3-10) (not
noted in ref 1)\
288-292 #Region helix (right hand alpha)\
29-37,41-48,53-61,
101-107,92-97 #Region beta sheet\
148-149,158-160 #Region beta sheet\
166-168,175-176 #Region beta sheet
SUMMARY #Molecular-weight 39110 #Length 336 #Checksum 7934
SEQUENCE
5 10 15 20 25 30
1 V K E F L A K A K E D F L K K W E T P S Q N T A Q L D Q F D
31 R I K T L G T G S F G R V M L V K H K E S G N H Y A M K I L
61 D K Q K V V K L K Q I E H T L N E K R I L Q A V N F P F L V
91 K L E F S F K D N S N L Y M V M E Y V A G G E M F S H L R R
121 I G R F S E P H A R F Y A A Q I V L T F E Y L H S L D L I Y
151 R D L K P E N L L I D Q Q G Y I Q V T D F G F A K R V K G R
181 T W T L C G T P E Y L A P E I I L S K G Y N K A V D W W A L
211 G V L I Y E M A A G Y P P F F A D Q P I Q I Y E K I V S G K
241 V R F P S H F S S D L K D L L R N L L Q V D L T K R F G N L
271 K N G V N D I K N H K W F A T T D W I A I Y Q R K V E A P F
301 I P K F K G P G D T S N F D D Y E E E E I R V S I N E K C G
331 K E F T E F
---------------
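The SEQUENCE block above is laid out for reading, with a ruler line and a running position at the start of each row, so it has to be collapsed before it can be fed to a search program. A minimal Python sketch of that step, assuming the rows look exactly like the ones shown in this entry (the function name is just for illustration):

```python
def parse_sequence_block(lines):
    """Collapse NRL_3D-style SEQUENCE rows into one residue string.

    Each residue row starts with a running position (1, 31, 61, ...)
    followed by space-separated one-letter codes. The ruler row
    (5 10 15 ...) holds only numbers, so it contributes no residues.
    """
    residues = []
    for line in lines:
        tokens = line.split()
        if len(tokens) < 2 or not tokens[0].isdigit():
            continue  # blank lines, the SEQUENCE keyword, separators
        residues.extend(t for t in tokens[1:] if len(t) == 1 and t.isalpha())
    return "".join(residues)


# The ruler and the last two rows of the 2CPKE entry shown above
rows = [
    "5 10 15 20 25 30",
    "301 I P K F K G P G D T S N F D D Y E E E E I R V S I N E K C G",
    "331 K E F T E F",
]
print(parse_sequence_block(rows))
```

Run over all twelve rows of the entry, the result should have the 336 residues given on the SUMMARY line, which makes a handy sanity check.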
Thus you have the protein sequence. The DNA sequence is a bit harder -
you would need to run this sequence through the PIR or Genbank (genpept)
Fasta or Blast e-mail servers to find a match - and then use gopher to pull out
the full entries found with the Fasta/Blast searches.
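Search programs generally want the query as a plain sequence record rather than a database entry. Below is a small Python sketch that writes a one-letter sequence out in FASTA format (a ">" header line followed by fixed-width sequence lines). The exact message layout each e-mail server expects is not covered here, so check the server's own help text before mailing anything; the function name and the 60-column width are my own choices.

```python
def to_fasta(identifier, description, sequence, width=60):
    """Format a one-letter sequence as a FASTA record."""
    header = f">{identifier} {description}".rstrip()
    # Break the sequence into lines of at most `width` residues
    body = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + body) + "\n"


# Example with the last 36 residues of the 2CPKE entry
record = to_fasta(
    "2CPKE",
    "c-AMP-dependent protein kinase, chain E (C-terminal fragment)",
    "IPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFTEF",
)
print(record, end="")
```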
If you've never heard of gopher, write me a note and I'll send you some
information to get you started.
Best of luck,
Dan Jacobson
danj at welchgate.welch.jhu.edu
Johns Hopkins University