The following is a table of recognition sites for some common restriction
enzymes in the Arabidopsis genome. The table was compiled using Jamie
Cuticchia's computer program and data I obtained through searches. John
McDowell made the table itself from the computer printouts of the data.
     ENZYME         RECOGNITION SITE   FREQUENCY   AVE SIZE (BP)
 1   Apa I          GGGCCC             0.0026      38,460
 2   Xma III        CGGCCG             0.0053      18,870
 3   Sma I          CCCGGG             0.0066      15,150
 4   Sac II         CCGCGG             0.0092      10,860
 5   Kpn I          GGTACC             0.0118       8,470
 6   Xho I          CTCGAG             0.0144       6,940
 7   Bam HI         GGATCC             0.0144       6,940
 8   Xba I          TCTAGA             0.0158       6,330
 9   Sal I/HincII   GTCGAC             0.0158       6,320
10   Spe I          ACTAGT             0.0197       5,070
11   Sac I          GAGCTC             0.0249       4,016
12   Pst I          CTGCAG             0.0289       3,460
13   Eco RV         GATATC             0.0302       3,310
14   Eco RI         GAATTC             0.0368       2,590
15   Cla I          ATCGAT             0.0394       2,530
16   Hind III       AAGCTT             0.0617       1,620
17   AhaIII/DraI    TTTAAA             0.0703       1,422
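
For anyone wanting to reuse these numbers: the AVE SIZE column looks like 100
divided by FREQUENCY, which would mean FREQUENCY is a percentage (sites per
100 bp). That is an inference from the numbers, not something stated on the
printouts, and most rows agree with it only to within rounding. A minimal
Python sketch under that assumption (the function name is made up for
illustration):

```python
def average_fragment_size(frequency_percent):
    """Average spacing between sites in bp.

    ASSUMPTION: frequency_percent is sites per 100 bp, as the table's
    numbers suggest; the source does not state the unit.
    """
    if frequency_percent <= 0:
        raise ValueError("frequency must be positive")
    return 100.0 / frequency_percent


# Frequencies copied from the table above.
for enzyme, freq in [("Apa I", 0.0026), ("Pst I", 0.0289), ("AhaIII/DraI", 0.0703)]:
    print(f"{enzyme}: about {average_fragment_size(freq):,.0f} bp")
```
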
If you have any questions, my email address is
DEANRE%gandal.dnet at SERVER.uga.edu
The table was put together using the known Arabidopsis sequences found in
GenBank and EMBL. The computer program (which fits a Markov chain) takes
these sequences and tallies trinucleotide, tetranucleotide, and
hexanucleotide counts (comparing them with random expectation). John took
the hexanucleotide counts, looked for common restriction sites, and put
them in table form.
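
To give a feel for the kind of counting involved, here is a rough Python
sketch. It is not Jamie's program: the first-order Markov chain, the function
names, and the toy sequence are all assumptions made for illustration, and the
real program may fit a different order of chain.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov1_expected(seq, word):
    """Expected count of word under a first-order Markov chain fitted to seq.

    ASSUMPTION: first-order chain; the original program's order is unknown.
    """
    seq = seq.upper()
    n = len(seq)
    if n < len(word):
        return 0.0
    mono = kmer_counts(seq, 1)
    di = kmer_counts(seq, 2)
    # Probability of the first base from single-base frequencies.
    prob = mono[word[0]] / n
    for a, b in zip(word, word[1:]):
        # Transition P(b | a) = count(ab) / count of dinucleotides starting with a.
        starts = sum(di[a + x] for x in "ACGT")
        if starts == 0:
            return 0.0
        prob *= di[a + b] / starts
    # Multiply by the number of positions where the word could start.
    return prob * (n - len(word) + 1)


# Toy sequence, made up for illustration; real input would be database entries.
toy = "GAATTCAAGCTTGAATTCTTTAAAGGATCCAAGCTT"
hexamers = kmer_counts(toy, 6)
for site in ("GAATTC", "AAGCTT", "GGGCCC"):
    print(site, hexamers[site], round(markov1_expected(toy, site), 3))
```

Sites seen more or less often than the chain predicts are the ones worth a
second look; the table above reports only the observed side.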
Best of luck,
Rob