Frequency of BstE II cutting?

Chris Boyd chrisb at
Tue Jul 9 05:54:03 EST 1996

Mikhail Alexeyev (malexeyev at wrote:
: In article <4rasu5$mb6 at>, chrisb at (Chris
: Boyd) wrote:

: > Mikhail Alexeyev (malexeyev at wrote:
:  (snip)

: > : Very true, but the original post implied to DISREGARD these considerations
: > : by asking to compare frequency of BstEII cutting with that of a generic
: > : six-cutter (1 in 4096 disregarding sequence distribution considerations),
: > : didn't it? 
: > 
: > Mikhail, I realised that, but I thought it would be useful to give some
: > extra relevant information.  In particular, I took the original
: > poster's caveat about sequence distributions to refer to the substrate
: > DNA, not the recognition sequence.  I was pointing out (inter alia)
: > that you have to take account of the recognition sequence also, even if
: > the substrate DNA is constant and of random sequence.  In such a
: > substrate, for example, BssHII (GCGCGC) sites will occur more
: > frequently than BstEII sites.
: > 
: Chris, I must disagree.  
: First, BOTH you and Bill refer to substrate DNA. Let me quote you:

: > : > In reality, however, the occurrence
: > : > frequency of any given query sequence is markedly affected by the base
: > : > composition and sequence microstructure (CpG islands etc.) of the
: > : > target DNA.

: Second, although I am not sure what you mean by constant DNA, from

By `constant' I meant the same sequence.

: everything mentioned in this thread so far and from a quick glance at the two
: articles you are referring to I do not see why BstEII would not cut as
: frequently as BssHII in a RANDOM sequence. The success of Markov chain analysis
: (2nd order and up) lies in the fact that it takes advantage of the
: experimentally determined nonrandomness of nucleotide distribution. If the
: sequence is really random, this correction (I think) will not be
: necessary. 

OK, it's a poor example.  I was referring simply to the occurrence of
sequences like ...GCGCGCGC... in a sufficiently large random sequence.
That's two overlapping BssHII sites in 8 nucleotides.  You can _never_
get BstEII (GGTNACC) sites that overlap in this way, hence my
assertion.  It's easy to pick other pairs of enzyme recognition
sequences where this effect would be more pronounced.  Practically, of
course, this is usually of no significance.
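
The overlap argument can be checked mechanically.  What follows is only
a minimal Python sketch, not anything from the articles cited in this
thread.  It assumes the recognition sequences GCGCGC (BssHII) and
GGTNACC (BstEII, with N meaning any base); the function names
self_overlap_shifts and count_sites are made up for illustration.

```python
# Illustrative sketch only; names and structure are assumptions.
# IUPAC symbols needed for these two sites: the four bases plus N (any base).
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "N": {"A", "C", "G", "T"},
}

def compatible(x, y):
    """True if at least one concrete base satisfies both IUPAC symbols."""
    return bool(IUPAC[x] & IUPAC[y])

def self_overlap_shifts(site):
    """Return the shifts k (0 < k < len(site)) at which a second copy of
    `site`, starting k bases downstream, can coexist with the first."""
    n = len(site)
    return [k for k in range(1, n)
            if all(compatible(site[i + k], site[i]) for i in range(n - k))]

def count_sites(site, seq):
    """Count all matches of `site` in `seq`, including overlapping ones."""
    n = len(site)
    return sum(
        all(seq[j + i] in IUPAC[site[i]] for i in range(n))
        for j in range(len(seq) - n + 1)
    )

BSSHII = "GCGCGC"   # from the post
BSTEII = "GGTNACC"  # assumed recognition sequence, N = any base

print(self_overlap_shifts(BSSHII))       # [2, 4]
print(self_overlap_shifts(BSTEII))       # []
print(count_sites(BSSHII, "GCGCGCGC"))   # 2
```

An empty list for BstEII means that no two of its sites can share
bases, whereas BssHII sites can start 2 or 4 bases apart.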

:  In general, I believe that the composition of the recognition sequence is
: secondary to genome structure and may be ignored.

Agreed.  Sorry for being so pedantic...

Best wishes,
Chris Boyd                       | from, | MRC Human Genetics Unit
chrisb at                        |  not  | Western General Hospital
                                 |  for  | Edinburgh EH4 2XU, SCOTLAND
