Dr. Duncan Clark <Duncan at nospam.demon.co.uk> wrote in message
news:DwEuJhAPdvg4EA7h at genesys.demon.co.uk...
>> OK so what we still need is a simple way to calculate errors and not in
> the linear fashion that Stratagene erroneously published. I think that
> many moons ago Boehringer published a table in one of their Biochemica's
> (shortly after the launch of Expand) showing the percentage of products
> with errors for Taq and Expand after 15, 20, 25 and 30 cycles and for
> products from 100bp to over 1000bp.
>> There is also another Boehringer Biochemica (April 95, pp 34-35) that
> calculates error rates in a lacI based system for various polymerases in
> PCR. Basically they used a lacI containing pUC19 (white) and PCR'd the
> whole plasmid. Any errors introduced in lacI that inactivate lacI will
> give blue or pale blue colonies upon ligation and transformation of the
> amplified plasmid into an appropriate host.
>> The error rate per bp was calculated with a re-arranged equation from
> Keohavong and Thilly (PNAS 86, 9253).
>> f = -ln(F) / (d x b)   [b in bp]
>> where F is the fraction of white colonies
>> d is the number of duplications
>> 2^d = output DNA / input DNA
>> b is the effective target size of the (1080bp) lacI gene i.e. 349 bp
>> There are 349 phenotypically identified single base substitutions
> (nonsense and mis-sense) at 179 codons (approx 50% of coding region)
> within the lacI gene.
>> From this one can calculate the error rate which for Taq comes out at
> around 2.6 x 10^-5 and for Pwo (Pfu) 3.2 x 10^-6
>> So what's the point. Presumably with more rearrangement, assuming 20
> duplications, a 3kb target size and the above error rates one can
> calculate the % fraction with errors. I'll leave someone else to do the
Thank you for the references; I'll look them up. The points for me are,
first, that I would like to have at least some idea about the source and
propagation of errors in a sequencing project (involving PCR and cycle
sequencing steps). Second, I'm just curious. Third, maths is one of my
hobbies.

The total number of wrongly copied bases in a long DNA molecule can be
regarded as "a number of events in a space or time dimension". It is thus
very probable that the errors are Poisson distributed. For a Poisson
distribution:

  P(k) = exp(-n*p) * (n*p)^k / k!

where in our case P(k) = the chance of k errors in the whole molecule,
n = number of bases, and p = error rate per base (your f). For k=0 this
leads to:

  P(0) = exp(-number of base pairs * error rate per base pair)

since (n*p)^0 = 1 and 0! = 1. P(0) is the fraction of white (error-free)
colonies; your F. Taking the natural log, rearranging, and dividing by the
number of doublings (to get the error rate per doubling) leads to the
formula you gave. In my posting of 12-jan I proposed to calculate P(0)
(your F) directly from chance theory with:

  (1 - p_wrong)^(number of base pairs)

For a 3000 bp target and an error rate of 2.6 x 10^-5:

  exp(-2.6 x 10^-5 x 3000)  = 0.924964   Poisson
  (1 - 2.6 x 10^-5)^3000    = 0.924963   Direct
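
For anyone who wants to check the two numbers, here is a minimal Python
sketch of the comparison. The function names are my own, made up for
illustration; the only inputs are the 3000 bp target and the 2.6 x 10^-5
error rate used above.

```python
import math

def p_zero_poisson(n_bases, p_error):
    """Chance of zero errors under the Poisson approximation: exp(-n*p)."""
    return math.exp(-n_bases * p_error)

def p_zero_direct(n_bases, p_error):
    """Chance of zero errors computed directly: (1 - p)^n."""
    return (1.0 - p_error) ** n_bases

n, p = 3000, 2.6e-5
print(f"Poisson: {p_zero_poisson(n, p):.6f}")  # 0.924964
print(f"Direct:  {p_zero_direct(n, p):.6f}")   # 0.924963
```

The two agree to about one part in a million here because n*p is small;
the gap would widen for much larger per-base error rates.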
This makes me feel a bit less sliced.
Which makes me happy.
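
Coming back to the question in the quoted post (20 duplications, a 3 kb
target, and the error rates given there), the rearrangement can be sketched
as below. This is only a rough sketch under my own assumptions: errors are
Poisson distributed, every base of the 3 kb counts as target (b = 3000,
unlike the 349 bp effective target of the lacI assay), and the function
names are invented for illustration.

```python
import math

def error_rate_per_bp(white_fraction, duplications, target_bp):
    """f = -ln(F) / (d x b), the rearranged formula from the quoted post."""
    return -math.log(white_fraction) / (duplications * target_bp)

def fraction_with_errors(error_rate, duplications, target_bp):
    """1 - F, with F = exp(-f x d x b): products with at least one error."""
    return 1.0 - math.exp(-error_rate * duplications * target_bp)

# Error rates per bp per duplication, as quoted above.
for name, f in (("Taq", 2.6e-5), ("Pwo", 3.2e-6)):
    frac = fraction_with_errors(f, duplications=20, target_bp=3000)
    print(f"{name}: {100 * frac:.0f}% of products with at least one error")
```

Under these assumptions that comes to roughly 79% for Taq and 17% for Pwo.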