Formula to determine coverage

Tom Knight tk at
Sun Nov 26 16:16:27 EST 2000

> Quite a while ago I remember seeing a formula that determines the minimum
> number of sub clones required of a certain size that should, within a
> certain confidence limit, give complete coverage of an insert of given
> length. It involved natural logs and things like that.

To get a desired probability P of having a given section of your genome
in the library, where the library consists of clones each containing a
fraction f of the original genome, you need a number of clones

       ln (1-P)
N =  ------------
       ln (1-f)
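
A sketch of where this comes from, assuming each clone is an independent
random draw from the genome and the section of interest is small compared
with a clone (neither assumption is stated above):

```latex
\begin{align*}
\Pr[\text{section absent from one clone}] &= 1 - f \\
\Pr[\text{section absent from all } N \text{ clones}] &= (1 - f)^N \\
1 - P = (1 - f)^N \quad\Longrightarrow\quad N &= \frac{\ln(1 - P)}{\ln(1 - f)}
\end{align*}
```

Both logarithms are negative, so N comes out positive.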

So, to get a 99% probability (a common choice):

                    ln (1-.99)
N =  ------------------------------------------------
       ln (1 - (<size of insert> / <size of genome>))
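
For anyone who would rather compute this than do it by hand, here is a
minimal Python sketch of the formula above. The function name, the
parameter names, and the 40 kb / 4.6 Mb example sizes are my own
illustration, not part of the formula; the result is rounded up because
you cannot pick a fractional clone.

```python
import math


def clones_needed(p, insert_size, genome_size):
    """Number of clones N = ln(1 - p) / ln(1 - f), rounded up.

    p           -- desired probability that a given section is in the library
    insert_size -- size of each clone's insert (same units as genome_size)
    genome_size -- size of the genome (or of the parent insert being subcloned)
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    f = insert_size / genome_size  # fraction of the genome in one clone
    if not 0.0 < f < 1.0:
        raise ValueError("insert_size must be smaller than genome_size")
    # math.log1p(-x) evaluates ln(1 - x) accurately even when x is tiny,
    # which matters for small inserts in large genomes.
    return math.ceil(math.log1p(-p) / math.log1p(-f))


if __name__ == "__main__":
    # Hypothetical example: 40 kb inserts, 4.6 Mb genome, 99% probability.
    print(clones_needed(0.99, 40_000, 4_600_000))
```

Raising P from 0.99 to 0.999 multiplies N by 1.5, since ln(0.001) is
1.5 times ln(0.01).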


