Does anyone know what exactly sequence divergence estimates mean in
terms of base pair mismatch? For instance, does a figure of 12%
divergence between the same stretch of DNA in two bacterial strains mean
that 12 percent of the bases at identical positions in the two sequences
are different? By this criterion, two completely unrelated sequences
should display a theoretical maximum of 75% divergence, if the sequences
were long enough (with 4 possible bases, mismatches would occur in 3 out
of 4 cases).
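
To make the question concrete, here is a minimal Python sketch of the
counting I have in mind. It assumes the two sequences are already aligned,
have equal length and contain no gaps, and that all four bases are equally
likely in the random case. The function names are my own, not taken from
any particular package, and I am not claiming this is how published
estimates are actually computed.

```python
import random


def fraction_mismatched(seq_a, seq_b):
    """Return the fraction of aligned positions at which the bases differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def random_sequence(length, rng):
    """Build a sequence with each base drawn uniformly from A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(length))


# One mismatch in four positions gives 25% divergence under this criterion.
print(fraction_mismatched("ACGT", "ACGA"))

# Two long, independently generated sequences should differ at roughly
# three positions out of four (assumption: equal base frequencies).
rng = random.Random(0)
unrelated = fraction_mismatched(
    random_sequence(100_000, rng), random_sequence(100_000, rng)
)
print(f"{unrelated:.3f}")
```
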
Is this the method used to generate similarity estimates? Any assistance
with this would be welcome.
--
Paul D. Roughan