all internal repeats with mutation

CSHL wchang at phage.cshl.org
Thu May 20 11:15:22 EST 1993


In article <1te1ueINN7vr at phage.cshl.org> wli at phage.cshl.org ( CSHL) writes:
>
>
>thanks for all the postings explaining the finding-all-internal-
>repeats method. so now i know  that in principle, it is a trivial 
>problem (solution: overlapping all its suffix, or, sorting all 
>suffix).
>
>as a follow up question, what about (1) finding all internal
>repeats with at most one mutation?  (2) finding all internal
>repeats with at most a fixed percentage mutation rate (say,
>1 out of 6)? the difference between (1) and (2) is simply
>that in (2) one can allow more number of mutations for
>repeats of longer size. 
>
>i can see that in the "sorting of all suffix method",
>if the mutation occurs at the first site of the suffix,
>it's going to be a problem. it seems OK if the mutation
>occurs in other sites. what's the solution to this? again,
>apology if it's a trivial question....
>
>w.li, CSHL, wli at cshl.org
>
>

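Just do a Smith-Waterman type local alignment of the sequence against
itself, i.e. dynamic programming with appropriate scores (e.g. +1 for a
match, -k for a mismatch, -infinity for gaps if you don't want gaps) and
the "zero-trick" (any cell whose score would go negative is reset to
zero).  Then, any non-zero entry in the DP table off the main diagonal
marks the end of a repeat of type (2): a positive score means the matches
outnumber the mismatches by more than k to 1, so k = 5 corresponds to a
mutation rate below 1 in 6.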
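
For illustration only, here is a minimal Python sketch of the gapless
case, assuming the sequence is aligned against itself and only cells
with j > i are filled, so the trivial self-match on the main diagonal is
skipped.  The function names and the O(n^2) table are my own choices for
the example, not an optimized method; the two occurrences it reports may
overlap.

```python
def self_alignment_table(seq, k):
    """Gapless local self-alignment scores: +1 match, -k mismatch.

    H[i][j] (1-based, j > i) is the best score of a gapless local
    alignment ending at seq[i-1] and seq[j-1], floored at zero.
    """
    n = len(seq)
    H = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):  # j > i skips the main diagonal
            s = 1 if seq[i - 1] == seq[j - 1] else -k
            # The "zero-trick": never let a cell go negative.
            H[i][j] = max(0, H[i - 1][j - 1] + s)
    return H


def best_repeat(seq, k):
    """Return (score, start_i, start_j, length) of the top-scoring repeat.

    Starts are 0-based.  Returns None if no two positions match.
    """
    n = len(seq)
    H = self_alignment_table(seq, k)
    cells = [(H[i][j], i, j)
             for i in range(1, n + 1) for j in range(i + 1, n + 1)]
    if not cells:
        return None
    score, i, j = max(cells)
    if score == 0:
        return None
    # Walk back along the diagonal until the score drops to zero.
    length = 0
    while i > 0 and H[i][j] > 0:
        length += 1
        i -= 1
        j -= 1
    return (score, i, j, length)


if __name__ == "__main__":
    print(best_repeat("GATTACAxxGATTACA", 5))
```

Since gaps are forbidden, each diagonal of the table is independent of
the others, so the same scores can be computed one diagonal at a time
with O(n) memory instead of the full table.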

There are optimizations one can do, depending on the specification...

-- Bill Chang (wchang at cshl.org)  Cold Spring Harbor Lab., NY
