all internal repeats with mutation
wchang at phage.cshl.org
Thu May 20 11:15:22 EST 1993
In article <1te1ueINN7vr at phage.cshl.org> wli at phage.cshl.org ( CSHL) writes:
>thanks for all the postings explaining the finding-all-internal-
>repeats method. so now i know that in principle, it is a trivial
>problem (solution: overlapping all its suffix, or, sorting all
>as a follow up question, what about (1) finding all internal
>repeats with at most one mutation? (2) finding all internal
>repeats with at most a fixed percentage mutation rate (say,
>1 out of 6)? the difference between (1) and (2) is simply
>that in (2) one can allow more number of mutations for
>repeats of longer size.
>i can see that in the "sorting of all suffix method",
>if the mutation occurs at the first site of the suffix,
>it's going to be a problem. it seems OK if the mutation
>occurs in other sites. what's the solution to this? again,
>apology if it's a trivial question....
>w.li, CSHL, wli at cshl.org
Just do a Smith-Waterman type local alignment of the sequence against
itself, i.e. dynamic programming with appropriate scores (e.g. +1 for a
match, -k for a mismatch, -infinity for gaps if you don't want gaps) and
the "zero-trick" (reset any negative entry to zero). Then, any non-zero
entry off the main diagonal of the DP table marks the end of a repeat of
type (2).
There are optimizations one can do, depending on the specification...
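
For concreteness, here is a minimal sketch of the gapless case in Python. It is only an illustration under stated assumptions, not a tuned implementation: the function name, the parameter names and the min_score cutoff are made up for this example, and the sequence is simply compared with itself along each off-diagonal. With gaps forbidden, each DP entry depends only on its upper-left neighbour, so no full table is needed.

```python
def gapless_self_alignment(seq, mismatch_penalty=5, min_score=4):
    """Hypothetical helper: gapless local alignment of seq against itself.

    Returns a list of (score, start_i, start_j, length) tuples, one for
    every DP entry whose score is at least min_score.
    """
    n = len(seq)
    hits = []
    # Fixed offset d = j - i > 0: upper triangle only, which skips the
    # trivial main diagonal and the mirrored duplicates below it.
    for d in range(1, n):
        score = 0
        length = 0
        for i in range(n - d):
            j = i + d
            step = 1 if seq[i] == seq[j] else -mismatch_penalty
            score = max(0, score + step)  # the "zero-trick"
            # The alignment ending here starts just after the last reset to zero.
            length = length + 1 if score > 0 else 0
            if score >= min_score:
                hits.append((score, i - length + 1, j - length + 1, length))
    return hits


# One exact repeat of length 4, at offsets 0 and 4.
print(gapless_self_alignment("ACGTACGT"))
```

On the choice of k: with +1/-k scoring, a positive score needs more than k matches per mismatch, so fewer than 1 position in k+1 is mutated; k = 5 corresponds to staying under the 1-out-of-6 rate. The sketch takes O(n^2) time and reports every qualifying entry, so overlapping hits along one diagonal would still need to be merged.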
-- Bill Chang (wchang at cshl.org) Cold Spring Harbor Lab., NY