Internal repeats Q & A
T. Mark Reboul
MARK at CUCCFA.CCC.COLUMBIA.EDU
Fri May 14 06:51:22 EST 1993
Regarding Gaston Gonnet's solution to Wentian Li's internal repeats problem....
Beautiful, not to mention elegant, solution!
However, I wonder if that solution remains "trivial" in practice for
substantially longer sequences, say, sequences with thousands of elements.
Then there is a question of which sorting algorithm to use, how much
computing will be required, and how much dynamic memory is needed for
For one of these larger repeat-finding exercises, is it not possible that
the computing resources required (CPU + memory) can be lessened by exploiting
the specific goal (i.e., that an identification of internal repeats is all
that is sought).
I'm no expert, but there seems to have been a lot of work done on "suffix
trees" and their efficient construction, in the pure c.s. field, related
to lexicographic ordering/sorting problems, and perhaps some of those
algorithms might become useful for handling longer sequences.
Columbia-Presbyterian Cancer Center Computing Facility
mark at cuccfa.ccc.columbia.edu
More information about the Bio-soft