Internal repeats Q & A

Fri May 14 06:51:22 EST 1993

Regarding Gaston Gonnet's solution to Wentian Li's internal repeats problem....

Beautiful, not to mention elegant, solution!

However, I wonder if that solution remains "trivial" in practice for 
substantially longer sequences, say, sequences with thousands of elements.
Then there is a question of which sorting algorithm to use, how much 
computing will be required, and how much dynamic memory is needed for 
temporary data.

For one of these larger repeat-finding exercises, is it not possible that 
the computing resources required (CPU + memory) can be lessened by exploiting
the specific goal (i.e., that an identification of internal repeats is all
that is sought).

I'm no expert, but there seems to have been a lot of work done on "suffix
trees" and their efficient construction, in the pure c.s. field, related 
to lexicographic ordering/sorting problems, and perhaps some of those 
algorithms might become useful for handling longer sequences.

Mark Reboul
Columbia University
Columbia-Presbyterian Cancer Center Computing Facility
mark at

More information about the Bio-soft mailing list