Landau-Vishkin k-mismatch string matching

Jacob Engelbrecht engel at biobase.aau.dk
Thu Aug 26 04:18:13 EST 1993


In <142946 at netnews.upenn.edu> tisdall at amalthea.humgen.upenn.edu (James Tisdall) writes:

>For a popular approx. string matching utility, try "agrep".  It is line
>oriented - it won't find matches across newlines.  It is available
>on many sites - use "archie agrep" or a gopher "veronica" search.
>It comes from the University of Arizona.

It is true that agrep covers part of what is needed for k-mismatch
string search, but its source code is not well suited to working with
large DNA strings (~1,000,000 characters), where you want the actual
position of the match rather than the whole sequence.

I am interested in a function, not a whole program, that returns the
position of a match in a string while allowing k mismatches (and
possibly regular expressions and indels).  I have not been able to
extract this part from agrep, because the actual search is intertwined
with file reading and so on (agrep is, by the way, a great substitute
for grep: better and faster).
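
To make the request concrete, here is a minimal sketch of the kind of
function I mean.  It is NOT the Landau-Vishkin algorithm and is not taken
from agrep; it is a plain O(n*m) scan with early exit, handling
mismatches only (no indels, no regular expressions).  The name
find_k_mismatch and its signature are hypothetical, chosen only for
illustration.

```python
def find_k_mismatch(text, pattern, k):
    """Return 0-based start positions where pattern matches text
    with at most k mismatches (substitutions only, no indels).

    Naive sketch: compares the pattern at every offset and gives up
    on an offset as soon as the mismatch count exceeds k.
    """
    n, m = len(text), len(pattern)
    positions = []
    if m == 0 or m > n or k < 0:
        return positions
    for start in range(n - m + 1):
        mismatches = 0
        for offset in range(m):
            if text[start + offset] != pattern[offset]:
                mismatches += 1
                if mismatches > k:
                    break  # too many mismatches at this offset
        else:
            # Inner loop finished without break: at most k mismatches.
            positions.append(start)
    return positions
```

For example, find_k_mismatch("ACGTACGTTACG", "ACGA", 1) reports the
positions of the approximate hits and nothing else, which is the
interface I am after.  For a string of ~1,000,000 characters, a faster
method would presumably replace the inner comparison loop while keeping
the same interface.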

Jacob Engelbrecht



