Sources available for interesting high-speed finite-state machines
behoffski at grouse.com.au
Thu Nov 20 14:46:26 EST 1997
I've published a new finite-state machine technique in the
November issue of Dr.Dobb's Journal that you might be
interested in. The machine architecture combines
table-driven finite-state machines with threaded code to
provide extremely high performance. This architecture can
be applied in many, many problem areas, and, where it can
be reasonably applied, can provide double or triple the
performance of contemporary code.
For example, the article presents three applications of the
- a word/line count utility, with performance roughly
double that of the typical "C" implmentation,
- a basic regular expression search utility, called
Grouse Grep. This utility does a better job of
handling simple string searches, case-insensitive
matching and class matching than any other
utility I've seen. Some iterative searches,
such as x.*y.*z, run in nearly linear time due
to inter-table optimisations. While the utility
provides state-of-the-art performance in some
areas for nondeterministic FSAs, it cannot
always outperform deterministic FSAs. Sadly,
the demonstration version included in the
article cannot handle more than basic RE
- A very fast static Huffman decoder, which
can decompress a Huffman-decoded bitstream
for just 14 cycles per output byte on a
Pentium. This time does not include the time
required to set up the decoding tables (this
time is quite modest, typically of the order of
300k cycles). On a Pentium 150, that's 11
million output bytes a second (of course, this
ignores the effort needed to read in the bitstream
and to deal with the decompressed output).
The article presents code based on Intel 80x86 assembly.
Even worse, the applications are written to run in real mode
under MS-DOS. Don't let these limitations fool you -- the
code is written almost entirely in ANSI C and is intended to
be easily portable to Unix, VMS etc. The assembly technique
works very nicely on the Intel chip, but is completely general
and can be ported (with varying degrees of efficiency, sigh)
to other architectures.
I've published complete source code along with the articles.
The code can be run entirely as a C program, although the
performance is far less than using the assembly versions
of the FSA engine.
You can find the sources both on Dr. Dobbs' web site
and on my company's web site, via ftp
(files ggrep101.zip, ghuff100.zip and wc102.zip):
There are many, many other areas where this technique
could be applied. For example, it may provide a faster
lexical analyser, and might facilitate very fast
table-driven compilers. Virus scanners might be able to
do a better job of scanning images. DNA sequence searches
might be able to search 4 bases packed in a byte for a
cost of around 20 cycles per byte.
Compiler writers will probably be able integrate my ideas
into their treatment of multi-way branch cases (e.g.
switch ()...), leading to a substantial improvement in
performance for a modest increase in code size.
The key to the architecture is that you can define your
own state tables, and define 20 or 30 actions that name
how you react to incoming bytes in each state. This
allows extremely sophisticated classification, merely by
table lookup and code threading, of the incoming bytes.
Best of all, the code is not merely freeware: I've placed
any ideas of the architecture that I've invented into the
public domain. In return, I would appreciate return
donations of improved code and/or new applications
where you extend my code or find applications in
new areas. And if you really like my ideas and
would like to see more public-domain software from
me (there's lots more that I could do!), consider sending
a donation to my company, Grouse Software.
Brenton Hoff (behoffski) | Software Engineer, Grouse Software Pty Ltd
behoffski at grouse.com.au | http://www.grouse.com.au/
More information about the Bioforum