In this paper
we present a new string-pattern matching algorithm that partitions the
text into segments of the input pattern length and searches for pattern
occurrences using a simple hashing scheme. Unlike the well-known
Boyer-Moore algorithm and its variants, our algorithm does not compute
variable shift lengths, thus providing a conceptually simpler way
to search for patterns. Empirical evaluation shows that our algorithm
runs significantly faster than Sunday's and Horspool's extensions of
the Boyer-Moore algorithm. The notion of non-occurrence heuristic
used in our algorithm, together with a text partitioning scheme,
leads to a simplified scheme for searching for pattern occurrences,
thus yielding better runtime performance.
Sources:
The LaTeX version of the article;
this is a Unix file that contains the LaTeX source;
note that you must be a subscriber (institutional or individual) to access this file.
The Postscript
version or PDF
version of the article;
note that you must be a subscriber (institutional or individual) to access
these files.
The HTML version of the article;
note that you must be a subscriber (institutional or individual) to access
this file.
The software suite accompanying the article;
this is a small Unix tar file, which includes the source code, a Makefile,
and the test files used in the article.