org.jscience.linguistics.search
Class BoyerMooreHorspool
java.lang.Object
org.jscience.linguistics.search.StringSearch
org.jscience.linguistics.search.BoyerMooreHorspool
- Direct Known Subclasses:
- BoyerMooreHorspoolRaita
public class BoyerMooreHorspool
- extends StringSearch
An implementation of Horspool's improved version of the Boyer-Moore
String searching algorithm. See "Practical fast searching in strings"
(appeared in Software - Practice & Experience, 10(6):501-506).
Unfortunately, there seems to be no on-line version of his paper.
This is the second fastest algorithm in this library for the
searchChars and searchString. Except for very
short patterns (< 5 characters), it is always faster than any other
algorithm except BoyerMooreHorspoolRaita and faster than
String.indexOf(java.lang.String) by more than 5 times for
patterns longer than 24 characters. It's searchBytes methods
are slightly faster than in the BoyerMooreHorspoolRaita algorithm.
This implementation is based on Ricardo
Baeza-Yates' implementation.
Preprocessing: O(m + ∑) time
Processing : O(mn) worst case
|
Method Summary |
java.lang.Object |
processBytes(byte[] pattern)
Returns a int array. |
java.lang.Object |
processChars(char[] pattern)
Returns a CharIntMap. |
int |
searchBytes(byte[] text,
int textStart,
int textEnd,
byte[] pattern,
java.lang.Object processed)
Returns the position in the text at which the pattern was found. |
int |
searchChars(char[] text,
int textStart,
int textEnd,
char[] pattern,
java.lang.Object processed)
Returns the index of the pattern in the text using the pre-processed
Object. |
| Methods inherited from class org.jscience.linguistics.search.StringSearch |
createCharIntMap, createCharIntMap, equals, hashCode, index, processString, searchBytes, searchBytes, searchBytes, searchBytes, searchBytes, searchChars, searchChars, searchChars, searchChars, searchChars, searchString, searchString, searchString, searchString, searchString, searchString, toString, toStringBuffer, usesNative, usesReflection |
| Methods inherited from class java.lang.Object |
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
BoyerMooreHorspool
public BoyerMooreHorspool()
- Constructor for BoyerMooreHorspool. Note that it is not required to
create multiple instances.
processBytes
public java.lang.Object processBytes(byte[] pattern)
- Returns a
int array.
- Specified by:
processBytes in class StringSearch
- Parameters:
pattern - the byte array containing the pattern, may
not be null
- Returns:
- an Object
- See Also:
StringSearch.processBytes(byte[])
processChars
public java.lang.Object processChars(char[] pattern)
- Returns a
CharIntMap.
- Specified by:
processChars in class StringSearch
- Parameters:
pattern - a char array containing the pattern, may not
be null
- Returns:
- an Object
- See Also:
StringSearch.processChars(char[])
searchBytes
public int searchBytes(byte[] text,
int textStart,
int textEnd,
byte[] pattern,
java.lang.Object processed)
- Description copied from class:
StringSearch
- Returns the position in the text at which the pattern was found. Returns
-1 if the pattern was not found.
- Specified by:
searchBytes in class StringSearch
- Parameters:
text - text the byte array containing the text, may
not be nulltextStart - at which position in the text the comparing should
starttextEnd - at which position in the text comparing should stoppattern - the pattern to search for, may not be nullprocessed - an Object as returned from StringSearch.processBytes(byte[]), may not be null
- Returns:
- the position in the text or -1 if the pattern was not found
- See Also:
StringSearch.searchBytes(byte[],
int,int,byte[],java.lang.Object)
searchChars
public int searchChars(char[] text,
int textStart,
int textEnd,
char[] pattern,
java.lang.Object processed)
- Description copied from class:
StringSearch
- Returns the index of the pattern in the text using the pre-processed
Object. Returns -1 if the pattern was not found.
- Specified by:
searchChars in class StringSearch
- Parameters:
text - the String containing the text, may not be nulltextStart - at which position in the text the comparing should
starttextEnd - at which position in the text comparing should stoppattern - the pattern to search for, may not be nullprocessed - an Object as returned from StringSearch.processChars(char[]) or StringSearch.processString(String), may
not be null
- Returns:
- the position in the text or -1 if the pattern was not found
- See Also:
StringSearch.searchChars(char[],
int,int,char[],Object)