Package org.apache.lucene.analysis.core
Class WhitespaceTokenizerFactory
- java.lang.Object
  - org.apache.lucene.analysis.AbstractAnalysisFactory
    - org.apache.lucene.analysis.TokenizerFactory
      - org.apache.lucene.analysis.core.WhitespaceTokenizerFactory
public class WhitespaceTokenizerFactory extends TokenizerFactory
Factory for WhitespaceTokenizer.

<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory" rule="unicode" maxTokenLen="256"/>
  </analyzer>
</fieldType>

Options:
- rule: either "java" for WhitespaceTokenizer or "unicode" for UnicodeWhitespaceTokenizer
- maxTokenLen: maximum token length; should be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024). It is rarely necessary to change this from the default, CharTokenizer::DEFAULT_MAX_TOKEN_LEN.
- Since:
- 3.1
- SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service).
- "whitespace"
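
The practical difference between the two rule values can be illustrated without Lucene on the classpath. The following is a minimal sketch, not the library's implementation: it assumes the "java" rule splits on code points for which Character.isWhitespace returns true and that the "unicode" rule splits on code points carrying the Unicode White_Space property. The class WhitespaceRuleSketch and its methods are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class WhitespaceRuleSketch {

    // Hypothetical helper: true for code points with the Unicode White_Space property.
    static boolean isUnicodeWhitespace(int cp) {
        return (cp >= 0x0009 && cp <= 0x000D) || cp == 0x0020 || cp == 0x0085
            || cp == 0x00A0 || cp == 0x1680 || (cp >= 0x2000 && cp <= 0x200A)
            || cp == 0x2028 || cp == 0x2029 || cp == 0x202F || cp == 0x205F
            || cp == 0x3000;
    }

    // Splits text into maximal runs of code points that are not separators.
    static List<String> split(String text, IntPredicate isSeparator) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isSeparator.test(cp)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(cp);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Assumed behaviour of rule="java": separators follow Character.isWhitespace.
    static List<String> splitJavaRule(String text) {
        return split(text, Character::isWhitespace);
    }

    // Assumed behaviour of rule="unicode": separators follow Unicode White_Space.
    static List<String> splitUnicodeRule(String text) {
        return split(text, WhitespaceRuleSketch::isUnicodeWhitespace);
    }

    public static void main(String[] args) {
        // U+00A0 (no-break space) sits between "foo" and "bar".
        String text = "foo\u00A0bar baz";
        System.out.println(splitJavaRule(text).size());    // 2 tokens
        System.out.println(splitUnicodeRule(text).size()); // 3 tokens
    }
}
```

Character.isWhitespace deliberately excludes non-breaking spaces such as U+00A0, so under the assumptions above the "java" rule keeps "foo" and "bar" joined while the "unicode" rule separates them. That is the main reason to pick rule="unicode" for text copied from web pages or word processors.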
Field Summary
Fields
Modifier and Type   Field          Description
static String       NAME           SPI name
static String       RULE_JAVA
static String       RULE_UNICODE
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructors
Constructor                                           Description
WhitespaceTokenizerFactory()                          Default ctor for compatibility with SPI
WhitespaceTokenizerFactory(Map<String,String> args)   Creates a new WhitespaceTokenizerFactory
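
As a rough illustration of how the args map passed to the second constructor relates to the documented options, the sketch below parses and validates "rule" and "maxTokenLen" using only the JDK. It is not the factory's actual code. The class FactoryArgsSketch, its methods, the fallback rule "java", the fallback length 255, and the choice to accept a value equal to the limit are all assumptions made for this example.

```java
import java.util.Map;

public class FactoryArgsSketch {

    // Documented upper limit for maxTokenLen.
    static final int MAX_TOKEN_LENGTH_LIMIT = 1024 * 1024;

    // Assumption: stand-in for the default token length; the real default
    // comes from CharTokenizer and is not stated on this page.
    static final int ASSUMED_DEFAULT_MAX_TOKEN_LEN = 255;

    // Returns the requested rule, falling back to "java" (assumed default).
    static String parseRule(Map<String, String> args) {
        String rule = args.getOrDefault("rule", "java");
        if (!rule.equals("java") && !rule.equals("unicode")) {
            throw new IllegalArgumentException("rule must be \"java\" or \"unicode\": " + rule);
        }
        return rule;
    }

    // Returns maxTokenLen, rejecting values that are not positive or exceed the limit.
    static int parseMaxTokenLen(Map<String, String> args) {
        String raw = args.get("maxTokenLen");
        int value = (raw == null) ? ASSUMED_DEFAULT_MAX_TOKEN_LEN : Integer.parseInt(raw);
        if (value <= 0 || value > MAX_TOKEN_LENGTH_LIMIT) {
            throw new IllegalArgumentException(
                "maxTokenLen must be greater than 0 and at most " + MAX_TOKEN_LENGTH_LIMIT
                    + ", got " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        // Same option values as the fieldType example in the class description.
        Map<String, String> options = Map.of("rule", "unicode", "maxTokenLen", "256");
        System.out.println(parseRule(options) + " / " + parseMaxTokenLen(options));
    }
}
```

The keys and values mirror the attributes of the tokenizer element in a Solr schema, which is why every value arrives as a String and has to be parsed before use.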
-
Method Summary
All Methods   Instance Methods   Concrete Methods
Modifier and Type   Method                             Description
Tokenizer           create(AttributeFactory factory)
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
-
NAME
public static final String NAME
SPI name
- See Also:
- Constant Field Values
-
RULE_JAVA
public static final String RULE_JAVA
- See Also:
- Constant Field Values
-
RULE_UNICODE
public static final String RULE_UNICODE
- See Also:
- Constant Field Values
-
-
Method Detail
-
create
public Tokenizer create(AttributeFactory factory)
- Specified by:
create in class TokenizerFactory
-
-