Class WeightedSpanTermExtractor
- java.lang.Object
-
- org.apache.lucene.search.highlight.WeightedSpanTermExtractor
-
public class WeightedSpanTermExtractor extends Object
Class used to extract WeightedSpanTerms from a Query based on whether Terms from the Query are contained in a supplied TokenStream.
To support additional queries that are unsupported by default, subclasses can override extract(Query, float, Map) to extract wrapped or delegate queries and extractUnknownQuery(Query, Map) to process custom leaf queries:

    // QueryWrapper and CustomTermQuery stand in for application-specific query classes.
    WeightedSpanTermExtractor extractor = new WeightedSpanTermExtractor() {
      @Override
      protected void extract(Query query, float boost, Map<String, WeightedSpanTerm> terms) throws IOException {
        if (query instanceof QueryWrapper) {
          // Unwrap and recurse so the delegate query is handled by the normal logic.
          extract(((QueryWrapper) query).getQuery(), boost, terms);
        } else {
          super.extract(query, boost, terms);
        }
      }

      @Override
      protected void extractUnknownQuery(Query query, Map<String, WeightedSpanTerm> terms) throws IOException {
        if (query instanceof CustomTermQuery) {
          // Custom leaf query: register its single term with a weight of 1.
          Term term = ((CustomTermQuery) query).getTerm();
          terms.put(term.field(), new WeightedSpanTerm(1, term.text()));
        }
      }
    };
-
-
Nested Class Summary
Nested Classes
Modifier and Type: protected static class
Class: WeightedSpanTermExtractor.PositionCheckingMap<K>
Description: This class makes sure that if both position sensitive and insensitive versions of the same term are added, the position insensitive one wins.
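The "position insensitive one wins" rule can be modelled with plain collections. The following is a minimal sketch, not the library source: TermStub and InsensitiveWinsMap are hypothetical stand-ins for WeightedSpanTerm and PositionCheckingMap, and the sketch assumes the rule is enforced at insertion time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for WeightedSpanTerm: only the flag that matters here.
class TermStub {
    final String text;
    boolean positionSensitive;

    TermStub(String text, boolean positionSensitive) {
        this.text = text;
        this.positionSensitive = positionSensitive;
    }
}

// Hypothetical model of the "position insensitive wins" rule.
class InsensitiveWinsMap<K> extends HashMap<K, TermStub> {
    @Override
    public TermStub put(K key, TermStub value) {
        TermStub prev = super.put(key, value);
        // If the replaced entry was position insensitive, the new entry
        // is downgraded to position insensitive as well.
        if (prev != null && !prev.positionSensitive) {
            value.positionSensitive = false;
        }
        return prev;
    }

    @Override
    public void putAll(Map<? extends K, ? extends TermStub> m) {
        // Route bulk inserts through put() so the rule applies to them too.
        for (Map.Entry<? extends K, ? extends TermStub> e : m.entrySet()) {
            put(e.getKey(), e.getValue());
        }
    }
}

class InsensitiveWinsDemo {
    public static void main(String[] args) {
        InsensitiveWinsMap<String> terms = new InsensitiveWinsMap<>();
        terms.put("fox", new TermStub("fox", false)); // e.g. from a plain term query
        terms.put("fox", new TermStub("fox", true));  // e.g. from a phrase or span query
        System.out.println(terms.get("fox").positionSensitive);
    }
}
```

In this model the insertion order does not matter: whichever version arrives second, the stored entry ends up position insensitive once an insensitive version has been seen.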
-
Constructor Summary
Constructors
WeightedSpanTermExtractor()
WeightedSpanTermExtractor(String defaultField)
-
Method Summary
Modifier and Type, Method, Description
protected void collectSpanQueryFields(SpanQuery spanQuery, Set<String> fieldNames)
protected void extract(Query query, float boost, Map<String,WeightedSpanTerm> terms)
protected void extractUnknownQuery(Query query, Map<String,WeightedSpanTerm> terms)
protected void extractWeightedSpanTerms(Map<String,WeightedSpanTerm> terms, SpanQuery spanQuery, float boost)
protected void extractWeightedTerms(Map<String,WeightedSpanTerm> terms, Query query, float boost)
protected boolean fieldNameComparator(String fieldNameToCheck)
    Necessary to implement matches for queries against defaultField
boolean getExpandMultiTermQuery()
protected LeafReaderContext getLeafContext()
TokenStream getTokenStream()
    Returns the tokenStream which may have been wrapped in a CachingTokenFilter.
Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream)
    Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream, String fieldName)
    Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
Map<String,WeightedSpanTerm> getWeightedSpanTermsWithScores(Query query, float boost, TokenStream tokenStream, String fieldName, IndexReader reader)
    Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
boolean isCachedTokenStream()
protected boolean isQueryUnsupported(Class<? extends Query> clazz)
boolean isUsePayloads()
protected boolean mustRewriteQuery(SpanQuery spanQuery)
void setExpandMultiTermQuery(boolean expandMultiTermQuery)
protected void setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze)
    A threshold of number of characters to analyze.
void setUsePayloads(boolean usePayloads)
void setWrapIfNotCachingTokenFilter(boolean wrap)
    By default, TokenStreams that are not of the type CachingTokenFilter are wrapped in a CachingTokenFilter to ensure an efficient reset - if you are already using a different caching TokenStream impl and you don't want it to be wrapped, set this to false.
-
-
-
Constructor Detail
-
WeightedSpanTermExtractor
public WeightedSpanTermExtractor()
-
WeightedSpanTermExtractor
public WeightedSpanTermExtractor(String defaultField)
-
-
Method Detail
-
extract
protected void extract(Query query, float boost, Map<String,WeightedSpanTerm> terms) throws IOException
- Parameters:
query - Query to extract Terms from
terms - Map to place created WeightedSpanTerms in
- Throws:
IOException - If there is a low-level I/O error
-
extractUnknownQuery
protected void extractUnknownQuery(Query query, Map<String,WeightedSpanTerm> terms) throws IOException
- Throws:
IOException
-
extractWeightedSpanTerms
protected void extractWeightedSpanTerms(Map<String,WeightedSpanTerm> terms, SpanQuery spanQuery, float boost) throws IOException
- Parameters:
terms - Map to place created WeightedSpanTerms in
spanQuery - SpanQuery to extract Terms from
- Throws:
IOException - If there is a low-level I/O error
-
extractWeightedTerms
protected void extractWeightedTerms(Map<String,WeightedSpanTerm> terms, Query query, float boost) throws IOException
- Parameters:
terms - Map to place created WeightedSpanTerms in
query - Query to extract Terms from
- Throws:
IOException - If there is a low-level I/O error
-
fieldNameComparator
protected boolean fieldNameComparator(String fieldNameToCheck)
Necessary to implement matches for queries against defaultField
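This page does not spell out the comparison itself. One plausible reading, which is an assumption and not confirmed here, is that a field matches when no restricting field name was given, when it equals the requested field name, or when it equals the default field passed to the constructor. FieldMatchSketch and matches are hypothetical names used only for this illustration.

```java
class FieldMatchSketch {
    // requestedField: the fieldName passed to getWeightedSpanTerms
    //                 (null is assumed to mean "no restriction").
    // defaultField:   the value passed to the constructor (may be null).
    static boolean matches(String requestedField, String defaultField, String fieldNameToCheck) {
        return requestedField == null
            || requestedField.equals(fieldNameToCheck)
            || (defaultField != null && defaultField.equals(fieldNameToCheck));
    }

    public static void main(String[] args) {
        System.out.println(matches("title", "all", "title")); // requested field
        System.out.println(matches("title", "all", "all"));   // default field
        System.out.println(matches("title", "all", "body"));  // unrelated field
    }
}
```

Under this reading, a query parsed against a catch-all default field can still contribute terms when highlighting a specific field.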
-
getLeafContext
protected LeafReaderContext getLeafContext() throws IOException
- Throws:
IOException
-
getWeightedSpanTerms
public Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream) throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
- Parameters:
query - that caused hit
tokenStream - of text to be highlighted
- Returns:
Map containing WeightedSpanTerms
- Throws:
IOException - If there is a low-level I/O error
-
getWeightedSpanTerms
public Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream, String fieldName) throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
- Parameters:
query - that caused hit
tokenStream - of text to be highlighted
fieldName - restricts the Terms used based on field name
- Returns:
Map containing WeightedSpanTerms
- Throws:
IOException - If there is a low-level I/O error
-
getWeightedSpanTermsWithScores
public Map<String,WeightedSpanTerm> getWeightedSpanTermsWithScores(Query query, float boost, TokenStream tokenStream, String fieldName, IndexReader reader) throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream. Uses a supplied IndexReader to properly weight terms (for gradient highlighting).
- Parameters:
query - that caused hit
tokenStream - of text to be highlighted
fieldName - restricts the Terms used based on field name
reader - to use for scoring
- Returns:
Map of WeightedSpanTerms with quasi tf/idf scores
- Throws:
IOException - If there is a low-level I/O error
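The phrase "quasi tf/idf scores" is not defined on this page. As an illustration only, the sketch below assumes a classic inverse document frequency of the form log(numDocs / (docFreq + 1)) + 1, multiplied into each term's existing weight. Whether the library uses exactly this formula is an assumption, and QuasiIdfSketch and its methods are hypothetical names.

```java
class QuasiIdfSketch {
    // Assumed idf form: rarer terms (small docFreq) get a larger factor.
    static float idf(int docFreq, int totalNumDocs) {
        return (float) (Math.log(totalNumDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Scales an existing term weight (for example a query boost) by the idf factor.
    static float scaledWeight(float weight, int docFreq, int totalNumDocs) {
        return weight * idf(docFreq, totalNumDocs);
    }

    public static void main(String[] args) {
        int totalNumDocs = 1000;
        System.out.println("rare term:   " + scaledWeight(1f, 1, totalNumDocs));
        System.out.println("common term: " + scaledWeight(1f, 500, totalNumDocs));
    }
}
```

With weights like these, a highlighter can render rare terms more strongly than common ones, which is the idea behind gradient highlighting.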
-
collectSpanQueryFields
protected void collectSpanQueryFields(SpanQuery spanQuery, Set<String> fieldNames)
-
mustRewriteQuery
protected boolean mustRewriteQuery(SpanQuery spanQuery)
-
getExpandMultiTermQuery
public boolean getExpandMultiTermQuery()
-
setExpandMultiTermQuery
public void setExpandMultiTermQuery(boolean expandMultiTermQuery)
-
isUsePayloads
public boolean isUsePayloads()
-
setUsePayloads
public void setUsePayloads(boolean usePayloads)
-
isCachedTokenStream
public boolean isCachedTokenStream()
-
getTokenStream
public TokenStream getTokenStream()
Returns the TokenStream, which may have been wrapped in a CachingTokenFilter. The getWeightedSpanTerms* methods set the token stream, so do not call this method before calling one of them.
-
setWrapIfNotCachingTokenFilter
public void setWrapIfNotCachingTokenFilter(boolean wrap)
By default, TokenStreams that are not of the type CachingTokenFilter are wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream implementation and do not want it to be wrapped, set this to false. This setting is ignored when a term vector based TokenStream is supplied, since it can be reset efficiently.
-
setMaxDocCharsToAnalyze
protected final void setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze)
A threshold on the number of characters to analyze. When a TokenStream based on term vectors with offsets and positions is supplied, this setting does not apply.
-
-