Package org.apache.lucene.util.hnsw
Class HnswGraphBuilder
- java.lang.Object
-
- org.apache.lucene.util.hnsw.HnswGraphBuilder
-
- All Implemented Interfaces:
HnswBuilder
- Direct Known Subclasses:
InitializedHnswGraphBuilder
public class HnswGraphBuilder extends Object implements HnswBuilder
Builder for HNSW graph. See HnswGraph for a gloss on the algorithm and the meaning of the hyper-parameters.
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | HnswGraphBuilder.GraphBuilderKnnCollector | A restricted, specialized knnCollector that can be used when building a graph.
-
Field Summary
Fields
Modifier and Type | Field | Description
static int | DEFAULT_BEAM_WIDTH | Default size of the queue maintained while searching during graph construction.
static int | DEFAULT_MAX_CONN | Default maximum number of connections per node.
protected OnHeapHnswGraph | hnsw |
static String | HNSW_COMPONENT | A name for the HNSW component for the info-stream.
protected org.apache.lucene.util.hnsw.HnswLock | hnswLock |
static long | randSeed | Random seed for level generation; public to expose for testing.
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | HnswGraphBuilder(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, int graphSize) | Reads all the vectors from vector values, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.
protected | HnswGraphBuilder(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, OnHeapHnswGraph hnsw) |
protected | HnswGraphBuilder(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, OnHeapHnswGraph hnsw, org.apache.lucene.util.hnsw.HnswLock hnswLock, HnswGraphSearcher graphSearcher) | Reads all the vectors from vector values, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.
-
Method Summary
Modifier and Type | Method | Description
void | addGraphNode(int node) | Inserts a doc with vector value to the graph
protected void | addVectors(int minOrd, int maxOrd) | add vectors in range [minOrd, maxOrd)
OnHeapHnswGraph | build(int maxOrd) | Adds all nodes to the graph up to the provided maxOrd.
static HnswGraphBuilder | create(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed) |
static HnswGraphBuilder | create(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, int graphSize) |
OnHeapHnswGraph | getCompletedGraph() | Once this method is called no further updates to the graph are accepted (addGraphNode will throw IllegalStateException).
OnHeapHnswGraph | getGraph() |
void | setInfoStream(InfoStream infoStream) | Set info-stream to output debugging information
-
-
-
Field Detail
-
DEFAULT_MAX_CONN
public static final int DEFAULT_MAX_CONN
Default maximum number of connections per node.
- See Also:
- Constant Field Values
-
DEFAULT_BEAM_WIDTH
public static final int DEFAULT_BEAM_WIDTH
Default size of the queue maintained while searching during graph construction.
- See Also:
- Constant Field Values
-
HNSW_COMPONENT
public static final String HNSW_COMPONENT
A name for the HNSW component for the info-stream.
- See Also:
- Constant Field Values
-
randSeed
public static long randSeed
Random seed for level generation; public to expose for testing.
-
hnsw
protected final OnHeapHnswGraph hnsw
-
hnswLock
protected final org.apache.lucene.util.hnsw.HnswLock hnswLock
-
-
Constructor Detail
-
HnswGraphBuilder
protected HnswGraphBuilder(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, int graphSize) throws IOException
Reads all the vectors from vector values, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.
- Parameters:
scorerSupplier - a supplier to create vector scorer from ordinals.
M - graph fanout parameter used to calculate the maximum number of connections a node can have: M on upper layers, and M * 2 on the lowest level.
beamWidth - the size of the beam search to use when finding nearest neighbors.
seed - the seed for a random number generator used during graph construction. Provide this to ensure repeatable construction.
graphSize - size of graph; if unknown, pass in -1.
- Throws:
IOException
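The fanout rule stated for M (M connections on upper layers, M * 2 on the lowest level) can be sketched as a small helper. FanoutSketch and maxConnOnLevel are hypothetical names introduced here for illustration only; they are not part of the Lucene API, and the value 16 is an arbitrary example.

```java
/** Hypothetical helper, not part of Lucene: illustrates the fanout rule described for M. */
final class FanoutSketch {

  /** Returns the connection cap for a node on the given level: M * 2 on level 0, M above it. */
  static int maxConnOnLevel(int level, int m) {
    if (level < 0 || m <= 0) {
      throw new IllegalArgumentException("level must be >= 0 and m must be > 0");
    }
    return level == 0 ? m * 2 : m;
  }

  public static void main(String[] args) {
    int m = 16; // arbitrary example value for the fanout parameter
    for (int level = 0; level <= 2; level++) {
      System.out.println("level " + level + " -> " + maxConnOnLevel(level, m));
    }
  }
}
```

The lowest level holds every node, so a larger cap there trades memory for connectivity where most of the search work happens.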
-
HnswGraphBuilder
protected HnswGraphBuilder(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, OnHeapHnswGraph hnsw) throws IOException
- Throws:
IOException
-
HnswGraphBuilder
protected HnswGraphBuilder(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, OnHeapHnswGraph hnsw, org.apache.lucene.util.hnsw.HnswLock hnswLock, HnswGraphSearcher graphSearcher) throws IOException
Reads all the vectors from vector values, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.
- Parameters:
scorerSupplier - a supplier to create vector scorer from ordinals.
M - graph fanout parameter used to calculate the maximum number of connections a node can have: M on upper layers, and M * 2 on the lowest level.
beamWidth - the size of the beam search to use when finding nearest neighbors.
seed - the seed for a random number generator used during graph construction. Provide this to ensure repeatable construction.
hnsw - the graph to build, can be previously initialized
- Throws:
IOException
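To show why a fixed seed gives repeatable construction, here is a minimal sketch of seeded level assignment. It uses the exponential level rule commonly described for HNSW; that this matches the builder's internals exactly is an assumption, and LevelSketch, randomLevel and levelsFor are hypothetical names, not Lucene API.

```java
import java.util.SplittableRandom;

/** Hypothetical sketch, not Lucene code: seeded random level assignment for graph nodes. */
final class LevelSketch {

  /** Draws a level from an exponential-style distribution scaled by ml. */
  static int randomLevel(double ml, SplittableRandom random) {
    double r;
    do {
      r = random.nextDouble(); // uniform in [0, 1)
    } while (r == 0.0); // skip 0 so that log(r) stays finite
    return (int) (-Math.log(r) * ml);
  }

  /** Assigns levels to count nodes; the same seed and m always yield the same array. */
  static int[] levelsFor(long seed, int m, int count) {
    if (m < 1 || count < 0) {
      throw new IllegalArgumentException("m must be >= 1 and count must be >= 0");
    }
    // Assumed normalization: 1 / ln(M), with M == 1 special-cased to avoid dividing by zero.
    double ml = m == 1 ? 1.0 : 1.0 / Math.log(m);
    SplittableRandom random = new SplittableRandom(seed);
    int[] levels = new int[count];
    for (int i = 0; i < count; i++) {
      levels[i] = randomLevel(ml, random);
    }
    return levels;
  }

  public static void main(String[] args) {
    int[] first = levelsFor(42L, 16, 10);
    int[] second = levelsFor(42L, 16, 10);
    System.out.println("same seed, same levels: " + java.util.Arrays.equals(first, second));
  }
}
```

Because every random draw comes from one generator created from the seed, two runs over the same vectors in the same order would place nodes on the same levels.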
-
-
Method Detail
-
create
public static HnswGraphBuilder create(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed) throws IOException
- Throws:
IOException
-
create
public static HnswGraphBuilder create(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, int graphSize) throws IOException
- Throws:
IOException
-
build
public OnHeapHnswGraph build(int maxOrd) throws IOException
Description copied from interface: HnswBuilder
Adds all nodes to the graph up to the provided maxOrd.
- Specified by:
build in interface HnswBuilder
- Parameters:
maxOrd - The maximum ordinal (excluded) of the nodes to be added.
- Throws:
IOException
-
setInfoStream
public void setInfoStream(InfoStream infoStream)
Description copied from interface: HnswBuilder
Set info-stream to output debugging information
- Specified by:
setInfoStream in interface HnswBuilder
-
getCompletedGraph
public OnHeapHnswGraph getCompletedGraph() throws IOException
Description copied from interface: HnswBuilder
Once this method is called no further updates to the graph are accepted (addGraphNode will throw IllegalStateException). Final modifications to the graph (e.g. patching up disconnected components, re-ordering node ids for better delta compression) may be triggered, so callers should expect this call to take some time.
- Specified by:
getCompletedGraph in interface HnswBuilder
- Throws:
IOException
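The contract described for getCompletedGraph (no further updates once the graph is completed) can be illustrated with a toy builder. FreezableBuilderSketch is a hypothetical stand-in that only tracks node ids; it mirrors the method names from this page but does not reproduce how Lucene implements them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in, not Lucene code: a builder that rejects updates once completed. */
final class FreezableBuilderSketch {
  private final List<Integer> nodes = new ArrayList<>();
  private boolean frozen;

  /** Adds a node id, or throws if the graph has already been completed. */
  void addGraphNode(int node) {
    if (frozen) {
      throw new IllegalStateException("graph has already been completed");
    }
    nodes.add(node);
  }

  /** Freezes the builder and returns a read-only snapshot of the added node ids. */
  List<Integer> getCompletedGraph() {
    frozen = true;
    return Collections.unmodifiableList(new ArrayList<>(nodes));
  }

  public static void main(String[] args) {
    FreezableBuilderSketch builder = new FreezableBuilderSketch();
    builder.addGraphNode(0);
    builder.addGraphNode(1);
    System.out.println("completed with " + builder.getCompletedGraph().size() + " nodes");
    try {
      builder.addGraphNode(2);
    } catch (IllegalStateException e) {
      System.out.println("rejected after completion: " + e.getMessage());
    }
  }
}
```

Callers that still need to add nodes should therefore finish all insertions before requesting the completed graph.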
-
getGraph
public OnHeapHnswGraph getGraph()
- Specified by:
getGraph in interface HnswBuilder
-
addVectors
protected void addVectors(int minOrd, int maxOrd) throws IOException
Adds vectors in the range [minOrd, maxOrd).
- Throws:
IOException
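The half-open range [minOrd, maxOrd) means minOrd is included and maxOrd is excluded, which matches the "excluded" wording on build(int maxOrd). A minimal sketch of that iteration follows; OrdinalRangeSketch and ordinalsInRange are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not Lucene code: enumerates the ordinals of a half-open range. */
final class OrdinalRangeSketch {

  /** Returns every ordinal in [minOrd, maxOrd): minOrd included, maxOrd excluded. */
  static List<Integer> ordinalsInRange(int minOrd, int maxOrd) {
    if (minOrd < 0 || maxOrd < minOrd) {
      throw new IllegalArgumentException("require 0 <= minOrd <= maxOrd");
    }
    List<Integer> ordinals = new ArrayList<>();
    for (int ord = minOrd; ord < maxOrd; ord++) { // strict '<' keeps maxOrd out
      ordinals.add(ord);
    }
    return ordinals;
  }

  public static void main(String[] args) {
    System.out.println(ordinalsInRange(2, 5));
  }
}
```

With this convention, consecutive batches such as [0, 100) and [100, 200) cover every ordinal once with no overlap.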
-
addGraphNode
public void addGraphNode(int node) throws IOException
Description copied from interface: HnswBuilder
Inserts a doc with vector value to the graph
- Specified by:
addGraphNode in interface HnswBuilder
- Throws:
IOException
-
-