Package org.apache.lucene.document
The logical representation of a Document for indexing and searching.
The document package provides the user-level logical representation of content to be indexed
and searched. The package also provides utilities for working with Documents and IndexableFields.
Document and IndexableField
A Document is a collection of IndexableFields. An IndexableField is a
logical representation of a user's content that needs to be indexed or stored. IndexableFields have a number of properties that tell Lucene how to
treat the content (indexed, tokenized, stored, and so on). See the Field implementation of IndexableField for specifics on these properties.
Note: it is common to refer to Documents having Fields, even though technically they have IndexableFields.
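As a rough illustration of a Document holding several fields with different properties, here is a minimal sketch. It assumes lucene-core is on the classpath; the class name DocumentSketch and the field names "id", "title" and "body" are hypothetical and chosen only for this example.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;

// Hypothetical helper, not part of Lucene: builds a Document from three strings.
final class DocumentSketch {

  static Document build(String id, String title, String body) {
    Document doc = new Document();
    // Indexed as a single token (not tokenized) and stored for retrieval.
    doc.add(new StringField("id", id, Field.Store.YES));
    // Indexed and tokenized, and also stored.
    doc.add(new TextField("title", title, Field.Store.YES));
    // Indexed and tokenized, but not stored.
    doc.add(new TextField("body", body, Field.Store.NO));
    return doc;
  }

  public static void main(String[] args) {
    Document doc = build("42", "Hello", "Some longer text to search.");
    // Document.get returns the string value of the first field with that name.
    System.out.println(doc.get("id"));
    System.out.println(doc.getFields().size());
  }
}
```

The choice between StringField and TextField here reflects whether the value should be matched as one exact token or analyzed into terms; Field.Store controls only whether the original value can be read back.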
Working with Documents
First and foremost, a Document is something created by the
user application. It is your job to create Documents based on the content of the files you are
working with in your application (Word, txt, PDF, Excel or any other format). How this is done is
completely up to you. That being said, there are many tools available in other projects that can
make the process of taking a file and converting it into a Lucene Document easier.
DateTools is a utility class that makes dates and times
searchable. IntPoint, LongPoint, FloatPoint and DoublePoint enable indexing of numeric values (and also dates) for
fast range queries using PointRangeQuery.
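One way to think about making dates searchable is to map each instant to a string whose lexicographic order matches chronological order, so that a range over strings is also a range over time. The following is a JDK-only sketch of that idea, not DateTools itself: the class SortableDateSketch, its methods, and the yyyyMMdd pattern in UTC are assumptions made for illustration. See the DateTools and DateTools.Resolution Javadoc for the formats Lucene actually uses.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration, not a Lucene class.
final class SortableDateSketch {

  // Fixed-width, most-significant-unit-first pattern, always rendered in UTC.
  private static final DateTimeFormatter DAY =
      DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

  // Converts an instant to a day-granularity key such as "20240305".
  static String toDayKey(Instant instant) {
    return DAY.format(instant);
  }

  // Inclusive range check using plain string comparison.
  static boolean inRange(String key, String from, String to) {
    return key.compareTo(from) >= 0 && key.compareTo(to) <= 0;
  }

  public static void main(String[] args) {
    String key = toDayKey(Instant.parse("2024-03-05T10:15:30Z"));
    System.out.println(key);
    System.out.println(inRange(key, "20240301", "20240331"));
  }
}
```

Because every key has the same width and the year comes first, comparing keys as strings gives the same result as comparing the underlying dates.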
Class Summary
Class - Description
BinaryDocValuesField - Field that stores a per-document BytesRef value.
BinaryPoint - An indexed binary field for fast range filters.
BinaryRangeDocValues - A binary representation of a range that wraps a BinaryDocValues field.
DateTools - Provides support for converting dates to strings and vice-versa.
Document - Documents are the unit of indexing and search.
DocumentStoredFieldVisitor - A StoredFieldVisitor that creates a Document from stored fields.
DoubleDocValuesField - Syntactic sugar for encoding doubles as NumericDocValues via Double.doubleToRawLongBits(double).
DoubleField - Field that stores a per-document double value for scoring, sorting or value retrieval and indexes the field for fast range filters.
DoublePoint - An indexed double field for fast range filters.
DoubleRange - An indexed Double Range field.
DoubleRangeDocValuesField - DocValues field for DoubleRange.
FeatureField - Field that can be used to store static scoring factors into documents.
Field - Expert: directly create a field for a document.
FieldType - Describes the properties of a field.
FloatDocValuesField - Syntactic sugar for encoding floats as NumericDocValues via Float.floatToRawIntBits(float).
FloatField - Field that stores a per-document float value for scoring, sorting or value retrieval and indexes the field for fast range filters.
FloatPoint - An indexed float field for fast range filters.
FloatRange - An indexed Float Range field.
FloatRangeDocValuesField - DocValues field for FloatRange.
InetAddressPoint - An indexed 128-bit InetAddress field.
InetAddressRange - An indexed InetAddress Range field.
IntField - Field that stores a per-document int value for scoring, sorting or value retrieval and indexes the field for fast range filters.
IntPoint - An indexed int field for fast range filters.
IntRange - An indexed Integer Range field.
IntRangeDocValuesField - DocValues field for IntRange.
KeywordField - Field that indexes a per-document String or BytesRef into an inverted index for fast filtering, stores values in a columnar fashion using DocValuesType.SORTED_SET doc values for sorting and faceting, and optionally stores values as stored fields for top-hits retrieval.
KnnByteVectorField - A field that contains a single byte numeric vector (or none) for each document.
KnnFloatVectorField - A field that contains a single floating-point numeric vector (or none) for each document.
KnnVectorField - Deprecated. Use KnnFloatVectorField instead.
LatLonDocValuesField - A per-document location field.
LatLonPoint - An indexed location field.
LatLonShape - A geo shape utility class for indexing and searching GIS geometries whose vertices are latitude, longitude values (in decimal degrees).
LatLonShapeDocValues - A concrete implementation of ShapeDocValues for storing binary doc value representation of LatLonShape geometries in a LatLonShapeDocValuesField.
LatLonShapeDocValuesField - Concrete implementation of a ShapeDocValuesField for geographic geometries.
LongField - Field that stores a per-document long value for scoring, sorting or value retrieval and indexes the field for fast range filters.
LongPoint - An indexed long field for fast range filters.
LongRange - An indexed Long Range field.
LongRangeDocValuesField - DocValues field for LongRange.
NumericDocValuesField - Field that stores a per-document long value for scoring, sorting or value retrieval.
RangeFieldQuery - Query class for searching RangeField types by a defined PointValues.Relation.
ShapeDocValuesField - A doc values field for LatLonShape and XYShape that uses ShapeDocValues as the underlying binary doc value format.
ShapeField - A base shape utility class used for both LatLon (spherical) and XY (cartesian) shape fields.
ShapeField.DecodedTriangle - Represents an encoded triangle using ShapeField.decodeTriangle(byte[], DecodedTriangle).
ShapeField.Triangle - Polygons are decomposed into tessellated triangles using Tessellator; these triangles are encoded and inserted as separate indexed POINT fields.
SortedDocValuesField - Field that stores a per-document BytesRef value, indexed for sorting.
SortedNumericDocValuesField - Field that stores per-document long values for scoring, sorting or value retrieval.
SortedSetDocValuesField - Field that stores a set of per-document BytesRef values, indexed for faceting, grouping, joining.
StoredField - A field whose value is stored so that IndexSearcher.storedFields() and IndexReader.storedFields() will return the field and its value.
StoredValue - Abstraction around a stored value.
StringField - A field that is indexed but not tokenized: the entire String value is indexed as a single token.
TextField - A field that is indexed and tokenized, without term vectors.
XYDocValuesField - A per-document location field.
XYDocValuesPointInGeometryQuery - XYGeometry query for XYDocValuesField.
XYPointField - An indexed XY position field.
XYShape - A cartesian shape utility class for indexing and searching geometries whose vertices are unitless x, y values.
XYShapeDocValues - A concrete implementation of ShapeDocValues for storing binary doc value representation of XYShape geometries in a XYShapeDocValuesField.
XYShapeDocValuesField - Concrete implementation of a ShapeDocValuesField for cartesian geometries.
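The DoubleDocValuesField and FloatDocValuesField entries in the class summary describe encoding floating-point values through Double.doubleToRawLongBits(double) and Float.floatToRawIntBits(float). The JDK-only sketch below shows what that bit-level round trip looks like; the class RawBitsSketch and its method names are hypothetical and are not part of Lucene.

```java
// Hypothetical illustration of the raw-bits encoding named in the class summary.
final class RawBitsSketch {

  // Reinterprets the 64 bits of a double as a long, without changing them.
  static long encodeDouble(double value) {
    return Double.doubleToRawLongBits(value);
  }

  // Inverse of encodeDouble.
  static double decodeDouble(long bits) {
    return Double.longBitsToDouble(bits);
  }

  // Reinterprets the 32 bits of a float as an int, without changing them.
  static int encodeFloat(float value) {
    return Float.floatToRawIntBits(value);
  }

  // Inverse of encodeFloat.
  static float decodeFloat(int bits) {
    return Float.intBitsToFloat(bits);
  }

  public static void main(String[] args) {
    long bits = encodeDouble(3.25);
    System.out.println(Long.toHexString(bits));
    System.out.println(decodeDouble(bits));
    System.out.println(Integer.toHexString(encodeFloat(1.0f)));
  }
}
```

The encoding is a reinterpretation of the same bits rather than a numeric conversion, which is why decoding returns exactly the value that was encoded.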
Enum Summary
Enum - Description
DateTools.Resolution - Specifies the time granularity.
Field.Store - Specifies whether and how a field should be stored.
InvertableType - Describes how an IndexableField should be inverted for indexing terms and postings.
RangeFieldQuery.QueryType - Used by RangeFieldQuery to check how each internal or leaf node relates to the query.
ShapeField.DecodedTriangle.TYPE - Type of triangle.
ShapeField.QueryRelation - Query Relation Types.
StoredValue.Type - Type of a StoredValue.