Class TestUtil
- java.lang.Object
-
- org.apache.lucene.tests.util.TestUtil
-
public final class TestUtil extends Object
General utility methods for Lucene unit tests.
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static Comparator<CharSequence> | STRING_CODEPOINT_COMPARATOR | A comparator that compares UTF-16 strings / char sequences according to Unicode code point order.
-
Method Summary
Modifier and Type | Method | Description
static void | addIndexesSlowly(IndexWriter writer, DirectoryReader... readers)
static Codec | alwaysDocValuesFormat(DocValuesFormat format) | Return a Codec that can read any of the default codecs and formats, but always writes in the specified format.
static Codec | alwaysPostingsFormat(PostingsFormat format) | Return a Codec that can read any of the default codecs and formats, but always writes in the specified format.
static boolean | anyFilesExceptWriteLock(Directory dir)
static <T> void | assertAttributeReflection(AttributeImpl att, Map<String,T> reflectedValues) | Checks some basic behaviour of an AttributeImpl
static void | assertConsistent(TopDocs expected, TopDocs actual) | Assert that the given TopDocs have the same top docs and consistent hit counts.
static String | bytesRefToString(BytesRef br) | For debugging: tries to include br.utf8ToString(), but if that fails (because it's not valid utf8, which is fine!), just use ordinary toString.
static CharSequence | bytesToCharSequence(BytesRef ref, Random random)
static CheckIndex.Status | checkIndex(Directory dir) | This runs the CheckIndex tool on the index in dir.
static CheckIndex.Status | checkIndex(Directory dir, boolean doSlowChecks)
static CheckIndex.Status | checkIndex(Directory dir, boolean doSlowChecks, boolean failFast, boolean concurrent, ByteArrayOutputStream output) | If failFast is true, then throw the first exception when index corruption is hit, instead of moving on to other fields/segments to look for any other corruption.
static <T> void | checkIterator(Iterator<T> iterator) | Checks that the provided iterator is well-formed.
static <T> void | checkIterator(Iterator<T> iterator, long expectedSize, boolean allowNull) | Checks that the provided iterator is well-formed.
static void | checkReader(IndexReader reader) | This runs the CheckIndex tool on the Reader.
static void | checkReader(LeafReader reader, boolean doSlowChecks)
static <T> void | checkReadOnly(Collection<T> coll) | Checks that the provided collection is read-only.
static Document | cloneDocument(Document doc1)
static boolean | disableVirusChecker(Directory in) | Returns true if VirusCheckingFS is in use and was in fact already enabled
static PostingsEnum | docs(Random random, IndexReader r, String field, BytesRef term, PostingsEnum reuse, int flags)
static PostingsEnum | docs(Random random, TermsEnum termsEnum, PostingsEnum reuse, int flags)
static boolean | doubleUlpEquals(double x, double y, int maxUlps) | Returns true if the arguments are equal or within the range of allowed error (inclusive).
static void | enableVirusChecker(Directory in)
static boolean | fieldSupportsHugeBinaryDocValues(String field)
static boolean | floatUlpEquals(float x, float y, short maxUlps) | Returns true if the arguments are equal or within the range of allowed error (inclusive).
static Codec | getDefaultCodec() | Returns the actual default codec (e.g. LuceneMNCodec) for this version of Lucene.
static DocValuesFormat | getDefaultDocValuesFormat() | Returns the actual default docvalues format (e.g. LuceneMNDocValuesFormat) for this version of Lucene.
static KnnVectorsFormat | getDefaultKnnVectorsFormat() | Returns the actual default vector format (e.g. LuceneMNKnnVectorsFormat) for this version of Lucene.
static PostingsFormat | getDefaultPostingsFormat() | Returns the actual default postings format (e.g. LuceneMNPostingsFormat) for this version of Lucene.
static PostingsFormat | getDefaultPostingsFormat(int minItemsPerBlock, int maxItemsPerBlock) | Returns the actual default postings format (e.g. LuceneMNPostingsFormat) for this version of Lucene.
static String | getDocValuesFormat(String field)
static String | getDocValuesFormat(Codec codec, String field)
static String | getPostingsFormat(String field)
static String | getPostingsFormat(Codec codec, String field)
static PostingsFormat | getPostingsFormatWithOrds(Random r) | Returns a random postings format that supports term ordinals
static boolean | hasVirusChecker(Path path)
static boolean | hasVirusChecker(Directory dir)
static boolean | hasWindowsFS(Path path)
static boolean | hasWindowsFS(Directory dir)
static BigInteger | nextBigInteger(Random random, int maxBytes) | Returns a randomish big integer with 1 .. maxBytes storage.
static int | nextInt(Random r, int start, int end) | start and end are BOTH inclusive
static long | nextLong(Random r, long start, long end) | start and end are BOTH inclusive
static Directory | ramCopyOf(Directory dir) | Returns a copy of the source directory, with file contents stored in RAM.
static String | randomAnalysisString(Random random, int maxLength, boolean simple)
static BytesRef | randomBinaryTerm(Random r) | Returns a random binary term.
static BytesRef | randomBinaryTerm(Random r, int length) | Returns a random binary term with a given length
static String | randomFixedByteLengthUnicodeString(Random r, int length) | Returns random string, with a given UTF-8 byte length
static void | randomFixedLengthUnicodeString(Random random, char[] chars, int offset, int length) | Fills the provided char[] with a valid random Unicode code unit sequence.
static String | randomHtmlishString(Random random, int numElements)
static String | randomlyRecaseCodePoints(Random random, String str) | Randomly upcases, downcases, or leaves intact each code point in the given string
static Pattern | randomPattern(Random random) | Returns a valid (compiling) Pattern instance with random stuff inside.
static String | randomRealisticUnicodeString(Random r) | Returns random string of length between 0-20 codepoints, all codepoints within the same unicode block.
static String | randomRealisticUnicodeString(Random r, int maxLength) | Returns random string of length up to maxLength codepoints, all codepoints within the same unicode block.
static String | randomRealisticUnicodeString(Random r, int minLength, int maxLength) | Returns random string of length between min and max codepoints, all codepoints within the same unicode block.
static String | randomRegexpishString(Random r) | Returns a String that's "regexpish" (contains lots of operators typically found in regular expressions). If you call this enough times, you might get a valid regex!
static String | randomRegexpishString(Random r, int maxLength) | Returns a String that's "regexpish" (contains lots of operators typically found in regular expressions). If you call this enough times, you might get a valid regex!
static String | randomSimpleString(Random r)
static String | randomSimpleString(Random r, int maxLength)
static String | randomSimpleString(Random r, int minLength, int maxLength)
static String | randomSimpleStringRange(Random r, char minChar, char maxChar, int maxLength)
static String | randomSubString(Random random, int wordLength, boolean simple)
static String | randomUnicodeString(Random r) | Returns random string, including full unicode range.
static String | randomUnicodeString(Random r, int maxLength) | Returns a random string up to a certain length.
static void | reduceOpenFiles(IndexWriter w) | just tries to configure things to keep the open file count lowish
static void | shutdownExecutorService(ExecutorService ex) | Shuts down the ExecutorService and waits for its termination.
static CharSequence | stringToCharSequence(String string, Random random)
static void | syncConcurrentMerges(IndexWriter writer)
static void | syncConcurrentMerges(MergeScheduler ms)
static void | unzip(InputStream in, Path destDir) | Convenience method that unzips the given InputStream into destDir.
-
-
-
Field Detail
-
STRING_CODEPOINT_COMPARATOR
public static final Comparator<CharSequence> STRING_CODEPOINT_COMPARATOR
A comparator that compares UTF-16 strings / char sequences according to Unicode code point order. This can be used to verify BytesRef order.
Warning: This comparator is rather inefficient, because it converts the strings to an int[] array on each invocation.
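To make the difference between code point order and plain UTF-16 code unit order concrete, here is a minimal, self-contained sketch. The names CodePointOrderSketch and CODEPOINT_ORDER are hypothetical, and this is not the TestUtil implementation; it only mirrors the per-call int[] conversion that the warning describes.

```java
import java.util.Comparator;

final class CodePointOrderSketch {
  // Hypothetical comparator: decodes both sequences to int[] code points on every call,
  // which is the kind of per-invocation cost the warning refers to.
  static final Comparator<CharSequence> CODEPOINT_ORDER = (a, b) -> {
    int[] x = a.codePoints().toArray();
    int[] y = b.codePoints().toArray();
    int n = Math.min(x.length, y.length);
    for (int i = 0; i < n; i++) {
      if (x[i] != y[i]) {
        return Integer.compare(x[i], y[i]);
      }
    }
    // All shared positions are equal: the shorter sequence sorts first.
    return Integer.compare(x.length, y.length);
  };

  public static void main(String[] args) {
    String bmp = "\uFFFF";                 // U+FFFF, a single UTF-16 code unit
    String supplementary = "\uD800\uDC00"; // U+10000, encoded as a surrogate pair
    // String.compareTo works on UTF-16 code units, so the surrogate pair sorts first.
    System.out.println(bmp.compareTo(supplementary) > 0);
    // In code point order, U+FFFF sorts before U+10000.
    System.out.println(CODEPOINT_ORDER.compare(bmp, supplementary) < 0);
  }
}
```

The two orders only disagree when supplementary characters are compared with BMP characters at or above U+E000, which is why such a comparator is useful for checking byte-wise UTF-8 ordering.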
-
-
Method Detail
-
unzip
public static void unzip(InputStream in, Path destDir) throws IOException
Convenience method that unzips the given InputStream into destDir. You must pass it a clean destDir. Closes the given InputStream after extracting!
- Throws:
IOException
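The contract above (clean destination, stream closed afterwards) can be sketched with the JDK's own zip classes. This is an illustrative sketch only, not the TestUtil code; the class name UnzipSketch and the path-escape check are assumptions added for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class UnzipSketch {
  static void unzip(InputStream in, Path destDir) throws IOException {
    Path root = destDir.toAbsolutePath().normalize();
    // try-with-resources closes the ZipInputStream and, through it, the wrapped stream.
    try (ZipInputStream zip = new ZipInputStream(in)) {
      ZipEntry entry;
      while ((entry = zip.getNextEntry()) != null) {
        Path target = root.resolve(entry.getName()).normalize();
        // Assumption for this sketch: reject entries that would land outside destDir.
        if (!target.startsWith(root)) {
          throw new IOException("entry escapes destDir: " + entry.getName());
        }
        if (entry.isDirectory()) {
          Files.createDirectories(target);
        } else {
          Files.createDirectories(target.getParent());
          // Files.copy fails if the target already exists, hence the need for a clean destDir.
          Files.copy(zip, target);
        }
      }
    }
  }
}
```

Because Files.copy refuses to overwrite existing files, running the sketch twice into the same directory fails on the second run, which matches the "clean destDir" requirement.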
-
checkIterator
public static <T> void checkIterator(Iterator<T> iterator, long expectedSize, boolean allowNull)
Checks that the provided iterator is well-formed.
- is read-only: does not allow remove
- returns expectedSize number of elements
- does not return null elements, unless allowNull is true.
- throws NoSuchElementException if next is called after hasNext returns false.
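The four conditions listed for this method can be expressed with plain JDK types. The following is a rough sketch of such a check, assuming a hypothetical class IteratorCheckSketch; it is not the TestUtil source and reports failures with AssertionError for simplicity.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class IteratorCheckSketch {
  static <T> void check(Iterator<T> it, long expectedSize, boolean allowNull) {
    long count = 0;
    while (it.hasNext()) {
      T value = it.next();
      if (value == null && !allowNull) {
        throw new AssertionError("iterator returned a null element");
      }
      // Read-only: remove() must be rejected.
      boolean removed;
      try {
        it.remove();
        removed = true;
      } catch (UnsupportedOperationException expected) {
        removed = false;
      }
      if (removed) {
        throw new AssertionError("remove() succeeded on a read-only iterator");
      }
      count++;
    }
    if (count != expectedSize) {
      throw new AssertionError("expected " + expectedSize + " elements, got " + count);
    }
    // Once hasNext() is false, next() must throw NoSuchElementException.
    boolean threw;
    try {
      it.next();
      threw = false;
    } catch (NoSuchElementException expected) {
      threw = true;
    }
    if (!threw) {
      throw new AssertionError("next() did not throw after hasNext() returned false");
    }
  }

  public static void main(String[] args) {
    check(List.of("a", "b", "c").iterator(), 3, false);
    System.out.println("iterator is well-formed");
  }
}
```

An iterator from a mutable ArrayList would fail this sketch at the remove() step, since that iterator supports removal.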
-
checkIterator
public static <T> void checkIterator(Iterator<T> iterator)
Checks that the provided iterator is well-formed.
- is read-only: does not allow remove
- does not return null elements.
- throws NoSuchElementException if next is called after hasNext returns false.
-
checkReadOnly
public static <T> void checkReadOnly(Collection<T> coll)
Checks that the provided collection is read-only.
- See Also:
checkIterator(Iterator)
-
syncConcurrentMerges
public static void syncConcurrentMerges(IndexWriter writer)
-
syncConcurrentMerges
public static void syncConcurrentMerges(MergeScheduler ms)
-
checkIndex
public static CheckIndex.Status checkIndex(Directory dir) throws IOException
This runs the CheckIndex tool on the index in dir. If any issues are hit, a RuntimeException is thrown; otherwise, the resulting CheckIndex.Status is returned.
- Throws:
IOException
-
checkIndex
public static CheckIndex.Status checkIndex(Directory dir, boolean doSlowChecks) throws IOException
- Throws:
IOException
-
checkIndex
public static CheckIndex.Status checkIndex(Directory dir, boolean doSlowChecks, boolean failFast, boolean concurrent, ByteArrayOutputStream output) throws IOException
If failFast is true, then throw the first exception when index corruption is hit, instead of moving on to other fields/segments to look for any other corruption.
- Throws:
IOException
-
checkReader
public static void checkReader(IndexReader reader) throws IOException
This runs the CheckIndex tool on the Reader. If any issues are hit, a RuntimeException is thrown.
- Throws:
IOException
-
checkReader
public static void checkReader(LeafReader reader, boolean doSlowChecks) throws IOException
- Throws:
IOException
-
floatUlpEquals
public static boolean floatUlpEquals(float x, float y, short maxUlps)
Returns true if the arguments are equal or within the range of allowed error (inclusive). Returns false if either of the arguments is NaN.
Two float numbers are considered equal if there are (maxUlps - 1) (or fewer) floating point numbers between them, i.e. two adjacent floating point numbers are considered equal.
Adapted from org.apache.commons.numbers.core.Precision
github: https://github.com/apache/commons-numbers release 1.2
- Parameters:
x - first value
y - second value
maxUlps - (maxUlps - 1) is the number of floating point values between x and y.
- Returns:
true if there are fewer than maxUlps floating point values between x and y.
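One way to read the ULP rule is as a distance between the two values on the ordered line of representable floats. The sketch below illustrates that reading; the class UlpEqualsSketch, its helper orderedBits, and the int type of maxUlps are assumptions for the example, and the code is neither the TestUtil nor the commons-numbers implementation.

```java
final class UlpEqualsSketch {
  // Maps a float to a long that increases monotonically with the float's value.
  // Both 0.0f and -0.0f map to 0, so the two zeros are zero ULPs apart.
  static long orderedBits(float f) {
    int bits = Float.floatToIntBits(f);
    return bits < 0 ? -(long) (bits & Integer.MAX_VALUE) : bits;
  }

  // True if x and y are at most maxUlps representable steps apart; false if either is NaN.
  static boolean ulpEquals(float x, float y, int maxUlps) {
    if (Float.isNaN(x) || Float.isNaN(y)) {
      return false;
    }
    return Math.abs(orderedBits(x) - orderedBits(y)) <= maxUlps;
  }

  public static void main(String[] args) {
    float one = 1.0f;
    float next = Math.nextUp(one);      // the float directly above 1.0f
    float nextNext = Math.nextUp(next); // two steps above 1.0f
    System.out.println(ulpEquals(one, next, 1));
    System.out.println(ulpEquals(one, nextNext, 1));
    System.out.println(ulpEquals(Float.NaN, Float.NaN, 1));
  }
}
```

A distance of d steps means d - 1 representable values lie strictly between the arguments, so the test `distance <= maxUlps` corresponds to "(maxUlps - 1) or fewer floating point numbers between them".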
-
doubleUlpEquals
public static boolean doubleUlpEquals(double x, double y, int maxUlps)
Returns true if the arguments are equal or within the range of allowed error (inclusive). Returns false if either of the arguments is NaN.
Two double numbers are considered equal if there are (maxUlps - 1) (or fewer) floating point numbers between them, i.e. two adjacent floating point numbers are considered equal.
Adapted from org.apache.commons.numbers.core.Precision
github: https://github.com/apache/commons-numbers release 1.2
- Parameters:
x - first value
y - second value
maxUlps - (maxUlps - 1) is the number of floating point values between x and y.
- Returns:
true if there are fewer than maxUlps floating point values between x and y.
-
nextInt
public static int nextInt(Random r, int start, int end)
start and end are BOTH inclusive
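Because both bounds are inclusive, the number of possible results is end - start + 1, which can exceed the int range. The sketch below shows one way to handle that; InclusiveRangeSketch is a hypothetical class and this is not the TestUtil implementation.

```java
import java.util.Random;

final class InclusiveRangeSketch {
  // Returns a value in [start, end], with both bounds inclusive.
  static int nextInt(Random r, int start, int end) {
    if (start > end) {
      throw new IllegalArgumentException("start must be <= end");
    }
    // The range can be as large as 2^32, so it is computed as a long.
    long range = (long) end - (long) start + 1;
    if (range <= Integer.MAX_VALUE) {
      return start + r.nextInt((int) range);
    }
    // Very wide range: floorMod introduces a slight bias, which this sketch accepts.
    return (int) (start + Math.floorMod(r.nextLong(), range));
  }

  public static void main(String[] args) {
    Random r = new Random();
    // Simulates a six-sided die: both 1 and 6 can be returned.
    for (int i = 0; i < 5; i++) {
      System.out.println(nextInt(r, 1, 6));
    }
  }
}
```

Contrast this with java.util.Random.nextInt(bound), whose upper bound is exclusive.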
-
nextLong
public static long nextLong(Random r, long start, long end)
start and end are BOTH inclusive
-
nextBigInteger
public static BigInteger nextBigInteger(Random random, int maxBytes)
Returns a randomish big integer with 1 .. maxBytes storage.
-
randomSimpleStringRange
public static String randomSimpleStringRange(Random r, char minChar, char maxChar, int maxLength)
-
randomUnicodeString
public static String randomUnicodeString(Random r)
Returns a random string, including the full Unicode range.
-
randomUnicodeString
public static String randomUnicodeString(Random r, int maxLength)
Returns a random string up to a certain length.
-
randomFixedLengthUnicodeString
public static void randomFixedLengthUnicodeString(Random random, char[] chars, int offset, int length)
Fills the provided char[] with a valid random Unicode code unit sequence.
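"Valid" here means that the filled region contains no unpaired surrogates. The following sketch shows one way to produce and verify such a sequence; the class Utf16FillSketch, its methods, and the one-in-five chance of a surrogate pair are all assumptions for illustration, not the TestUtil implementation.

```java
import java.util.Random;

final class Utf16FillSketch {
  // Fills chars[offset, offset + length) so that every surrogate is part of a proper pair.
  static void fill(Random random, char[] chars, int offset, int length) {
    int i = offset;
    int end = offset + length;
    while (i < end) {
      if (random.nextInt(5) == 0 && i < end - 1) {
        // Room for two code units: write a high surrogate followed by a low surrogate.
        chars[i++] = (char) (0xD800 + random.nextInt(0x400));
        chars[i++] = (char) (0xDC00 + random.nextInt(0x400));
      } else {
        // Otherwise pick any BMP code unit that is not a surrogate.
        char c;
        do {
          c = (char) random.nextInt(0x10000);
        } while (Character.isSurrogate(c));
        chars[i++] = c;
      }
    }
  }

  // Returns true if the region contains no unpaired surrogates.
  static boolean isWellFormed(char[] chars, int offset, int length) {
    int end = offset + length;
    for (int i = offset; i < end; i++) {
      if (Character.isHighSurrogate(chars[i])) {
        if (i + 1 >= end || !Character.isLowSurrogate(chars[i + 1])) {
          return false;
        }
        i++; // skip the low surrogate of the pair
      } else if (Character.isLowSurrogate(chars[i])) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    char[] buffer = new char[32];
    fill(new Random(), buffer, 0, buffer.length);
    System.out.println(isWellFormed(buffer, 0, buffer.length));
  }
}
```

The `i < end - 1` guard matters: a surrogate pair is only started when two code units still fit, so the last position can never hold a dangling high surrogate.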
-
randomRegexpishString
public static String randomRegexpishString(Random r)
Returns a String that's "regexpish" (contains lots of operators typically found in regular expressions). If you call this enough times, you might get a valid regex!
-
randomRegexpishString
public static String randomRegexpishString(Random r, int maxLength)
Returns a String that's "regexpish" (contains lots of operators typically found in regular expressions). If you call this enough times, you might get a valid regex!
Note: to avoid practically endless backtracking patterns we replace asterisk and plus operators with bounded repetitions. See LUCENE-4111 for more info.
- Parameters:
maxLength - A hint about maximum length of the regexpish string. It may be exceeded by a few characters.
-
randomlyRecaseCodePoints
public static String randomlyRecaseCodePoints(Random random, String str)
Randomly upcases, downcases, or leaves intact each code point in the given string
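Working per code point rather than per char keeps surrogate pairs intact. A minimal sketch of the idea, assuming a hypothetical class RecaseSketch and an even three-way choice (the actual distribution used by TestUtil is not stated here):

```java
import java.util.Random;

final class RecaseSketch {
  static String recase(Random random, String str) {
    StringBuilder sb = new StringBuilder(str.length());
    for (int i = 0; i < str.length(); ) {
      int cp = str.codePointAt(i);
      i += Character.charCount(cp); // advance by one or two chars
      switch (random.nextInt(3)) {
        case 0:
          sb.appendCodePoint(Character.toUpperCase(cp));
          break;
        case 1:
          sb.appendCodePoint(Character.toLowerCase(cp));
          break;
        default:
          sb.appendCodePoint(cp); // leave intact
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(recase(new Random(), "Hello World"));
  }
}
```

For ASCII input, the result always equals the input under a case-insensitive comparison, which makes this kind of helper handy for testing case-insensitive code paths.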
-
randomRealisticUnicodeString
public static String randomRealisticUnicodeString(Random r)
Returns random string of length between 0-20 codepoints, all codepoints within the same unicode block.
-
randomRealisticUnicodeString
public static String randomRealisticUnicodeString(Random r, int maxLength)
Returns random string of length up to maxLength codepoints, all codepoints within the same unicode block.
-
randomRealisticUnicodeString
public static String randomRealisticUnicodeString(Random r, int minLength, int maxLength)
Returns random string of length between min and max codepoints, all codepoints within the same unicode block.
-
randomFixedByteLengthUnicodeString
public static String randomFixedByteLengthUnicodeString(Random r, int length)
Returns random string, with a given UTF-8 byte length
-
randomBinaryTerm
public static BytesRef randomBinaryTerm(Random r, int length)
Returns a random binary term with a given length
-
alwaysPostingsFormat
public static Codec alwaysPostingsFormat(PostingsFormat format)
Return a Codec that can read any of the default codecs and formats, but always writes in the specified format.
-
alwaysDocValuesFormat
public static Codec alwaysDocValuesFormat(DocValuesFormat format)
Return a Codec that can read any of the default codecs and formats, but always writes in the specified format.
-
getDefaultCodec
public static Codec getDefaultCodec()
Returns the actual default codec (e.g. LuceneMNCodec) for this version of Lucene. This may be different than Codec.getDefault() because that is randomized.
-
getDefaultPostingsFormat
public static PostingsFormat getDefaultPostingsFormat()
Returns the actual default postings format (e.g. LuceneMNPostingsFormat) for this version of Lucene.
-
getDefaultPostingsFormat
public static PostingsFormat getDefaultPostingsFormat(int minItemsPerBlock, int maxItemsPerBlock)
Returns the actual default postings format (e.g. LuceneMNPostingsFormat) for this version of Lucene.
- NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
- this may disappear at any time
-
getPostingsFormatWithOrds
public static PostingsFormat getPostingsFormatWithOrds(Random r)
Returns a random postings format that supports term ordinals
-
getDefaultDocValuesFormat
public static DocValuesFormat getDefaultDocValuesFormat()
Returns the actual default docvalues format (e.g. LuceneMNDocValuesFormat) for this version of Lucene.
-
fieldSupportsHugeBinaryDocValues
public static boolean fieldSupportsHugeBinaryDocValues(String field)
-
getDefaultKnnVectorsFormat
public static KnnVectorsFormat getDefaultKnnVectorsFormat()
Returns the actual default vector format (e.g. LuceneMNKnnVectorsFormat) for this version of Lucene.
-
anyFilesExceptWriteLock
public static boolean anyFilesExceptWriteLock(Directory dir) throws IOException
- Throws:
IOException
-
addIndexesSlowly
public static void addIndexesSlowly(IndexWriter writer, DirectoryReader... readers) throws IOException
- Throws:
IOException
-
reduceOpenFiles
public static void reduceOpenFiles(IndexWriter w)
Tries to configure things to keep the open file count lowish.
-
assertAttributeReflection
public static <T> void assertAttributeReflection(AttributeImpl att, Map<String,T> reflectedValues)
Checks some basic behaviour of an AttributeImpl
- Parameters:
reflectedValues - contains a map with "AttributeClass#key" as values
-
assertConsistent
public static void assertConsistent(TopDocs expected, TopDocs actual)
Assert that the given TopDocs have the same top docs and consistent hit counts.
-
docs
public static PostingsEnum docs(Random random, IndexReader r, String field, BytesRef term, PostingsEnum reuse, int flags) throws IOException
- Throws:
IOException
-
docs
public static PostingsEnum docs(Random random, TermsEnum termsEnum, PostingsEnum reuse, int flags) throws IOException
- Throws:
IOException
-
stringToCharSequence
public static CharSequence stringToCharSequence(String string, Random random)
-
bytesToCharSequence
public static CharSequence bytesToCharSequence(BytesRef ref, Random random)
-
shutdownExecutorService
public static void shutdownExecutorService(ExecutorService ex)
Shuts down the ExecutorService and waits for its termination.
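The usual JDK idiom for this is shutdown() followed by awaitTermination(). The sketch below shows that idiom under stated assumptions: the class name ShutdownSketch, the null check, and the one-second timeout are illustrative choices, not documented TestUtil behaviour.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ShutdownSketch {
  static void shutdownAndAwait(ExecutorService ex) {
    if (ex == null) {
      return;
    }
    ex.shutdown(); // stop accepting new tasks; already submitted tasks still run
    try {
      // Assumed timeout for this sketch: wait up to one second for running tasks.
      ex.awaitTermination(1, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can observe the interruption.
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    ExecutorService ex = Executors.newSingleThreadExecutor();
    ex.submit(() -> System.out.println("task ran"));
    shutdownAndAwait(ex);
    System.out.println(ex.isShutdown());
  }
}
```

In a test teardown, a helper like this keeps worker threads from leaking into the next test.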
-
randomPattern
public static Pattern randomPattern(Random random)
Returns a valid (compiling) Pattern instance with random stuff inside. Be careful when applying random patterns to longer strings as certain types of patterns may explode into exponential times in backtracking implementations (such as Java's).
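A "valid (compiling)" random pattern can be obtained by generating candidate strings and retrying until one compiles. The sketch below illustrates that retry loop with a small, hypothetical alphabet; RandomPatternSketch is not the TestUtil implementation, and it leaves out the unbounded * and + operators to limit backtracking risk.

```java
import java.util.Random;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class RandomPatternSketch {
  // Hypothetical alphabet: a few literals plus common regex operators.
  private static final String ALPHABET = "ab.|()[]";

  static Pattern randomPattern(Random random) {
    while (true) {
      int length = random.nextInt(6); // 0..5 characters; the empty string always compiles
      StringBuilder sb = new StringBuilder(length);
      for (int i = 0; i < length; i++) {
        sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
      }
      try {
        return Pattern.compile(sb.toString());
      } catch (PatternSyntaxException ignored) {
        // Not a valid regex (e.g. an unbalanced bracket); try another candidate.
      }
    }
  }

  public static void main(String[] args) {
    Pattern p = randomPattern(new Random());
    System.out.println(p.pattern());
  }
}
```

Compiling is cheap; the exponential cost mentioned above only shows up when a pattern is matched against long input, so it is worth keeping test inputs short.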
-
randomAnalysisString
public static String randomAnalysisString(Random random, int maxLength, boolean simple)
-
randomSubString
public static String randomSubString(Random random, int wordLength, boolean simple)
-
bytesRefToString
public static String bytesRefToString(BytesRef br)
For debugging: tries to include br.utf8ToString(), but if that fails (because it's not valid utf8, which is fine!), just use ordinary toString.
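The try-then-fall-back behaviour can be sketched with a strict UTF-8 decoder from the JDK. The example works on a plain byte[] instead of BytesRef, and the class BytesToStringSketch and its bracketed hex output format are assumptions for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class BytesToStringSketch {
  // Renders bytes as space-separated hex inside brackets, e.g. [68 69].
  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder("[");
    for (int i = 0; i < bytes.length; i++) {
      if (i > 0) {
        sb.append(' ');
      }
      sb.append(Integer.toHexString(bytes[i] & 0xff));
    }
    return sb.append(']').toString();
  }

  // Includes the decoded text when the bytes are valid UTF-8, otherwise only the hex form.
  static String describe(byte[] bytes) {
    try {
      String text = StandardCharsets.UTF_8.newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)
          .onUnmappableCharacter(CodingErrorAction.REPORT)
          .decode(ByteBuffer.wrap(bytes))
          .toString();
      return text + " " + hex(bytes);
    } catch (CharacterCodingException e) {
      // Invalid UTF-8 is expected for binary terms; fall back to the raw form.
      return hex(bytes);
    }
  }

  public static void main(String[] args) {
    System.out.println(describe("hi".getBytes(StandardCharsets.UTF_8)));
    System.out.println(describe(new byte[] {(byte) 0xff}));
  }
}
```

Using CodingErrorAction.REPORT is what makes the decoder throw on malformed input; `new String(bytes, UTF_8)` would silently substitute replacement characters instead.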
-
ramCopyOf
public static Directory ramCopyOf(Directory dir) throws IOException
Returns a copy of the source directory, with file contents stored in RAM.
- Throws:
IOException
-
hasWindowsFS
public static boolean hasWindowsFS(Directory dir)
-
hasWindowsFS
public static boolean hasWindowsFS(Path path)
-
hasVirusChecker
public static boolean hasVirusChecker(Directory dir)
-
hasVirusChecker
public static boolean hasVirusChecker(Path path)
-
disableVirusChecker
public static boolean disableVirusChecker(Directory in)
Returns true if VirusCheckingFS is in use and was in fact already enabled
-
enableVirusChecker
public static void enableVirusChecker(Directory in)
-
-