Class TrecLATimesParser
java.lang.Object
  org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
    org.apache.lucene.benchmark.byTask.feeds.TrecLATimesParser
public class TrecLATimesParser extends TrecDocParser
Parser for the LA Times docs in the TREC disks 4+5 collection format.
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
TrecDocParser.ParsePathType
Field Summary
Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
DEFAULT_PATH_TYPE
Constructor Summary
Constructor            Description
TrecLATimesParser()
Method Summary
Modifier and Type: DocData
Method: parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType)
Description: Parses the text prepared in docBuf into a result DocData; no synchronization is required.
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
extract, pathType, stripTags, stripTags
Method Detail
parse
public DocData parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType) throws IOException
Description copied from class: TrecDocParser
Parses the text prepared in docBuf into a result DocData; no synchronization is required.
Specified by:
parse in class TrecDocParser
Parameters:
docData - reusable result
name - name that should be set to the result
trecSrc - calling trec content source
docBuf - text to parse
pathType - type of parsed file, or null if unknown; may be used by parsers to alter their behavior according to the file path type
Throws:
IOException
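Example (illustrative only):
The parse contract described above (fill a reusable result from the text in docBuf and set its name) can be sketched without any Lucene dependency. This is not the Lucene implementation. The class LATimesParseSketch, the holder ParsedDoc, and the simplified extract and stripTags helpers are hypothetical stand-ins, the tag names DATE, HEADLINE, and TEXT are assumptions about the LA Times markup, and the sample document is invented.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; it does not use or reproduce the Lucene classes.
class LATimesParseSketch {

    /** Hypothetical result holder standing in for DocData. */
    static final class ParsedDoc {
        String name;
        String title;
        String date;
        String body;
    }

    // Matches any markup tag such as <P> or </P>.
    private static final Pattern TAGS = Pattern.compile("<[^>]*>");

    /** Replaces tags with spaces, collapses whitespace runs, and trims the result. */
    static String stripTags(String s) {
        return TAGS.matcher(s).replaceAll(" ").replaceAll("\\s+", " ").trim();
    }

    /**
     * Returns the tag-stripped text between the first startTag and the next endTag,
     * or null if either tag is missing.
     */
    static String extract(StringBuilder buf, String startTag, String endTag) {
        int start = buf.indexOf(startTag);
        if (start < 0) {
            return null;
        }
        start += startTag.length();
        int end = buf.indexOf(endTag, start);
        if (end < 0) {
            return null;
        }
        return stripTags(buf.substring(start, end));
    }

    /** Fills the reusable result from docBuf; tag names are assumed, not taken from Lucene. */
    static ParsedDoc parse(ParsedDoc result, String name, StringBuilder docBuf) {
        result.name = name;
        result.date = extract(docBuf, "<DATE>", "</DATE>");
        result.title = extract(docBuf, "<HEADLINE>", "</HEADLINE>");
        result.body = extract(docBuf, "<TEXT>", "</TEXT>");
        return result;
    }

    public static void main(String[] args) {
        // Invented sample document, for illustration only.
        StringBuilder docBuf = new StringBuilder(
              "<DATE><P>January 1, 1989, Sunday, Home Edition</P></DATE>\n"
            + "<HEADLINE><P>SAMPLE HEADLINE</P></HEADLINE>\n"
            + "<TEXT><P>First paragraph.</P><P>Second paragraph.</P></TEXT>\n");
        ParsedDoc doc = parse(new ParsedDoc(), "sample-doc-1", docBuf);
        System.out.println(doc.name + " | " + doc.date);
        System.out.println(doc.title);
        System.out.println(doc.body);
    }
}
```

Because the result object is passed in and returned, a caller can reuse one instance across documents, which mirrors the "reusable result" wording of the docData parameter.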