Diff for "Terrier/IndexStructures"

Differences between revisions 1 and 2

Line 22:
-  * Direct``Index -> Block``Direct``Index
-  * Inverted``Index -> Block``Inverted``Index
+  * Direct``Index -> Block``Direct``Index. However, all functionality is exposed by parent class Bit``Posting``Index.
+  * Inverted``Index -> Block``Inverted``Index. However, all functionality is exposed by parent class Bit``Posting``Index.

Terrier/IndexStructures

This page describes the structures recorded within a typical Terrier 3.x index. In the table, o.t.s abbreviates the Java package org.terrier.structures.

||Structure Name||Type||Implementation||Constructor Types||Constructor Values||
||<-5> Random Access Structures||
||meta||MetaIndex||o.t.s.CompressingMetaIndex||o.t.s.Index, String||index, structureName||
||lexicon||Lexicon<String>||o.t.s.FSOMapFileLexicon||String, o.t.s.Index||structureName, index||
||lexicon-keyfactory||FixedSizeWritableFactory||o.t.s.serialization.FixedSizeTextFactory||String||${max.term.length}||
||lexicon-valuefactory||FixedSizeWritableFactory||o.t.s.BasicLexiconEntry$Factory|| || ||
||document||DocumentIndex||o.t.s.FSADocumentIndex||o.t.s.Index, String||index, structureName||
||document-factory||FixedSizeWritableFactory||o.t.s.BasicDocumentIndexEntry$Factory|| || ||
||inverted||BitPostingIndex or InvertedIndex||o.t.s.InvertedIndex||o.t.s.Index, String, DocumentIndex, Class<? extends IterablePosting>||index, structureName, document, o.t.s.postings.BasicIterablePosting||
||direct||BitPostingIndex or DirectIndex||o.t.s.DirectIndex||o.t.s.Index, String||index, structureName||
||<-5> Stream Structures||
||meta-inputstream||Iterator<String[]>||o.t.s.CompressingMetaIndex$InputStream||o.t.s.Index, String||index, structureName||
||lexicon-inputstream||Iterator<LexiconEntry>||o.t.s.FSOMapFileLexicon||String, o.t.s.Index||structureName, index||
||document-inputstream||Iterator<Writable>||o.t.s.FSADocumentIndex$FSADocumentIndexIterator||o.t.s.Index, String||index, structureName||
||inverted-inputstream||BitPostingIndexInputStream||o.t.s.InvertedIndexInputStream||o.t.s.Index, String, Iterator<LexiconEntry>||index, structureName, lexicon-inputstream||
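
The Implementation, Constructor Types and Constructor Values columns suggest that each structure is created by reflection: look up the class, find the constructor with the listed parameter types, and call it with the listed values. The following is a minimal sketch of that idea and not Terrier's own loader. SketchIndex, SketchMetaIndex, resolveValue and load are hypothetical names introduced for illustration. The rules used to interpret the value tokens ("index", "structureName", ${property}) are an assumption based on the table, and the property value 20 is made up.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

class ReflectiveLoaderSketch {
    /** Hypothetical stand-in for an index; this is not Terrier's Index class. */
    public static class SketchIndex {
        public final Map<String, String> properties = new HashMap<>();
    }

    /** Hypothetical structure with an (index, structureName) constructor, like the "meta" row. */
    public static class SketchMetaIndex {
        public final SketchIndex index;
        public final String structureName;

        public SketchMetaIndex(SketchIndex index, String structureName) {
            this.index = index;
            this.structureName = structureName;
        }
    }

    /**
     * Resolve one "Constructor Values" token. Assumed rules: "index" is the index itself,
     * "structureName" is the name being loaded, and ${key} is looked up in the index properties.
     */
    static Object resolveValue(String token, SketchIndex index, String structureName) {
        if (token.equals("index")) return index;
        if (token.equals("structureName")) return structureName;
        if (token.startsWith("${") && token.endsWith("}")) {
            return index.properties.get(token.substring(2, token.length() - 1));
        }
        throw new IllegalArgumentException("unsupported token in this sketch: " + token);
    }

    /** Instantiate className through the public constructor matching the declared types. */
    static Object load(String className, Class<?>[] types, String[] valueTokens,
                       SketchIndex index, String structureName) {
        try {
            Object[] values = new Object[valueTokens.length];
            for (int i = 0; i < valueTokens.length; i++) {
                values[i] = resolveValue(valueTokens[i], index, structureName);
            }
            Constructor<?> ctor = Class.forName(className).getConstructor(types);
            return ctor.newInstance(values);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load structure " + structureName, e);
        }
    }

    public static void main(String[] args) {
        SketchIndex index = new SketchIndex();
        index.properties.put("max.term.length", "20"); // made-up value for illustration

        // Mirrors the "meta" row: types (Index, String), values (index, structureName).
        Object meta = load(SketchMetaIndex.class.getName(),
                new Class<?>[] {SketchIndex.class, String.class},
                new String[] {"index", "structureName"}, index, "meta");
        System.out.println(meta.getClass().getSimpleName());

        // Mirrors the "lexicon-keyfactory" row, whose single value is a property reference.
        System.out.println(resolveValue("${max.term.length}", index, "lexicon-keyfactory"));
    }
}
```

Values such as document or lexicon-inputstream in the table name other structures; this sketch does not attempt to resolve those.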

Variations are possible according to configuration during indexing:

  • DirectIndex -> BlockDirectIndex. However, all functionality is exposed by parent class BitPostingIndex.

  • InvertedIndex -> BlockInvertedIndex. However, all functionality is exposed by parent class BitPostingIndex.

  • BasicDocumentIndexEntry$Factory -> FieldDocumentIndexEntry$Factory

  • FSADocumentIndex -> FSAFieldDocumentIndex. NB: this change does not happen automatically, but it is necessary for efficient field retrieval.

  • BasicLexiconEntry$Factory -> FieldLexiconEntry$Factory

  • BasicIterablePosting -> FieldIterablePosting, BlockIterablePosting, or BlockFieldIterablePosting
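
Because the block variants expose their functionality through the parent class, client code can be written against the parent type and left unchanged whichever variant the index was built with. The sketch below illustrates only that design point. PostingIndexSketch, BasicVariant, BlockVariant and documentFrequency are hypothetical stand-ins, not Terrier classes or methods, and the postings are made up.

```java
import java.util.Arrays;
import java.util.List;

class PostingVariantSketch {
    /** Hypothetical parent type, playing the role the page gives to BitPostingIndex. */
    abstract static class PostingIndexSketch {
        abstract List<Integer> docIds(int termId);
    }

    /** Hypothetical default variant: document ids only. */
    static class BasicVariant extends PostingIndexSketch {
        @Override
        List<Integer> docIds(int termId) {
            return Arrays.asList(1, 4, 7); // made-up postings, the same for every term
        }
    }

    /** Hypothetical block variant: same parent API, plus positional data of its own. */
    static class BlockVariant extends BasicVariant {
        List<Integer> positions(int termId, int docId) {
            return Arrays.asList(3, 12); // made-up positions
        }
    }

    /** Client code typed against the parent works for either variant. */
    static int documentFrequency(PostingIndexSketch index, int termId) {
        return index.docIds(termId).size();
    }

    public static void main(String[] args) {
        System.out.println(documentFrequency(new BasicVariant(), 0));
        System.out.println(documentFrequency(new BlockVariant(), 0));
    }
}
```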

In addition to the reflective loading of index structures, the following global properties are used in an index:

  • num.Terms

  • num.Tokens

  • num.Documents

  • num.Pointers

  • max.term.length
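
As a rough illustration of how such global properties might be consumed, the sketch below parses them with java.util.Properties and derives the average document length from num.Tokens and num.Documents. Only the property names come from this page. The values, the class IndexPropertiesSketch, its helper methods, and the idea that the properties are stored as key=value text are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class IndexPropertiesSketch {
    /** Parse key=value text into a Properties object. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Read a required numeric property, failing loudly if it is absent. */
    static long getLong(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("missing property: " + key);
        }
        return Long.parseLong(value.trim());
    }

    /** Average document length, derived as num.Tokens / num.Documents. */
    static double averageDocumentLength(Properties props) {
        return (double) getLong(props, "num.Tokens") / getLong(props, "num.Documents");
    }

    public static void main(String[] args) {
        // Made-up values for illustration only.
        String text = String.join("\n",
                "num.Terms=5000",
                "num.Tokens=120000",
                "num.Documents=400",
                "num.Pointers=90000",
                "max.term.length=20");
        Properties props = parse(text);
        System.out.println("unique terms: " + getLong(props, "num.Terms"));
        System.out.println("average document length: " + averageDocumentLength(props));
    }
}
```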

Various index structures record their own configuration properties in the index.

last edited 2011-04-02 16:07:24 by CraigMacdonald