Xapian's Indexes

Xapian's creators go into little discussion of its index maintenance techniques. The inverted index is, aparently, implemented as a B-tree, with different posting lists broken into chunks, each chunk being indexed by its term and the the doc id of the first document of the chunk. How this impacts on performance, either in indexing or querying, I haven't been able to ascertain; nor does the approach seem to exist in the academic literature. Perhaps I have misunderstood their approach. A journey into the source code may be required to shed some more light on the issue.

The index itself, implemented as a backend known as Quartz, is described at * [WWW] http://www.xapian.org/docs/quartzdesign.html. A brief summary follows.


Quartz encodes all the index information as a series of tables: one for posting lists, user-defined data associated with each document etc. These tables consist of a set of key-tag pairs. Items may be accessed randomly by specifying a key and reading the item pointed to, or in sorted order by creating a cursor pointing to a particular item.

Tables are implemented as B-trees with The Btree class defining the standard interface to a table. Changes are made to the Btree by calling add() and del(), but they will not be seen by readers until commit() is called. Alternatively, calling cancel() will abandon changes. This allows atomic transactions to be implemented.

There are five tables comprising a quartz database. Record which the data associated with each document. Value, which stores a set of values for each document (statistics?). Termlist, the list of terms appearing in each document. Positionlist which for each valid document and term pair, stores the locations of the term within the document. Postlist, the posting lists for each term. Document ids are compressed using a byte-wise huffman encoding. Term names are compressed by storing differences between consecutive terms. A newer version uses interpolative coding to store the term positional information, (perhaps from Alistair Moffat, Binary Interpolative Coding for Effective Index Compression?).

Posting Lists and B-trees

Posting lists can grow to be very large. An entire posting list should not, therefore, be read into memory at once. posting lists are broken down into chunks, each the right size to be stored in a single B-tree block (a disk page, assumedly), and hence can be accessed with minimal disk latency.

The first chunk of the posting list has as its key the term whose posting list it is. Subsequents chunks are keyed by the first document id they contain as well, so sequential and random access of the posting list is possible.

The xapian b-tree implementation differs from the standard only in its revision scheme, which allows modifications to be applied atomically whilst other processes are reading the database. This scheme involves copying each block in the tree which is involved in a modification, rather than modifying it in place, so that a complete new tree structure is built up whilst the old structure is unmodified (although this new structure will typically share a large number of blocks with the old structure). The modifications can then be atomically applied by writing the new root block and making it active.

last edited 2006-06-19 10:06:19 by JohnKane