TerrierUKProject

I'll begin with the Moffat research paper. The idea is to include an internal index in each inverted list, and also to compress the lists with Golomb coding.
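To make the "internal index" idea concrete, here is a minimal sketch of a postings list with skip entries. It is not Terrier code: the class name SkippedPostingsSketch, the blockSize parameter and the plain int[] standing in for a compressed list are all assumptions made for illustration. A real implementation would presumably store bit offsets into the compressed list rather than array positions.

```java
// Hypothetical sketch, not Terrier code: an inverted list with an internal
// index (one skip entry per block), so a lookup only scans one block.
final class SkippedPostingsSketch {
    private final int[] docIds;     // sorted doc ids; stands in for the compressed list
    private final int[] skipDocIds; // first doc id of each block (the internal index)
    private final int blockSize;

    SkippedPostingsSketch(int[] sortedDocIds, int blockSize) {
        if (blockSize < 1) throw new IllegalArgumentException("blockSize must be >= 1");
        this.docIds = sortedDocIds.clone();
        this.blockSize = blockSize;
        int blocks = (docIds.length + blockSize - 1) / blockSize;
        this.skipDocIds = new int[blocks];
        for (int b = 0; b < blocks; b++) {
            skipDocIds[b] = docIds[b * blockSize];
        }
    }

    boolean contains(int target) {
        // Follow skip entries to the last block whose first doc id is <= target.
        int block = 0;
        while (block + 1 < skipDocIds.length && skipDocIds[block + 1] <= target) {
            block++;
        }
        // Only this block would need to be decoded and scanned.
        int start = block * blockSize;
        int end = Math.min(start + blockSize, docIds.length);
        for (int i = start; i < end; i++) {
            if (docIds[i] == target) return true;
            if (docIds[i] > target) return false;
        }
        return false;
    }
}
```

With doc ids {2, 5, 9, 14, 20, 31, 40} and a block size of 3, looking up 20 follows the skip entries to the block starting at 14 and scans only that block.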

I'll proceed in steps for this paper. It recommends two types of compression: gamma coding and Golomb coding.

Step 1 : I want to use gamma coding, which is already implemented in Terrier. I'll change the data stored in the inverted lists, which means modifying how this data is written and read. I'll modify 3 files for this: invertedIndex.java, invertedIndexBuilder.java and structureMerger.java.
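As a reminder of what gamma coding does to the numbers in a list, here is a small standalone sketch of the Elias gamma code. It writes bits as a String for readability and does not use Terrier's own gamma routines. The class name GammaSketch is hypothetical, and other presentations swap the roles of 0 and 1 in the unary prefix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Terrier code: Elias gamma coding on bit strings.
final class GammaSketch {
    // Encodes n >= 1 as (len - 1) zeros followed by the len-bit binary form of n.
    static String encode(int n) {
        if (n < 1) throw new IllegalArgumentException("gamma codes are defined for n >= 1");
        String binary = Integer.toBinaryString(n); // always starts with '1'
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < binary.length(); i++) {
            out.append('0');
        }
        return out.append(binary).toString();
    }

    // Decodes a concatenation of gamma codes back into the original numbers.
    static List<Integer> decodeAll(String bits) {
        List<Integer> values = new ArrayList<>();
        int pos = 0;
        while (pos < bits.length()) {
            int zeros = 0;
            while (bits.charAt(pos) == '0') { // the prefix gives the length
                zeros++;
                pos++;
            }
            values.add(Integer.parseInt(bits.substring(pos, pos + zeros + 1), 2));
            pos += zeros + 1;
        }
        return values;
    }
}
```

For example, doc ids 3, 7, 8 become the gaps 3, 4, 1, which encode as 011, 00100 and 1.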

Step 2 : I want to implement Golomb compression and use it. I'll need to modify the same files and also bitFile.java.
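Here is a rough sketch of Golomb coding, again on bit strings rather than on bitFile.java. The class GolombSketch and its method names are hypothetical. A number n >= 1 is split by the parameter b into a quotient, written in unary, and a remainder, written in truncated binary. The parameter b is usually chosen per list from the mean gap (a value near 0.69 times the mean gap is commonly quoted), but that choice is left out here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Terrier code: Golomb coding with parameter b.
final class GolombSketch {
    // Number of bits needed for a remainder in [0, b): ceil(log2(b)).
    private static int bitsFor(int b) {
        return 32 - Integer.numberOfLeadingZeros(b - 1);
    }

    static String encode(int n, int b) {
        if (n < 1 || b < 1 || b > (1 << 30)) throw new IllegalArgumentException("bad n or b");
        int q = (n - 1) / b;
        int r = (n - 1) % b;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < q; i++) out.append('1'); // quotient in unary
        out.append('0');                             // unary terminator
        int k = bitsFor(b);
        int u = (1 << k) - b;                        // count of short (k-1 bit) remainders
        if (r < u) appendBits(out, r, k - 1);
        else appendBits(out, r + u, k);
        return out.toString();
    }

    static List<Integer> decodeAll(String bits, int b) {
        int k = bitsFor(b);
        int u = (1 << k) - b;
        List<Integer> values = new ArrayList<>();
        int pos = 0;
        while (pos < bits.length()) {
            int q = 0;
            while (bits.charAt(pos) == '1') { q++; pos++; }
            pos++;                                   // skip the terminating '0'
            int r = 0;
            if (k > 0) {
                for (int i = 0; i < k - 1; i++) r = (r << 1) | (bits.charAt(pos++) - '0');
                if (r >= u) {                        // long remainder: one more bit
                    r = ((r << 1) | (bits.charAt(pos++) - '0')) - u;
                }
            }
            values.add(q * b + r + 1);
        }
        return values;
    }

    private static void appendBits(StringBuilder out, int value, int width) {
        for (int i = width - 1; i >= 0; i--) out.append((value >> i) & 1);
    }
}
```

With b = 3, the numbers 1 to 5 encode as 00, 010, 011, 100 and 1010.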

Step 3 : The paper deals with the cosine measure, so I'll probably implement it for matching.
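For reference, here is a minimal sketch of the cosine measure between a query and a document, each given as a map from term to weight. The class CosineSketch is hypothetical, and how the weights are computed (for example from term frequencies) is deliberately left open. This shows the generic formula, not the paper's exact weighting or Terrier's matching code.

```java
import java.util.Map;

// Hypothetical sketch, not Terrier code: cosine similarity of two weight vectors.
final class CosineSketch {
    static double cosine(Map<String, Double> query, Map<String, Double> doc) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : query.entrySet()) {
            Double w = doc.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w; // only shared terms contribute
            }
        }
        double nq = norm(query);
        double nd = norm(doc);
        // An empty or all-zero vector is treated as having similarity 0.
        return (nq == 0.0 || nd == 0.0) ? 0.0 : dot / (nq * nd);
    }

    private static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }
}
```

In an inverted-file setting the dot product would be accumulated term by term while reading each query term's list, and divided by a precomputed document norm at the end.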

For the other paper, I'll need to create another class, which is in charge of selecting documents so that the complete computation does not have to be done over the whole collection.

last edited 2008-04-10 08:43:14 by CraigMacdonald