Terrier/Disks4&5

TREC Disks 4 & 5

TREC Disks 4 & 5 are the main TREC ad hoc test collections that followed Disks 1 & 2.

Indexing properties:

#index only the tags listed below for these corpora; other tags are skipped
TrecDocTags.process=TEXT,H3,DOCTITLE,HEADLINE,TTL

We do not typically include the Congressional Record when indexing these disks. See Query performance prediction, B. He & I. Ounis, Information Systems 31(7), pp. 585-594, 2006. http://portal.acm.org/citation.cfm?id=1226381
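
One way to leave the Congressional Record out is to build the list of files to index without it. The Python sketch below walks a collection directory and writes one file path per line, which is the layout we assume for a file list such as etc/collection.spec. The directory name "cr" for the Congressional Record and the function names list_collection_files and write_collection_spec are assumptions made for illustration; check them against your own copy of the disks before relying on the output.

```python
import os


def list_collection_files(root, excluded_dirs=("cr",)):
    """Return sorted file paths under root, skipping excluded directory names.

    excluded_dirs is matched case-insensitively against directory names.
    The default "cr" is an assumed name for the Congressional Record folder.
    """
    excluded = {name.lower() for name in excluded_dirs}
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d.lower() not in excluded]
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def write_collection_spec(root, spec_path, excluded_dirs=("cr",)):
    """Write one file path per line to spec_path and return the number written."""
    files = list_collection_files(root, excluded_dirs)
    with open(spec_path, "w") as out:
        for path in files:
            out.write(path + "\n")
    return len(files)
```

For example, write_collection_spec("/path/to/disks45", "collection.spec") would list every file except those under a directory named "cr" (in any letter case). Both paths here are placeholders.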