The BLOGS08 test collection

Modified: 6th April 2008

BLOGS08 is a TREC test collection, created and distributed by the University of Glasgow.



Feeds:

Total number of feeds: 1,303,520
Average feeds collected every day: 38,557
Uncompressed Size: 808GB
Compressed Size: 193GB

Permalink Documents:

Total number of permalink documents: 28,488,766
Average documents every day: 73,048
Uncompressed Size: 1445GB
Compressed Size: 245GB

Homepage Documents:

Total number of homepages: 1,011,733
Uncompressed Size: 56GB
Compressed Size: 12GB


Total size of the collection is 453GB.
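
As a rough cross-check of the figures above, the following sketch adds up the per-component sizes and derives an approximate compression ratio for each. The dictionary layout and variable names are illustrative only and are not part of the collection or its tooling. The itemized compressed sizes sum to 450GB, slightly below the stated 453GB total; the source does not say what accounts for the difference.

```python
# Sizes in GB, copied from the statistics listed above.
# The structure and names here are hypothetical, chosen for illustration.
components = {
    "feeds": {"uncompressed": 808, "compressed": 193},
    "permalinks": {"uncompressed": 1445, "compressed": 245},
    "homepages": {"uncompressed": 56, "compressed": 12},
}

# Sum each column across the three components.
uncompressed_total = sum(c["uncompressed"] for c in components.values())
compressed_total = sum(c["compressed"] for c in components.values())

print(f"Uncompressed total: {uncompressed_total}GB")  # 2309GB
print(f"Compressed total: {compressed_total}GB")      # 450GB

# Approximate compression ratio per component (uncompressed / compressed).
for name, c in components.items():
    ratio = c["uncompressed"] / c["compressed"]
    print(f"{name}: about {ratio:.1f}x")
```

Anyone planning storage for the collection would need room for roughly 2.3TB if all three components are unpacked at once.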

Distribution information: