DOTGOV is a Web crawl of US government websites in the .gov domain, and was used by the TREC Web track from 2002 to 2004. It can be obtained from the University of Glasgow. The topics and qrels are available from the TREC website.

Indexing the DOTGOV collection is easy with Terrier. No properties in terrier.properties need to be changed from the defaults created by trec_setup.
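
As a rough illustration, indexing with the default settings might look like the sketch below. It assumes a Terrier release that ships the `bin/trec_setup.sh` and `bin/trec_terrier.sh` scripts; the installation path `/opt/terrier`, the collection path `/data/DOTGOV`, and the helper names `run` and `index_dotgov` are hypothetical and only serve the example. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
#!/bin/sh
# Assumed locations (hypothetical); adjust both to your own installation.
TERRIER_HOME="${TERRIER_HOME:-/opt/terrier}"
DOTGOV_DIR="${DOTGOV_DIR:-/data/DOTGOV}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: trec_setup.sh writes the default configuration for the collection.
# Step 2: trec_terrier.sh -i builds the index using that unmodified configuration.
index_dotgov() {
    run "$TERRIER_HOME/bin/trec_setup.sh" "$DOTGOV_DIR" &&
        run "$TERRIER_HOME/bin/trec_terrier.sh" -i
}
```

Calling `index_dotgov` after sourcing the script runs both steps in order, and the second step is skipped if the setup step fails. Script names and options differ between Terrier versions, so check them against the release you have installed.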