TREC (Text REtrieval Conference) is a yearly conference for evaluating large-scale InformationRetrieval systems and approaches. Over 90 universities and companies participate in TREC every year. It is organised by NIST, the U.S. National Institute of Standards and Technology. TREC also aims to speed up the transfer of technology from research labs into commercial products. The conference was first held in 1992.
TREC provides a set of text datasets, called test collections, against which an information retrieval system can be evaluated. TREC runs various tracks, each with its own associated test collections, as described at http://trec.nist.gov/data.html.
TREC Web Test Collections: the WT2G, WT10G, DOTGOV, and DOTGOV2 collections. If you are experimenting with InformationRetrieval systems in a Web context, or are interested in large-scale IR evaluation, these crawls are close to a necessity. Because TREC provides queries and relevance assessments for these collections, you can use them to tune and evaluate your system or approach.
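As a rough illustration of how the relevance assessments can be used, the sketch below scores a ranked result list against a set of judgments by computing mean average precision. It assumes the usual whitespace-separated TREC layouts: four columns for judgments (topic, iteration, document id, relevance) and six columns for a run (topic, Q0, document id, rank, score, run tag). The function names, topic numbers, and document ids are made up for the example and are not part of any TREC tool.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Map each topic id to the set of document ids judged relevant."""
    relevant = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic, _iteration, docno, judgment = parts
        if int(judgment) > 0:
            relevant[topic].add(docno)
    return relevant


def parse_run(lines):
    """Map each topic id to its document ids, ordered by descending score."""
    scored = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        topic, _q0, docno, _rank, score, _tag = parts
        scored[topic].append((float(score), docno))
    return {
        topic: [doc for _, doc in sorted(pairs, key=lambda p: -p[0])]
        for topic, pairs in scored.items()
    }


def average_precision(ranking, relevant_docs):
    """Average of the precision values at each rank where a relevant doc appears."""
    if not relevant_docs:
        return 0.0
    hits = 0
    total = 0.0
    for rank, docno in enumerate(ranking, start=1):
        if docno in relevant_docs:
            hits += 1
            total += hits / rank
    return total / len(relevant_docs)


def mean_average_precision(run, qrels):
    """Mean of average precision over topics with at least one relevant doc."""
    topics = [t for t in qrels if qrels[t]]
    if not topics:
        return 0.0
    return sum(average_precision(run.get(t, []), qrels[t]) for t in topics) / len(topics)


# Hypothetical judgments and run, only to show the expected layouts.
QRELS = """\
401 0 docA 1
401 0 docB 0
401 0 docC 1
402 0 docD 1
""".splitlines()

RUN = """\
401 Q0 docA 1 9.5 demo
401 Q0 docB 2 8.0 demo
401 Q0 docC 3 7.5 demo
402 Q0 docE 1 5.0 demo
402 Q0 docD 2 4.0 demo
""".splitlines()

if __name__ == "__main__":
    print(round(mean_average_precision(parse_run(RUN), parse_qrels(QRELS)), 4))
```

This is only meant to show the shape of the data. For results you intend to report, the trec_eval program distributed by NIST is the usual choice, since it implements the official measures.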