BM25

BM25 is one of the most established probabilistic term-weighting models. The development of the 2-Poisson model, designed by Robertson, Van Rijsbergen and Porter, gave rise to a family of term-weighting formulas called BMs (BM for Best Match). BM25 is the most successful member of this family, which was introduced in:

Robertson, S., and Walker, S. Some simple approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval. In ACM-SIGIR Conference on Research and Development in Information Retrieval (Dublin, June 1994), pp. 232–241.

Unlike Terrier's DivergenceFromRandomness models, the BM25 formula contains several free parameters (such as k1 and b) that need to be tuned or learned from relevance assessments.
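
To make the role of these parameters concrete, here is a minimal sketch of the commonly cited BM25 scoring formula in Python. It is not Terrier's implementation: the function name bm25_score, its argument names, and the default values k1=1.2, b=0.75 and k3=8.0 are assumptions chosen for illustration, and the inverse document frequency used is the Robertson/Sparck Jones weight log((N - n + 0.5) / (n + 0.5)).

```python
import math
from collections import Counter


def bm25_score(query_terms, doc_terms, doc_freqs, num_docs, avg_doc_len,
               k1=1.2, b=0.75, k3=8.0):
    """Score one document against a query with a textbook BM25 formula.

    query_terms, doc_terms: lists of already-tokenised terms.
    doc_freqs: dict mapping a term to the number of documents containing it.
    k1, b, k3: free parameters (illustrative defaults, not tuned values).
    """
    tf = Counter(doc_terms)        # within-document term frequencies
    qtf = Counter(query_terms)     # within-query term frequencies
    doc_len = len(doc_terms)

    # Document length normalisation: b = 0 disables it, b = 1 applies it fully.
    norm = k1 * ((1.0 - b) + b * doc_len / avg_doc_len)

    score = 0.0
    for term, q_count in qtf.items():
        f = tf.get(term, 0)
        n = doc_freqs.get(term, 0)
        if f == 0 or n == 0:
            continue  # term absent from the document or the collection

        # Robertson/Sparck Jones weight; negative when n > num_docs / 2.
        idf = math.log((num_docs - n + 0.5) / (n + 0.5))
        doc_part = f * (k1 + 1.0) / (f + norm)              # saturates as f grows
        query_part = q_count * (k3 + 1.0) / (q_count + k3)  # query-side saturation
        score += idf * doc_part * query_part
    return score
```

In this sketch, k1 controls how quickly the contribution of repeated occurrences of a term saturates, b controls how strongly long documents are penalised, and k3 plays the same saturating role for repeated query terms. These are the values that would typically be tuned against relevance assessments.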

Although BM25 was for many years considered one of the most effective matching functions for document retrieval, various recent TREC experiments show that a number of DivergenceFromRandomness models significantly outperform it in various retrieval settings.

In his PhD thesis, GianniAmati has shown that BM25 can actually be derived from the DivergenceFromRandomness I(n)L2 model.

Terrier provides two implementations of BM25: a standard one called BM25, and a DFR-based one called BM25-DFR.

last edited 2005-02-07 10:52:32 by IadhOunis