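Language modeling (LM) computes the probability of a query term within a document by smoothing the maximum likelihood estimate (MLE) of the within-document term frequency, tf/length(d), with the relative frequency of the term in the collection, TF/FreqTotColl. Here tf is the number of occurrences of the term in document d, TF is its number of occurrences in the whole collection, and FreqTotColl is the total number of tokens in the collection.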

Smoothing can be obtained in two ways: by mixing (linearly interpolating) these two probabilities, or by deriving the estimate from the compounding of the multinomial distribution with a Dirichlet prior.
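
The two smoothing options can be illustrated with a minimal Python sketch. It is not Terrier's code: the function names, the parameter names lam and mu, and their default values are assumptions made for illustration, and the convention that lam weights the collection probability is one of two conventions in use. The mixture is written in the form commonly called Jelinek-Mercer smoothing, and the Dirichlet variant uses the usual closed form (tf + mu * p_coll) / (length(d) + mu).

```python
def p_ml(tf, doc_len):
    """MLE of the within-document term frequency: tf / length(d)."""
    return tf / doc_len if doc_len > 0 else 0.0


def p_coll(TF, freq_tot_coll):
    """Relative term frequency within the collection: TF / FreqTotColl."""
    return TF / freq_tot_coll


def mixture_smoothing(tf, doc_len, TF, freq_tot_coll, lam=0.5):
    """Linear mixture of the two probabilities.

    lam is a hypothetical parameter name; here it is the weight
    given to the collection probability (an assumed convention).
    """
    return (1.0 - lam) * p_ml(tf, doc_len) + lam * p_coll(TF, freq_tot_coll)


def dirichlet_smoothing(tf, doc_len, TF, freq_tot_coll, mu=2000.0):
    """Estimate obtained with a Dirichlet prior on the multinomial.

    mu is a hypothetical parameter name; 2000 is only an
    illustrative default, not a value taken from Terrier.
    """
    return (tf + mu * p_coll(TF, freq_tot_coll)) / (doc_len + mu)


if __name__ == "__main__":
    # A term seen 3 times in a 100-token document and 1,000 times
    # in a collection of 1,000,000 tokens.
    print(mixture_smoothing(3, 100, 1000, 1_000_000))
    print(dirichlet_smoothing(3, 100, 1000, 1_000_000))
    # A term absent from the document still gets a non-zero probability.
    print(mixture_smoothing(0, 100, 1000, 1_000_000))
```

In both variants a term that does not occur in the document (tf = 0) keeps a non-zero probability, which is the purpose of smoothing. The Dirichlet form also behaves like a mixture whose weight depends on document length: with mu equal to length(d) it coincides with the linear mixture at lam = 0.5.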

Terrier provides an implementation of Ponte & Croft's approach, which is a different LM proposal for combining the two probabilities.

last edited 2010-03-03 17:19:23 by CraigMacdonald