The hypergeometric model treats the document as a sample drawn from a population, where the population is the collection. In our DivergenceFromRandomness terminology, this model is called DLH. The DLH model is a generalisation of the hypergeometric model in a binomial case.

This model is parameter-free and includes an implicit TermFrequencyNormalisation component.

The formula of the DLH model can be found in FormulasOfDFRModels.
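
As a rough illustration of what "parameter-free" means in practice, the following Python sketch scores one query term in one document using the form of DLH that is commonly quoted for this model. It is an assumption-laden sketch, not the reference definition: FormulasOfDFRModels remains the authoritative source, and the function name dlh_score and all argument names are hypothetical. Note that the only constant in the denominator (0.5) is fixed rather than tuned, and that the document length enters the score directly, which is where the implicit term frequency normalisation comes from.

```python
import math


def dlh_score(tf, doc_len, avg_doc_len, num_docs, coll_freq, key_freq=1.0):
    """Sketch of a DLH-style score for one term in one document.

    All names are hypothetical:
      tf          - frequency of the term in the document
      doc_len     - length of the document in tokens
      avg_doc_len - average document length in the collection
      num_docs    - number of documents in the collection
      coll_freq   - frequency of the term in the whole collection
      key_freq    - frequency of the term in the query
    """
    # The logarithms below are undefined when tf == doc_len or tf == 0,
    # so this sketch rejects those inputs instead of guessing a value.
    if not 0 < tf < doc_len:
        raise ValueError("this sketch requires 0 < tf < doc_len")

    f = tf / doc_len  # relative term frequency within the document

    # Divergence of the within-document frequency from the collection-wide
    # expectation; the document length appears inside the logarithm.
    divergence = tf * math.log2((tf * avg_doc_len / doc_len) * (num_docs / coll_freq))
    # Contribution of the tokens in the document that are not the term.
    remainder = (doc_len - tf) * math.log2(1.0 - f)
    # Correction term from Stirling's approximation of the factorials.
    stirling = 0.5 * math.log2(2.0 * math.pi * tf * (1.0 - f))

    # 0.5 is a fixed constant, not a tunable parameter.
    return key_freq * (divergence + remainder + stirling) / (tf + 0.5)


if __name__ == "__main__":
    # Same term frequency in a short and in a long document.
    print(dlh_score(tf=2, doc_len=100, avg_doc_len=100, num_docs=1000, coll_freq=10))
    print(dlh_score(tf=2, doc_len=200, avg_doc_len=100, num_docs=1000, coll_freq=10))
```

Under these assumptions, the same term frequency yields a lower score in the longer document, and no value had to be tuned to obtain that behaviour.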

Terrier version 1.0.2 includes an implementation of the DLH model, in the class DLH.

last edited 2007-02-17 12:30:49 by IadhOunis