A ResultSet object stores the results of a query. Each ResultSet object contains several arrays, with one entry per document:
document scores : the score assigned by the weighting model to each document
document ids : the internal document ids
document occurrences : a bitset for each document marking which query terms occur in that document (e.g. 7 = 111 in binary => all three query terms occur in this document). The bitset is stored in a 16-bit short integer, so term occurrences are recorded only for the first 16 query terms.
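The occurrence bitset can be illustrated with a minimal sketch. The class OccurrenceMask and its methods are hypothetical helpers written for this example and are not part of the Terrier API; the sketch also assumes that bit i corresponds to the i-th query term (counting from 0), which the text above does not state.

```java
// Hypothetical helper for illustration only; not a Terrier class.
// Assumption: bit i of the mask corresponds to the i-th query term (0-based).
final class OccurrenceMask {
    // A short has 16 bits, so only the first 16 query terms can be tracked.
    static final int MAX_TRACKED_TERMS = 16;

    /** Returns the mask with the bit for termIndex set; untracked terms leave it unchanged. */
    static short mark(short mask, int termIndex) {
        if (termIndex < 0 || termIndex >= MAX_TRACKED_TERMS) {
            return mask;
        }
        return (short) (mask | (1 << termIndex));
    }

    /** True if the bit for termIndex is set in the mask. */
    static boolean hasTerm(short mask, int termIndex) {
        if (termIndex < 0 || termIndex >= MAX_TRACKED_TERMS) {
            return false;
        }
        // Mask with 0xFFFF to undo sign extension when the short is widened to int.
        return ((mask & 0xFFFF) & (1 << termIndex)) != 0;
    }

    /** True if all of the first termCount (at most 16) query terms are marked. */
    static boolean hasAllTerms(short mask, int termCount) {
        int n = Math.max(0, Math.min(termCount, MAX_TRACKED_TERMS));
        int required = (1 << n) - 1; // e.g. n = 3 gives binary 111
        return ((mask & 0xFFFF) & required) == required;
    }
}
```

Under these assumptions, a document matching all three terms of a three-term query carries the mask 7 (binary 111), while a mask of 5 (binary 101) means the second term is missing.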
Documents are initially scored in a CollectionResultSet object, which contains an array entry for each document in the Collection. After the initial scoring, DocumentScoreModifiers (DSMs) may alter the scores of individual documents (e.g. boosting the score of a document, or removing documents that do not contain all required terms). After each DocumentScoreModifier is applied, the ResultSet is sorted.
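The scoring stage can be sketched roughly as follows. ScoredArrays is a hypothetical class invented for this example, not Terrier's CollectionResultSet, and the array types are assumptions. It allocates one slot per document in the collection, applies one illustrative modifier (zeroing the score of documents that lack a required term), and then sorts by descending score while keeping the three arrays aligned.

```java
import java.util.Arrays;

// Hypothetical stand-in for a collection-sized result set; not a Terrier class.
final class ScoredArrays {
    final int[] docids;
    final double[] scores;
    final short[] occurrences;

    /** Allocates one slot per document in the collection. */
    ScoredArrays(int numberOfDocuments) {
        docids = new int[numberOfDocuments];
        scores = new double[numberOfDocuments];
        occurrences = new short[numberOfDocuments];
        for (int i = 0; i < numberOfDocuments; i++) {
            docids[i] = i;
        }
    }

    /** Illustrative modifier: zero the score of documents missing any required term. */
    void removeDocumentsMissingTerms(short requiredMask) {
        for (int i = 0; i < scores.length; i++) {
            if ((occurrences[i] & requiredMask) != requiredMask) {
                scores[i] = 0.0d;
            }
        }
    }

    /** Sorts by descending score, applying the same permutation to all three arrays. */
    void sortByScoreDescending() {
        Integer[] order = new Integer[scores.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        final double[] s = scores.clone();
        final int[] d = docids.clone();
        final short[] o = occurrences.clone();
        // Arrays.sort on objects is stable, so tied documents keep their relative order.
        Arrays.sort(order, (a, b) -> Double.compare(s[b], s[a]));
        for (int i = 0; i < order.length; i++) {
            docids[i] = d[order[i]];
            scores[i] = s[order[i]];
            occurrences[i] = o[order[i]];
        }
    }
}
```

In this sketch a removed document is not deleted from the arrays; its score is set to zero so that it sinks to the end when the arrays are sorted.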
Once the DSMs have finished, the ResultSet is handed back to the manager, where it is cropped to a QueryResultSet containing only the documents with non-zero scores. The manager then applies the postprocesses and postfilters to modify the ResultSet. The contract here is that postprocesses and postfilters MUST NOT alter the scores; they may only specify whether a given document is IN or OUT of the ResultSet. The ResultSet will NOT be re-sorted. Once they have finished, a new ResultSet is created that omits the filtered-out documents.
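The cropping and filtering contract can be sketched as below. FilteredResults, crop and filter are hypothetical names chosen for this example and do not reflect the actual QueryResultSet or postfilter interfaces. The point of the sketch is that both steps only decide which entries stay: scores are copied unchanged, the existing order is preserved, and a new object is returned instead of modifying the old one.

```java
import java.util.function.IntPredicate;

// Hypothetical illustration of the crop-then-filter contract; not a Terrier class.
final class FilteredResults {
    final int[] docids;
    final double[] scores;

    FilteredResults(int[] docids, double[] scores) {
        this.docids = docids;
        this.scores = scores;
    }

    /** Keeps only the entries with a non-zero score, in their existing order. */
    static FilteredResults crop(int[] docids, double[] scores) {
        boolean[] in = new boolean[docids.length];
        for (int i = 0; i < docids.length; i++) {
            in[i] = scores[i] != 0.0d;
        }
        return select(docids, scores, in);
    }

    /** Keeps the documents accepted by the filter; scores and order are left untouched. */
    FilteredResults filter(IntPredicate keep) {
        boolean[] in = new boolean[docids.length];
        for (int i = 0; i < docids.length; i++) {
            in[i] = keep.test(docids[i]);
        }
        return select(docids, scores, in);
    }

    /** Copies the entries marked IN into new arrays, without re-sorting. */
    private static FilteredResults select(int[] docids, double[] scores, boolean[] in) {
        int kept = 0;
        for (boolean b : in) {
            if (b) {
                kept++;
            }
        }
        int[] newDocids = new int[kept];
        double[] newScores = new double[kept];
        int next = 0;
        for (int i = 0; i < docids.length; i++) {
            if (in[i]) {
                newDocids[next] = docids[i];
                newScores[next] = scores[i];
                next++;
            }
        }
        return new FilteredResults(newDocids, newScores);
    }
}
```

Because the input is already sorted and the copy preserves order, the output remains sorted without a second sort, which is consistent with the rule that the ResultSet is not re-sorted after filtering.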
See Also:
ResultSet Javadoc
CollectionResultSet Javadoc
QueryResultSet Javadoc