Terrier Querying Language
TODO for this page
Add
Phrasal querying
Proximity querying
The current state of play
1.0 Final
A query consists of one or more term_definitions, control_definitions and phrase_definitions, in any order.
Query Q = (term_definition | control_definition | phrase_definition) +
Each term_definition is a term, optionally prefixed by a boolean requirement: + (the term must appear) or - (the term must not appear).
term_definition = (+|-|)term
Each phrase_definition is a sequence of two or more terms enclosed in double quotes.
phrase_definition = " term term+ "
Each control_definition is either a known field name, optionally prefixed by a boolean requirement (+ or -), followed by a colon and a term, or a control name followed by a colon and a control value.
control_definition = (+|-|)field_name:term | control_name:control_value
Each term is a sequence of alphanumeric characters, followed by an optional boost value written as a caret (^) and an integer.
term = ([A-Za-z0-9]+)(\^[0-9]+)?
Each control name is a sequence of alphanumeric characters.
control_name = ([A-Za-z0-9]+)
Each control value is a sequence of any characters except a space.
control_value = ([^ ]+)
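The grammar above can be sketched as a small tokenizer and classifier. This is only an illustrative sketch, not Terrier's actual parser: the names KNOWN_FIELDS, Item and parse_query are hypothetical, the field names "title" and "body" are assumed for the example, and a name before a colon is treated as a control name whenever it is not a known field.

```python
import re
from typing import List, NamedTuple, Optional, Tuple, Union

# Hypothetical field names; the real set depends on how the index was built.
KNOWN_FIELDS = {"title", "body"}

TOKEN_RE = re.compile(r'"[^"]*"|\S+')  # a quoted phrase, or a run of non-spaces
TERM_RE = re.compile(r"([A-Za-z0-9]+)(?:\^([0-9]+))?")  # term with optional ^boost
NAME_RE = re.compile(r"[A-Za-z0-9]+")  # control name


class Item(NamedTuple):
    kind: str                     # "term", "phrase", "field" or "control"
    requirement: str              # "+", "-" or "" (none)
    name: Optional[str]           # field or control name, if any
    value: Union[str, List[str]]  # term, list of phrase terms, or control value
    boost: Optional[int]          # boost from term^N, if present


def _term(text: str) -> Tuple[str, Optional[int]]:
    """Split 'term' or 'term^N' into (term, boost)."""
    match = TERM_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid term: {text!r}")
    boost = int(match.group(2)) if match.group(2) else None
    return match.group(1), boost


def parse_query(query: str) -> List[Item]:
    """Classify each token of a query according to the grammar above."""
    items = []
    for token in TOKEN_RE.findall(query):
        if token.startswith('"'):
            # phrase_definition: two or more terms between double quotes
            terms = token.strip('"').split()
            if len(terms) < 2 or not all(TERM_RE.fullmatch(t) for t in terms):
                raise ValueError(f"not a valid phrase: {token!r}")
            items.append(Item("phrase", "", None, terms, None))
            continue
        requirement = token[0] if token[0] in "+-" else ""
        rest = token[len(requirement):]
        name, colon, value = rest.partition(":")
        if not colon:
            term, boost = _term(rest)
            items.append(Item("term", requirement, None, term, boost))
        elif name in KNOWN_FIELDS:
            term, boost = _term(value)
            items.append(Item("field", requirement, name, term, boost))
        elif NAME_RE.fullmatch(name) and value and not requirement:
            items.append(Item("control", "", name, value, None))
        else:
            raise ValueError(f"not a valid control definition: {token!r}")
    return items
```

For example, parse_query('t1 +t2^3 title:t4') would yield a plain term, a required term with boost 3, and a field item for title.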
Query Language Semantics
If a query term does not occur in combination with any phrase, field, or requirement construct, then it is not individually required to appear in the retrieved documents. On the other hand, if a query term appears in a phrase, is qualified by a field, or carries a + requirement, then it is required to exist in the retrieved documents; a term with a - requirement must not appear (in the named field, if one is given).
The query: t1 t2 t3 title:t4 should be interpreted as: (t1 OR t2 OR t3) AND title:t4. In other words, the query should retrieve documents that contain term t4 in the title and at least one of the terms t1, t2, t3.
The query: t1 t2 +t3 +field:t4 -field:t3 should be interpreted as: (t1 OR t2) AND t3 AND field:t4 AND (t3 does not appear in field). This last query should retrieve documents in which term t4 appears in the field, term t3 does not appear in the field and at least one of the terms t1, t2 appears.
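The interpretation used in the two examples above can be sketched as a document filter. This is a minimal sketch under stated assumptions, not Terrier's matching code: a document is modelled as a hypothetical mapping from field name to the set of terms in that field, an unqualified term counts as present if it occurs in any field, and the names Clause, contains and matches are invented for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Clause = Tuple[str, Optional[str], str]  # (requirement, field or None, term)
Document = Dict[str, Set[str]]           # field name -> terms in that field


def contains(doc: Document, term: str, field: Optional[str] = None) -> bool:
    """True if the term occurs in the given field, or anywhere if field is None."""
    if field is not None:
        return term in doc.get(field, set())
    return any(term in terms for terms in doc.values())


def matches(doc: Document, clauses: List[Clause]) -> bool:
    """Apply the semantics above: optional terms are OR-ed, the rest AND-ed."""
    optional_hits = []
    for requirement, field, term in clauses:
        present = contains(doc, term, field)
        if requirement == "-":
            if present:              # excluded term (or field:term) occurs
                return False
        elif requirement == "+" or field is not None:
            if not present:          # required term (or field:term) is missing
                return False
        else:
            optional_hits.append(present)
    # With no optional terms there is nothing to OR; otherwise one must hit.
    return any(optional_hits) if optional_hits else True


# The query: t1 t2 +t3 +field:t4 -field:t3
query = [("", None, "t1"), ("", None, "t2"), ("+", None, "t3"),
         ("+", "field", "t4"), ("-", "field", "t3")]
```

Under these assumptions, a document with t4 in the field and t1 and t3 elsewhere matches the query, while a document that also has t3 in the field does not.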
The query: t1 t2 field:(t2 t3) should be interpreted as: (t1 OR t2) AND field:t2 AND field:t3 = (t1 AND field:t2 AND field:t3) OR (field:t2 AND field:t3). The second form follows because a document that contains t2 in the field necessarily contains t2, so t2 AND field:t2 reduces to field:t2. In this case the requirement that term t2 should appear in the field overrides the unqualified occurrence of term t2 in the query.
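The equivalence in this last example can be checked by brute force over all truth assignments. The sketch below assumes only that a term occurring in the field also occurs in the document (field:t2 implies t2); the function names are hypothetical. It also shows that both forms reduce to field:t2 AND field:t3.

```python
from itertools import product


def original(t1: bool, t2: bool, f2: bool, f3: bool) -> bool:
    # (t1 OR t2) AND field:t2 AND field:t3
    return (t1 or t2) and f2 and f3


def expanded(t1: bool, f2: bool, f3: bool) -> bool:
    # (t1 AND field:t2 AND field:t3) OR (field:t2 AND field:t3)
    return (t1 and f2 and f3) or (f2 and f3)


def simplified(f2: bool, f3: bool) -> bool:
    # field:t2 AND field:t3
    return f2 and f3


def equivalent() -> bool:
    """Check all three forms agree whenever field:t2 implies t2."""
    for t1, t2_elsewhere, f2, f3 in product([False, True], repeat=4):
        t2 = f2 or t2_elsewhere  # t2 occurs if it is in the field or elsewhere
        if not (original(t1, t2, f2, f3) == expanded(t1, f2, f3) == simplified(f2, f3)):
            return False
    return True
```

Calling equivalent() enumerates the 16 possible assignments and returns True only if the three forms agree on every one of them.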
Terrier 1.1+
Arbitrary boolean expressions
Could have
Wildcards, fuzzy matching, etc.