A definition of Natural Language Processing (NLP) and interesting links on NLP are located at the following address:
NLP may be used in Information Retrieval in various ways.
Full NLP interprets and stores meaning at all stages and at all levels for both the query and the document. The choice of how much NLP to add, and where, is based on practical considerations, such as how computationally expensive the additional processing will be. Will it slow down the processing of queries unacceptably? How much overhead does it add to document processing? Are the retrieved results so much better that the trade-off is worth it?
Most IR systems use NLP only on the lower levels of understanding, and often for query interpretation alone. For instance, most Web search engines can automatically stem query words to match both singular and plural forms. Some, like Infoseek and Ask Jeeves, have added the ability to interpret some syntax by parsing the query sentences or phrases. They do not apply this technique to document storage.
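Query-side stemming of singular and plural forms can be sketched as follows. This is only a minimal illustration with a few naive English suffix rules; the function names stem_plural and normalize_query are hypothetical and are not taken from any of the search engines mentioned above, which would use far more careful rules.

```python
def stem_plural(word):
    """Reduce a simple English plural to a singular-like stem (naive rules)."""
    w = word.lower()
    # "queries" -> "query"
    if len(w) > 4 and w.endswith("ies"):
        return w[:-3] + "y"
    # "boxes" -> "box", "classes" -> "class"
    if len(w) > 3 and w.endswith(("ches", "shes", "sses", "xes", "zes")):
        return w[:-2]
    # "engines" -> "engine"; leave words such as "glass" or "analysis" alone
    if len(w) > 3 and w.endswith("s") and not w.endswith(("ss", "us", "is")):
        return w[:-1]
    return w


def normalize_query(query):
    """Split a query on whitespace and stem each term (query side only)."""
    return [stem_plural(term) for term in query.split()]


if __name__ == "__main__":
    print(normalize_query("search engines"))
    print(normalize_query("database queries"))
```

Because only the query is normalized here, a match still depends on which word forms appear in the index, which is one reason this kind of processing is cheap compared with processing every document.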
On the semantic level, systems that employ NLP methods are relatively rare, particularly those that apply them to both document processing and query processing. ConQuest, now part of Excalibur, incorporates an extensive "lexicon" or dictionary that is implemented as a semantic network. These stored meanings are used as a "knowledge base" so that synonyms can be retrieved even if they are not specifically requested. InQuery parses sentences, stems words, and recognizes proper nouns and concepts based on term co-occurrence. DR-LINK from MNIS performs full document and query processing on all levels of language understanding, including the discourse and pragmatic levels, although not to the extent of adding a full Cyc knowledge base.
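The idea of retrieving synonyms that were not specifically requested can be sketched with a toy lexicon. This is an assumption-laden illustration, not a description of how ConQuest or any other named system works: the LEXICON contents and the functions expand_query and search are hypothetical, and a real semantic network would hold many more relation types than plain synonymy.

```python
# Hypothetical toy lexicon: each term maps to a set of synonyms.
LEXICON = {
    "car": {"automobile", "auto"},
    "automobile": {"car", "auto"},
    "film": {"movie"},
    "movie": {"film"},
}


def expand_query(terms):
    """Return the query terms plus their synonyms, without duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term, *sorted(LEXICON.get(term, ()))]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def search(terms, documents):
    """Return ids of documents containing any expanded query term."""
    wanted = set(expand_query(terms))
    return [
        doc_id
        for doc_id, text in documents.items()
        if wanted & set(text.lower().split())
    ]


if __name__ == "__main__":
    docs = {1: "Used automobile for sale", 2: "Bicycle repair"}
    print(expand_query(["car"]))
    print(search(["car"], docs))
```

In this sketch the document mentioning "automobile" is found for the query "car" even though the query never used that word, which is the effect the lexicon is meant to provide.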
As computers increase their processing speeds and new approaches optimize the process of adding documents and matching queries, the cost questions raised above might become moot, making fuller use of NLP in Information Retrieval systems more practical.


