IR linguistic utilities
Language-related utilities for IR, such as stemmers, stop word lists, and morphological taggers.
- Porter.c
The well-known Porter stemming algorithm, written in C.
- Porter.p
The same algorithm, this time in Pascal.
- porter.bas
Dan Wade kindly sent in this Visual Basic version (thanks!)
- Porter.java
The same algorithm in Java.
- Stop word list
- Language Technology Group
A large number of utilities from the LTG at the University of Edinburgh, including part-of-speech taggers and text tokenisation utilities.
- Snowball
A language for writing stemming algorithms, developed by Martin Porter.
- Xerox Content Analysis group
A number of multilingual tools, such as morphological analysers and grammatical taggers.