
Contributing to Terrier

Terrier is an open source platform, and we are very keen to receive patches, extensions and improvements. Contributions are duly acknowledged; see http://terrier.org/docs/current/whats_new.html

Getting the Source

Terrier comes with the source code. You can download it from http://terrier.org/download/

Compiling updates to Terrier

Terrier uses Maven as its build system. You can compile Terrier using the package goal of Maven as follows:

mvn package

The package goal also runs the unit tests, so a successful build confirms that all of the tests still pass and that you haven't broken any existing functionality in Terrier.

Contributing to the Community

After making changes, you can create a patch against the original downloaded version using the Unix diff command. Generally, this takes the form of a command-line invocation, as follows:

diff -urN terrier-4.1-original/src terrier-4.1-updated/src > myissue.patch
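The invocation above can be tried out on a throwaway pair of directories before running it on a real checkout. The following is only a sketch under stated assumptions: the directory names mirror the example, while the files Example.java and Added.java are hypothetical stand-ins and are not part of Terrier. It also shows that diff exits with status 1 when the trees differ, which is expected and not an error.

```shell
#!/bin/sh
# Sketch: build a patch from two source trees with diff, then inspect it.
# Example.java and Added.java are hypothetical files used for illustration.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Stand-ins for the original download and your modified copy.
mkdir -p terrier-4.1-original/src terrier-4.1-updated/src
printf 'class Example { int n = 1; }\n' > terrier-4.1-original/src/Example.java
printf 'class Example { int n = 2; }\n' > terrier-4.1-updated/src/Example.java
# A file that exists only in the updated tree; -N includes it in the patch.
printf 'class Added {}\n' > terrier-4.1-updated/src/Added.java

# diff exits with 0 when the trees are identical, 1 when they differ,
# and 2 on error, so only a status above 1 is treated as a failure.
status=0
diff -urN terrier-4.1-original/src terrier-4.1-updated/src > myissue.patch || status=$?
if [ "$status" -gt 1 ]; then
    echo "diff failed" >&2
    exit "$status"
fi

# Count the files the patch touches before attaching it to an issue.
changed=$(grep -c '^+++ ' myissue.patch)
echo "files in patch: $changed"
```

Reading through the generated patch file before attaching it is a cheap way to spot unintended changes, such as build output or editor backup files that ended up under src.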

When you have a patch, you can create an issue on the issue tracker. First, you need to create an account on the Terrier issue tracker (this is a one-off procedure). Once you have logged in, you can create a new issue through the Create Issue functionality (top right). For example, if you are contributing a new feature for Terrier, choose New Feature as the type of issue, and attach your contributed code to the issue. We will then review it, and eventually integrate it into the next release.

Often, if you create the issue before you make the change (e.g. if you are reporting a bug rather than contributing a feature), we can discuss with you the best way to make the change to Terrier. When your patch is ready, attach it to the issue, where we will review it and probably ask for changes. In general, small incremental patches are much easier for us to integrate than large changes.

If you provide a patch that changes the normal indexing or retrieval path, we may ask you to provide empirical before-and-after tests of efficiency, effectiveness, or both. For example, if you change the default tokenisation, does this markedly change the time to index a standard TREC test collection, and does it negatively impact the MAP for a set of queries and corresponding relevance assessments? You can refer to some known performances of Terrier at http://terrier.org/docs/current/trec_examples.html.

last edited 2015-12-04 16:44:36 by CraigMacdonald