Diff for "TREC-BLOG/TREC2007"

CONTENTS

This wiki page provides all details about the TREC Blog track campaign run in 2007. It can be seen as the archive of the TREC 2007 Blog track. Details about the new TREC Blog track 2008 campaign can be found in TREC-BLOG

  1. TREC Blog Track 2007
  2. Opinion Retrieval Task
    1. Assessment
    2. Evaluation
    3. Polarity Subtask
    4. Submissions
  3. Blog Distillation (Feed Search) Task
    1. Operationality
    2. Topic Development Phase
    3. Topic Development System
    4. Evaluation
  4. Timeline
  5. History of Document
  6. Track Coordinators

TREC Blog Track 2007

In TREC 2006, we ran an opinion retrieval task. We ran the opinion-finding task again in 2007, with an added polarity subtask. Additionally, we proposed a new task, called the blog distillation (feed search) task.

The Blog06 test collection was used in TREC Blog Track 2007.

Opinion Retrieval Task

The opinion retrieval task involves locating blog posts that express an opinion about a given target. It can be summarised as "What do people think about <target>?". It is a subjective task. The target can be a "traditional" named entity -- a name of a person, location, or organization -- but also a concept (such as a type of technology), a product name, or an event. Note that the topic of the post does not necessarily have to be the target, but an opinion about the target must be present in the post or in one of the comments to the post.

For example, for the target "skype":

Excerpt from relevant, opinionated post (permalink [WWW] http://gigaom.com/2005/12/01/skype-20-eats-its-young/):

  • Skype 2.0 eats its young

    The elaborate press release and WSJ review while impressive don’t help mask the fact that, Skype is short on new ground breaking ideas. Personalization via avatars and ring-tones... big new idea? Not really. Phil Wolff over on Skype Journal puts it nicely when he writes, "If you’ve been using Skype, the Beta version of Skype 2.0 for Windows won’t give you a new Wow! experience." ...

Excerpt from unopinionated post (permalink [WWW] http://www.slashphone.com/115/3152.html):

  • Skype Launches Skype 2.0 Features Skype Video

    Skype released the beta version of Skype 2.0, the newest version of its software that allows anyone with an Internet connection to make free Internet calls. The software is designed for greater ease of use, integrated video calling, and ...

Assessment

We will use the same assessment procedure as defined in 2006. The retrieval unit is documents from the permalink component of the Blog06 test collection. The content of a blog post is defined as the content of the post itself and the contents of all comments to the post: if the relevant content is in a comment, then the permalink is declared to be relevant. Note that blogs and non-blogs will be treated equally in this task. Our objective is to run the opinion task again with 50 new topics, which we will again ask NIST to select from query logs provided by commercial blog search engines (see the TREC 2006 Blog track Overview paper for further details on the methodology).

The following scale will be used for the assessment:

 *[-1] i.e. Not judged.  The content of the post was not
    examined due to an offensive URL or header (such documents do exist
    in the collection due to spam).  Although the content itself was not assessed,
    it is very likely, given the offensive header, that the post is
    irrelevant.

 *[0] i.e. Not relevant.  The post and its comments were
    examined, and do not contain any information about the target,
    or refer to it only in passing.

 *[1] i.e. Relevant.  The post or its comments contain
    information about the target, but do not express an opinion
    towards it.  To be assessed as "Relevant", the information given
    about the target should be substantial enough to be included in a
    report compiled about this entity.

If the post or its comments are not only on target, but also contain an explicit expression of opinion or sentiment about the target, showing some personal attitude of the writer(s), then judge the document using the labels below.

 *[2] i.e. Relevant, negative opinions. The post contains an explicit expression of opinion or sentiment about the target, showing some personal attitude of the writer(s), and the opinion expressed is explicitly negative about, or against, the target.

 *[3] i.e. Relevant, mixed positive and negative opinions. Same as [2], but contains both positive and negative opinions.

 *[4] i.e. Relevant, positive opinion. Same as [2], but the opinion expressed is explicitly positive about, or supporting, the target.

Evaluation

The number of test targets will be 50. These will be selected by NIST from a larger commercial query log, using the methodology described in the TREC Blog track 2006 Overview paper.

Metrics will be precision/recall based; the most important metric will be MAP (mean average precision).
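
As a rough illustration of how MAP would be computed over a run (this is only a sketch; official scoring is done with trec_eval), the following Python snippet assumes a hypothetical run mapping each topic to a ranked list of docnos and a qrels mapping each topic to its set of relevant (opinionated) docnos:

# Illustrative sketch of average precision and MAP; official scoring uses trec_eval.
def average_precision(ranked_docs, relevant):
    # ranked_docs: docnos in rank order; relevant: set of relevant docnos.
    hits, precision_sum = 0, 0.0
    for rank, docno in enumerate(ranked_docs, start=1):
        if docno in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(run, qrels):
    # Average the per-topic AP values over all assessed topics.
    topics = sorted(qrels)
    return sum(average_precision(run.get(t, []), qrels[t]) for t in topics) / len(topics)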

Polarity Subtask

We propose to add a related, text-classification subtask, requiring participants to determine the polarity (or orientation) of the opinions in the retrieved documents, i.e. whether the opinions are positive or negative. For training, participants could use last year’s 50 queries, with their associated relevance judgements, available from [WWW] http://trec.nist.gov/data/blog06.html. Indeed, during the assessment procedure in 2006, for each document in the pool, the NIST assessors specified the polarity of the relevant documents: relevant positive opinion; relevant mixed positive and negative; relevant negative opinion.

Evaluation will probably use accuracy and the F measure.
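
As a rough sketch of such an evaluation (not an official script), accuracy and a per-label F measure could be computed as follows, where predicted and gold are hypothetical lists of polarity labels in {2, 3, 4}:

# Illustrative only: overall accuracy and per-label F measure for polarity labels.
def accuracy(predicted, gold):
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold) if gold else 0.0

def f_measure(predicted, gold, label):
    tp = sum(1 for p, g in zip(predicted, gold) if p == label and g == label)
    fp = sum(1 for p, g in zip(predicted, gold) if p == label and g != label)
    fn = sum(1 for p, g in zip(predicted, gold) if p != label and g == label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0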

Submissions

Opinion Task

For the opinion task, the usual trec_eval format will be used. The submission file should contain lines of the format

  topic Q0 docno rank sim runtag

where

  topic is the topic number
  Q0 is a literal "Q0"  (a historical field...)
  docno is the permalink document number (BLOG06-200.....-...)
  rank is the rank at which the system returned the document (1 .. n)
  sim is the system's similarity score
  runtag is the run's identifier string
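
For illustration only, the following Python sketch writes one topic's results in this format; the topic number, docnos, scores and run tag shown are made-up examples, not real collection identifiers:

# Sketch: write ranked results for one topic in the trec_eval run format above.
# All identifiers below are hypothetical examples.
def write_run_lines(out, topic, ranked, runtag):
    # ranked: list of (docno, score) pairs, best first.
    for rank, (docno, score) in enumerate(ranked, start=1):
        out.write(f"{topic} Q0 {docno} {rank} {score:.4f} {runtag}\n")

with open("myrun.opinion", "w") as out:
    write_run_lines(out, 901,
                    [("BLOG06-20060101-000-0000000", 12.7),
                     ("BLOG06-20060102-000-0000001", 11.3)],
                    "exampleRun1")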

You may submit up to six runs for the opinion finding task, which must include the following two compulsory runs:

  • An automatic run using the title field of the topics.

  • An automatic run, using the title field of the topics, with all opinion-finding features turned off (i.e. a topic-relevance run). Please use an appropriate id for this run.

Aside from the required runs, we wholeheartedly encourage the submission of manual runs, which are invaluable in improving the quality of the collection. (An automatic run is one that involves no human interaction. In contrast, a manual run is one where (for example) you formulate queries, search manually, give relevance feedback, and/or rerank documents by hand.)

Polarity Subtask

Additionally, if you are participating in the polarity subtask, you should provide a separate file for each run submitted to the opinion-finding task, detailing the predicted polarity of each retrieved document for each query. This file should include the same documents in the same order as the corresponding opinion-finding run, but with an additional polarity label.

Format:

  topic docno documentlabel

where

  topic is the topic number
  docno is the permalink document number (BLOG06-200.....-...)
  documentlabel is the system's prediction of polarity - one of: [2,3,4]

The documentlabel field states whether the document is predicted to be negatively opinionated (2), of mixed polarity (3), or positively opinionated (4).
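
As an illustrative sketch only, a polarity file could be derived from an opinion-finding run file as below; classify_polarity stands in for whatever polarity classifier a participant actually uses, and the file names are hypothetical:

# Sketch: derive a polarity submission from an opinion-finding run file,
# keeping the same documents in the same order. File names are examples.
def classify_polarity(docno):
    # Stand-in for a real classifier; hypothetically always predicts "positive".
    return 4

with open("myrun.opinion") as run, open("myrun.polarity", "w") as out:
    for line in run:
        topic, _q0, docno, _rank, _sim, _runtag = line.split()
        out.write(f"{topic} {docno} {classify_polarity(docno)}\n")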

Blog Distillation (Feed Search) Task

Blog search users often wish to identify blogs about a given topic, which they can subscribe to and read on a regular basis. This user task is most often manifested in two scenarios:

  • Filtering: The user subscribes to a repeating search in their RSS reader.

  • Distillation: The user searches for blogs with a recurring central interest, and then adds these to their RSS reader.

For TREC 2007, we are recommending that the TREC Blog track investigates the latter scenario – Blog Distillation. The Blog Distillation Task can be summarised as "Find me a blog with a principal, recurring interest in X". For a given area X, systems should suggest feeds that are principally devoted to X over the timespan of the feed, and that a user would be recommended to subscribe to as an interesting feed about X (i.e. a user may be interested in adding it to their RSS reader).

This task is particularly interesting for the following reasons:

  • A similar (yet different) task has been investigated in the Enterprise track (Expert Search) in a smaller setting of around 1000 candidate experts. For Blog Distillation, the Blogs06 corpus contains around 100k blogs, and a Web-like setting (with anchor text, linkage, spam, etc.).

  • A Topic Distillation task was run in the Web track. In Topic Distillation, a site was relevant if it (i) is principally devoted to the topic, (ii) provides credible information on the topic, and (iii) is not part of a larger site also principally devoted to the topic.

While the definition of Blog Distillation as explained above is different, the idea is to provide the users with the key blogs about a given topic. Note that point (iii) is not applicable in a blog setting.

Operationality

This year, NIST cannot provide resources for a blog distillation task. Following the TREC Enterprise track, we suggest that the topics and assessments are provided by the participating groups.

  • (24th June): Each participating group will initially provide 6 or 7 topics along with some relevant feeds.

  • (After submission): Relevant feeds will be pooled, and the groups which proposed topics will evaluate them.

Proposed assessment guidelines are available at TREC-BLOG/BlogDistillationAssessmentGuidelines.

Topic Development Phase

We need each participating group to create 6 or 7 topics for this task. Your aim is to identify some topics 'X', each with a few (e.g. 2 or 3) relevant feeds (identified by their feedno), and send them in an email to ian.soboroff (AT SYMBOL) nist.gov. PLEASE DO NOT POST THEM TO THE MAILING LIST.

Format:

<top>
<title> a short query-ish title </title>

<desc> Description:
The desc is a sentence-length description of what you are looking for, and should include the title words.
</desc>

<narr> Narrative:
The narr is a paragraph-length description of what you are looking for.  Use it to give details on what feeds or blogs are relevant and what feeds or blogs are not.  If there are "gray areas", state them here.
</narr>

<feeds>
feedno
feedno
feedno
</feeds>

<comments>
Anything else you want to say.
</comments>

</top>

Example:

<top>
<title> solaris </title>

<desc> Description:
Blogs describing experiences administering the Solaris operating system, or its new features or developments.
</desc>

<narr> Narrative:
Relevant blogs will post regularly about administering or using the Solaris operating system from Sun, or its latest features or developments. Blogs with posts about Solaris the movie are not relevant, nor are blogs which only have a few posts about Solaris.</narr>

<feeds>
BLOG06-feed-053948
BLOG06-feed-078402
BLOG06-feed-018020
</feeds>

<comments>
None.
</comments>

</top>
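
If it helps during topic development, blocks in this format can be read with a small script. The sketch below is only a rough parser (not an official tool); the input file name is hypothetical:

# Rough sketch of a parser for the <top> topic format shown above.
import re

def parse_topics(text):
    topics = []
    for block in re.findall(r"<top>(.*?)</top>", text, re.S):
        title = re.search(r"<title>(.*?)</title>", block, re.S)
        feeds = re.search(r"<feeds>(.*?)</feeds>", block, re.S)
        topics.append({
            "title": title.group(1).strip() if title else "",
            "feeds": feeds.group(1).split() if feeds else [],
        })
    return topics

with open("distillation_topics.txt") as f:  # hypothetical file name
    for t in parse_topics(f.read()):
        print(t["title"], t["feeds"])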

The topic development phase lasts strictly two weeks: all topics should be submitted to Ian by the end of Sunday 24th June.

Topic Development System

To help the participating groups in creating their blog distillation topics, we have provided a standard search system for *documents* on the Blog06 collection. It also displays the feed for each document, and moreover, you can view all the documents for a given feed. You can access it from: [WWW] http://ir.dcs.gla.ac.uk/terrier/search_blogs06/ [This was turned off after the release of the topics]

If you have your own search system for the Blogs06 collection (say, from last year's track), feel free to use that.

You don't need to state all the relevant feeds for a topic, as there will be a separate assessment phase in September, after all runs have been submitted.

Evaluation

Participants can submit up to 4 runs. Each run has feeds ranked by their likelihood of having a principal (recurring) interest in the topic. We suggest that up to 100 feeds are returned per topic. As usual, one automatic, title-only run is required.

As with the opinion-finding task runs, submitted runs to the blog distillation task will follow the usual trec_eval format, i.e.

  topic Q0 feedno rank sim runtag

where

  topic is the topic number
  Q0 is a literal "Q0"  (a historical field...)
  feedno is the feed number of the blog (BLOG06-feed-......)
  rank is the rank at which the system returned the document (1 .. n)
  sim is the system's similarity score
  runtag is the run's identifier string
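
Before submitting, a run file can be given a quick sanity check; the sketch below (not an official validator, and the file name is hypothetical) verifies the feedno prefix and the suggested limit of 100 feeds per topic:

# Sketch: simple sanity check of a blog distillation run file.
from collections import Counter

counts = Counter()
with open("myrun.distillation") as run:  # hypothetical file name
    for line in run:
        topic, _q0, feedno, _rank, _sim, _runtag = line.split()
        assert feedno.startswith("BLOG06-feed-"), f"unexpected feedno: {feedno}"
        counts[topic] += 1

for topic, n in counts.items():
    if n > 100:
        print(f"topic {topic} returns {n} feeds (more than the suggested 100)")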

Timeline

  • 9th June: Blog Distillation Task Topic Development Starts

  • 24th June: Blog Distillation Task Topic Development Ends

  • 25th June: Opinion Task topics posted

  • 30th June: Blog Distillation task Topics posted

  • 6th August: Opinion Task runs due

  • 17th August: Blog Distillation runs due

  • 3rd September: Blog Distillation Task - Participants Judging Phase starts

  • 20th September: Opinion finding results sent to participants

  • 30th September: Blog Distillation Task - Participants Judging Phase ends.

  • October: All Relevance judgements sent to participants

  • 6-9 November: TREC 2007

History of Document

  • February 24, 2008: added TREC Blog track 2007 campaign wiki page to the archives

  • June 8, 2007: timeline updated for topics posted, both tasks

  • May 29, 2007: timeline updated for opinion task

  • March 9, 2007: updated for tasks

  • February 5, 2007: first draft

Track Coordinators

Please cross-ref using TREC-BLOG
