TREC Entity Track // Searching for entities and properties of entities

Task proposal: Attribute identification

Given a list of entities (with names and homepages), return a list of key aspects.

Input

  • Names and homepages of the entities
  • Narrative (which attributes the user is and is not interested in)

Output

  • List of key aspects (as strings)

Example

<query>
    <entity_name>Nokia 6650</entity_name>
    <entity_URL>http://www.nokiausa.com/A41195639</entity_URL>
    <entity_name>iPhone</entity_name>
    <entity_URL>http://www.apple.com/iphone//</entity_URL>
    <entity_name>Blackberry Storm</entity_name>
    <entity_URL>http://na.blackberry.com/eng/devices/blackberrystorm/</entity_URL>
    <narrative>I want to buy a high-tech phone. I’m interested in their technical features.</narrative>
</query>
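
As a rough sketch of how a participant might read such a topic, the Python snippet below parses the example with the standard library. It assumes that each entity_name is paired, in order, with the entity_URL that follows it and that a topic carries a single narrative; the proposal shows only this one example, so the exact format is not confirmed. The function name parse_topic is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The example topic from this proposal, copied verbatim.
TOPIC_XML = """<query>
    <entity_name>Nokia 6650</entity_name>
    <entity_URL>http://www.nokiausa.com/A41195639</entity_URL>
    <entity_name>iPhone</entity_name>
    <entity_URL>http://www.apple.com/iphone//</entity_URL>
    <entity_name>Blackberry Storm</entity_name>
    <entity_URL>http://na.blackberry.com/eng/devices/blackberrystorm/</entity_URL>
    <narrative>I want to buy a high-tech phone. I’m interested in their technical features.</narrative>
</query>"""


def parse_topic(xml_text):
    """Return the (name, homepage) pairs and the narrative of one topic.

    Assumption: names and URLs appear in matching order, one URL per name.
    """
    root = ET.fromstring(xml_text)
    names = [el.text.strip() for el in root.findall("entity_name")]
    urls = [el.text.strip() for el in root.findall("entity_URL")]
    if len(names) != len(urls):
        raise ValueError("every entity_name needs exactly one entity_URL")
    narrative = root.findtext("narrative", default="").strip()
    return {"entities": list(zip(names, urls)), "narrative": narrative}


topic = parse_topic(TOPIC_XML)
for name, url in topic["entities"]:
    print(name, url)
print(topic["narrative"])
```

Pairing by position keeps the sketch short; a stricter reader would walk the children in document order and reject a name that is not followed by a URL.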

The “ideal output” (in the opinion of the topic creator) would contain

camera
display
memory
size and weight
battery life
bluetooth
gps
video on demand

and would certainly not contain

phone
cellphone
device
data
provider
software
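
To make the two lists concrete, here is a minimal Python sketch that checks a system's output against them. It is not a proposed official measure: exact, case-insensitive string matching is an assumption, and so are the function name check_run and the sample run (including the made-up attribute "touchscreen").

```python
# The topic creator's lists from the example above.
WANTED = {
    "camera", "display", "memory", "size and weight",
    "battery life", "bluetooth", "gps", "video on demand",
}
UNWANTED = {"phone", "cellphone", "device", "data", "provider", "software"}


def check_run(returned):
    """Compare returned attribute strings with the wanted and unwanted lists.

    Assumption: two strings match only if they are equal after lowercasing
    and trimming whitespace.
    """
    returned_set = {attribute.strip().lower() for attribute in returned}
    hits = returned_set & WANTED
    violations = returned_set & UNWANTED
    precision = len(hits) / len(returned_set) if returned_set else 0.0
    recall = len(hits) / len(WANTED)
    return {
        "hits": hits,
        "violations": violations,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical system output, for illustration only.
run = ["Camera", "display", "phone", "touchscreen"]
result = check_run(run)
print(sorted(result["hits"]), sorted(result["violations"]))
print(result["precision"], result["recall"])
```

Exact matching treats an unlisted but plausible attribute such as "touchscreen" as a miss, which is one reason the granularity and evaluation questions under Issues matter.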

Issues

  • What is a key aspect? For now the answer is “in the opinion of the assessors”, but the appropriate granularity remains unclear.
    • Possible solution: topic definition includes negative examples (examples of entity types we are not interested in).
    • Con: The concept of “key aspect” is still ill-defined. E.g., given a list of actors as an input, what about “award” vs “Oscar”?
  • How to evaluate? The pool of attributes can be huge, with many variations on the very same feature/aspect.
    • Possible solution: cluster attributes (requires a convenient assessment interface).
    • Con: The resulting judgments may be hard to reuse.
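
One crude way to reduce the variation mentioned above, before any real clustering or manual assessment, is to group attribute strings that differ only in case, punctuation, or spacing. The Python sketch below does just that; it is a stand-in for clustering, not a clustering method, and the names normalize and group_variants as well as the sample pool are assumptions made for illustration.

```python
import re
from collections import defaultdict


def normalize(attribute):
    """Lowercase, turn runs of non-alphanumerics into one space, and trim.

    Assumption: attributes are ASCII; other characters are treated as
    separators.
    """
    return re.sub(r"[^a-z0-9]+", " ", attribute.lower()).strip()


def group_variants(pool):
    """Group pooled attribute strings that share the same normalized form."""
    groups = defaultdict(list)
    for attribute in pool:
        groups[normalize(attribute)].append(attribute)
    return dict(groups)


# Hypothetical pool of attributes returned by several systems.
pool = ["Battery life", "battery-life", "GPS", "gps", "Bluetooth", "camera"]
groups = group_variants(pool)
for key, variants in groups.items():
    print(key, variants)
```

This only merges surface variants. Deciding whether "award" and "Oscar" belong together still needs an assessor or a semantic similarity measure.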

2 Responses to “Task proposal: Attribute identification”

  1. Michaelcn

    Are we allowed to use other resources that are not provided by the track?

  2. kbalog

    Yes (unless that resource means an even larger web crawl)
