Recognising Textual Entailment
- Description of This Year's Contest (quoted from the same):
- "The goal of the RTE Track is to develop systems that recognize when one piece of text entails another. [...] The 2008 RTE Track will include the 3-way classification [...] of `YES', `NO' and `UNKNOWN'."
- Current approach (as submitted in the application)
- We have pulled together several theoretically and procedurally similar broad-coverage systems into a tool called Sfy. The primary components are the Linguistic Knowledge Builder (LKB), the English Resource Grammar (ERG), and SNePS. LKB is a general-purpose constraint-based parser that uses ERG's language specifications to translate the input into a flat formal semantic representation. SNePS, a knowledge-representation and reasoning system, can compare these formal representations to test for various types of entailment. SNePS also provides a means of either integrating the new information with known information or acting to find unknown but pertinent information from known sources.
Outline of TODOs
- Parse XML files with dataset
- For each sentence ID
  - Add the T and H to the KB
  - Mark each entity
  - Link possible matches
  - Add certainty rating to links
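
The first two TODO items could start from something like the sketch below. It assumes the dataset follows the layout of earlier RTE releases (a root element containing `pair` elements with `id` and `entailment` attributes and `t`/`h` children); the 2008 files may differ. `SAMPLE`, `read_pairs`, and the plain-dictionary `kb` are hypothetical stand-ins for illustration, not part of Sfy or SNePS.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element and attribute names are an assumption
# based on earlier RTE releases, not taken from the 2008 files.
SAMPLE = """<entailment-corpus>
  <pair id="1" entailment="YES">
    <t>The cat sat on the mat in the kitchen.</t>
    <h>A cat was in the kitchen.</h>
  </pair>
  <pair id="2" entailment="UNKNOWN">
    <t>The dog barked at the mail carrier.</t>
    <h>The dog bit the mail carrier.</h>
  </pair>
</entailment-corpus>"""


def read_pairs(xml_text):
    """Return one dict per <pair>, holding its ID, gold label, T, and H."""
    root = ET.fromstring(xml_text)
    pairs = []
    for pair in root.iter("pair"):
        pairs.append({
            "id": pair.get("id"),
            "gold": pair.get("entailment"),  # None when the file carries no label
            "t": (pair.findtext("t") or "").strip(),
            "h": (pair.findtext("h") or "").strip(),
        })
    return pairs


# Placeholder for the real knowledge base: a dict keyed by pair ID.
# Entity marking, linking, and certainty ratings would happen per entry.
kb = {}
for pair in read_pairs(SAMPLE):
    kb[pair["id"]] = {"t": pair["t"], "h": pair["h"], "links": []}
```

To read a file from disk instead, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`.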
Tools
- Scoring (for 2008)
- Based on the success of the three-way scoring, the new formula uses two metrics with no regard for answer ranking: accuracy and the F-measure with β = 3. High precision is preferred over high accuracy.
- Scoring (for 2007)
- The competition had a crazy grading scale. Here's a simple Perl script to do the scoring (to the best of my knowledge). Oh yeah, you may want the gold standard of the trial set to test the script.
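
For the 2008 metrics mentioned above, a minimal sketch of accuracy and the F-measure is given here. It uses the standard textbook definition of F with a weight β, under which β > 1 weights recall more heavily than precision. How the track derives precision and recall from three-way answers is not stated here, so `f_beta` simply takes them as inputs; both function names are hypothetical, and this is not the official scorer.

```python
def accuracy(gold, predicted):
    """Fraction of answers that match the gold standard exactly."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)


def f_beta(precision, recall, beta=3.0):
    """Standard F-measure: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0  # both precision and recall are zero
    return (1 + b2) * precision * recall / denominator


gold = ["YES", "NO", "UNKNOWN", "YES"]
predicted = ["YES", "NO", "YES", "YES"]
print(accuracy(gold, predicted))  # 3 of 4 answers match
print(f_beta(0.5, 1.0))           # high recall, low precision
print(f_beta(1.0, 0.5))           # high precision, low recall
```

With β = 3, the second call scores higher than the third, which shows the extra weight this definition gives to recall.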