Speaker: Heike Zinsmeister
Title: Towards Gold Corpora for Abstract Anaphora Resolution
Abstract anaphora are anaphoric elements, such as "that" or "this issue", that refer to abstract referents such as facts or events. The antecedents of abstract anaphors are often realised as verbal or clausal categories, as in example (1), adapted from Byron (2002). This poses problems for the automatic resolution of the anaphoric relation.
(1) Each fall, penguins migrate to Fiji. [That]'s why I'm going there next month. [that -- namely the fact that each fall, penguins migrate to Fiji -- is why ...]
The resolution problem can be split into three subtasks: (i) deciding whether an anaphoric element refers to an abstract or a concrete referent, (ii) identifying the antecedent string, and (iii) inducing the abstract referent.
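As an illustration only, the three subtasks can be pictured as three fields of one annotation record, filled in here for example (1). This is a minimal sketch; the class name and field names are hypothetical and do not come from any annotation scheme discussed in the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnaphorAnnotation:
    """Hypothetical record holding one decision per subtask."""
    anaphor: str                      # the anaphoric element itself
    is_abstract: bool                 # subtask (i): abstract vs. concrete referent
    antecedent: Optional[str] = None  # subtask (ii): the antecedent string
    referent: Optional[str] = None    # subtask (iii): gloss of the abstract referent


# Example (1), annotated by hand for illustration.
example = AnaphorAnnotation(
    anaphor="That",
    is_abstract=True,
    antecedent="Each fall, penguins migrate to Fiji.",
    referent="the fact that each fall, penguins migrate to Fiji",
)
```

Keeping the antecedent string and the referent gloss as separate fields reflects the distinction between subtasks (ii) and (iii): the string is a span of text, while the referent is the fact or event it gives rise to.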
When creating a gold standard in this domain, it is easy for human annotators to agree on the first task. It is much harder to get reliable data for the other two tasks. I will present a survey of annotation projects and discuss how they approach this challenge.
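One common way to quantify agreement on a categorical decision such as the first task is Cohen's kappa, which corrects the observed agreement between two annotators for the agreement expected by chance. The sketch below is an assumption about how such a check might look, not a description of the measure used by any of the surveyed projects; the function name and the toy labels are invented for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented toy labels for subtask (i): abstract vs. concrete referent.
annotator_1 = ["abstract", "abstract", "concrete", "concrete"]
annotator_2 = ["abstract", "concrete", "concrete", "concrete"]
kappa = cohen_kappa(annotator_1, annotator_2)  # 0.5 for these toy labels
```

Kappa applies directly to the first task because its labels are categorical. Agreement on antecedent strings is harder to score this way, since two annotators may mark overlapping but non-identical spans.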
Furthermore, I will outline ongoing work on the cross-linguistic annotation of abstract anaphora in a parallel corpus of English and German. This work also addresses the question of how reliable translated texts are as a source for feature induction.