Speaker: Paul Cook, University of Melbourne
Title: Automatic identification of novel word-senses
Automatic lexical acquisition has been an active area of research in computational linguistics for over 20 years, but the automatic identification of lexical semantic change has only recently received attention. In this talk we first present a non-parametric Bayesian word-sense induction (WSI) method and its evaluation on several SemEval WSI tasks. We then apply this method to identify novel word-senses, that is, senses present in one corpus but not in another. One impediment to research on lexical semantic change has been a lack of appropriate evaluation resources; to address this, we present the largest corpus-based dataset of diachronic sense differences to date. In experiments on two different corpus pairs, we show that our method is able to simultaneously identify (a) types that have taken on a novel sense over time, and (b) the token instances of such novel senses. We further show that the performance of this method can be improved through the incorporation of social knowledge about the likely topics of new word-senses. Finally, we present a lexicographer's assessment of our method in the context of updating a dictionary.