Speaker: Shalom Lappin, King's College London
Title: Predicting Grammaticality Judgements with Enriched Language Models
I present recent experimental work on unsupervised language models trained on large corpora. We apply scoring functions to the probability distributions that the models generate for a corpus of test sentences. The functions discount the role of sentence length and word frequency, while highlighting other properties, in determining a grammaticality score for a sentence. The test sentences are annotated through Amazon Mechanical Turk crowdsourcing. Some of the models and scoring functions produce encouraging Pearson correlations with the mean human judgements. I also describe current work on other corpus domains, cross-domain training and testing, and grammaticality prediction in other languages. Our results provide experimental support for the view that syntactic knowledge is represented as a probabilistic system, rather than as a classical formal grammar.
Joint work with Jey Han Lau and Alexander Clark
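As an illustration of the kind of scoring function described above, the following is a minimal sketch of a SLOR-style (Syntactic Log-Odds Ratio) score, which normalises a model's sentence log probability by subtracting the unigram (word-frequency) log probability and dividing by sentence length. The function name and the numerical values are hypothetical, supplied here only for demonstration:

```python
def slor(sentence_logprob, unigram_logprobs, length):
    """SLOR-style grammaticality score: discount word frequency by
    subtracting the summed unigram log probabilities, and discount
    sentence length by dividing by the number of words."""
    return (sentence_logprob - sum(unigram_logprobs)) / length

# Toy example with made-up log probabilities for a 3-word sentence
lp = -12.0                 # model log P(sentence), hypothetical
uni = [-5.0, -4.0, -6.0]   # unigram log P of each word, hypothetical
print(slor(lp, uni, 3))    # (-12 - (-15)) / 3 = 1.0
```

A score like this can then be correlated (e.g. via a Pearson coefficient) against mean human judgements collected for the same sentences.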