Speaker: Eduard Hovy
University of Southern California
Title: "NLP: Its Past and 3½ Possible Futures"
Natural Language text and speech processing (Computational Linguistics) is just over 50 years old and is still continuously evolving, not only in its technical subject matter but also in the basic questions being asked and the style and methodology being adopted to answer them. As unification followed finite-state technology in the 1980s, statistical processing followed that in the 1990s, and large-scale processing is increasingly being adopted (especially for commercial NLP) in this decade, a new and quite interesting trend is emerging: a split of the field into three somewhat complementary and rather different directions, each with its own goals, evaluation paradigms, and methodology. The resource creators focus on language and the representations required for language processing; the learning researchers focus on algorithms to effect the transformations of representation required in NLP; and the large-scale hackers produce engines that win the NLP competitions. But where the latter two trends have fairly well-established methodologies for research and papers, the first does not, and consequently suffers in recognition and funding. In the talk, I describe each trend, provide some examples of the first, and conclude with a few general questions, including: Where is the heart of NLP? What is the nature of the theories developed in each stream (if any)? What kind of work should one choose to do if one is a grad student today?