SPEAKER: Tomas Mikolov
Brno University of Technology, Czech Republic
TITLE: "Language modeling with recurrent neural networks"
Statistical language models are an important part of almost any speech recognition system today. The most basic, but so far also the most successful, models are based on n-gram statistics. I will present a comparison of different language modeling techniques on several tasks; among these, neural network based language models perform best.

Next, useful extensions of the basic neural network model as well as different architectures will be discussed, such as the recurrent neural network architecture, classes in the output layer, and joint training with a maximum entropy model. I will then present results achieved with the novel RNNME model (a recurrent neural network trained jointly with a maximum entropy model) on a state-of-the-art setup from IBM for Broadcast News speech recognition (NIST RT04). Word error rate reductions over a large 4-gram model exceed 10%. Previously the best language model on this setup, the so-called "model M" from IBM (a regularized class-based maximum entropy model), provided about a 5% WER reduction over the 4-gram model.

Finally, I will talk about character-level and subword-level language modeling experiments with different models, including a recently proposed RNN model trained with a new Hessian-free optimizer. These models can assign a meaningful probability to any word, and can therefore be seen as a solution to the well-known problems with unlimited vocabularies (OOV problems). Moreover, they are significantly smaller than standard models.
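For intuition about the n-gram baseline, a toy bigram model with add-one smoothing might look like the following Python sketch. This is illustrative only, not code from the talk; the function names and the choice of Laplace smoothing are my own.

    from collections import defaultdict

    def train_bigram(tokens):
        # Count unigram (context) and bigram occurrences from a token stream.
        unigrams = defaultdict(int)
        bigrams = defaultdict(int)
        for prev, cur in zip(tokens, tokens[1:]):
            unigrams[prev] += 1
            bigrams[(prev, cur)] += 1
        return unigrams, bigrams

    def prob(unigrams, bigrams, prev, cur, vocab_size):
        # P(cur | prev) with add-one (Laplace) smoothing.
        return (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)

    tokens = "the cat sat on the mat".split()
    uni, bi = train_bigram(tokens)
    print(prob(uni, bi, "the", "cat", vocab_size=len(set(tokens))))

Real systems use higher-order n-grams with much stronger smoothing (e.g. modified Kneser-Ney), but the fixed context window is the same.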
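The recurrent models remove that fixed window: the hidden state summarizes the entire history. A minimal Elman-style recurrent language model in the spirit of the RNNLM, again only an illustrative sketch under my own assumptions (one-hot input, tanh hidden layer, full softmax output), could be:

    import numpy as np

    class TinyRNNLM:
        # Elman-style recurrent LM: the hidden state carries information
        # about the whole history, not just the last n-1 words.
        def __init__(self, vocab_size, hidden_size, seed=0):
            rng = np.random.default_rng(seed)
            self.U = rng.normal(0, 0.1, (hidden_size, vocab_size))   # input -> hidden
            self.W = rng.normal(0, 0.1, (hidden_size, hidden_size))  # hidden -> hidden
            self.V = rng.normal(0, 0.1, (vocab_size, hidden_size))   # hidden -> output
            self.h = np.zeros(hidden_size)

        def step(self, word_id):
            # Consume one word; return P(next word) over the vocabulary.
            x = np.zeros(self.U.shape[1])
            x[word_id] = 1.0                                  # one-hot input
            self.h = np.tanh(self.U @ x + self.W @ self.h)    # recurrence
            logits = self.V @ self.h
            e = np.exp(logits - logits.max())                 # numerically stable softmax
            return e / e.sum()

    lm = TinyRNNLM(vocab_size=10, hidden_size=8)
    p = lm.step(3)      # distribution over the next word after word id 3
    print(p.sum())      # ~1.0

The full softmax over the vocabulary is the computational bottleneck; the output-layer classes mentioned in the abstract factor P(w|h) into P(c|h) * P(w|c,h), which reduces the per-word output cost from |V| to roughly |C| + |V|/|C| computations.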
This talk presents joint work with Anoop Deoras, Ilya Sutskever, Stefan Kombrink and Hai Son Le.