Speaker: Alex Graves
Title: Sequence Transduction with Recurrent Neural Networks
Sequence transduction -- the transformation of input sequences into output sequences -- is a broad class of machine learning tasks, including speech and handwriting recognition, text-to-speech, machine translation and many more. In general, the length of the output sequence is unknown a priori, and may be longer or shorter than the input sequence. Recurrent neural networks are in principle capable of learning almost any sequence-to-sequence mapping, and therefore seem a promising approach to sequence transduction. However, the standard training methods require known targets for every point in the input sequence, which in turn predefines the length of the output sequence. This talk presents a way of combining a 'transcription' model of the input data with a 'prediction' model of the output data to determine a distribution over output sequences of all lengths, given a single input sequence. It then shows how both models can be implemented by recurrent neural networks to create a general-purpose sequence transducer.
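The combination of transcription and prediction models described above can be sketched numerically. The following is a minimal, hypothetical illustration (not the talk's actual implementation): random vectors stand in for the per-frame outputs of a trained transcription RNN and the per-prefix outputs of a trained prediction RNN, and a softmax over their sum yields a distribution over labels (with a 'blank' symbol at index 0) at every point of the input-length-by-output-length lattice. All shapes and the additive joint are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

T, U, K = 4, 3, 5  # input length, output length, label count (index 0 = 'blank')

# Stand-in for the transcription network: one K-vector per input frame t.
f = rng.normal(size=(T, K))
# Stand-in for the prediction network: one K-vector per output prefix length u (u = 0..U).
g = rng.normal(size=(U + 1, K))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Joint distribution over the K labels at every lattice point (t, u).
# Emitting 'blank' advances t; emitting a real label advances u, so paths
# through the lattice correspond to output sequences of any length.
joint = softmax(f[:, None, :] + g[None, :, :])  # shape (T, U + 1, K)

assert joint.shape == (T, U + 1, K)
assert np.allclose(joint.sum(axis=-1), 1.0)  # a proper distribution at each (t, u)
```

Because every lattice point carries a full distribution, summing over all blank/label paths gives the probability of a given output sequence, and the model covers output sequences both shorter and longer than the input.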