Professor, Department of Computer Science, University of Toronto
2010 Recipient of the Gerhard Herzberg Canada Gold Medal for Science and Engineering
Title: Does the brain do inverse graphics?
Recognizing a familiar shape from the pattern of light intensities on the retina is difficult because changes in viewpoint can dramatically change the pattern of light intensities. The general belief among both neuroscientists and neural network modelers is that the infero-temporal pathway copes with viewpoint variation by using multiple levels of representation in which each level is slightly more viewpoint-invariant than the level below. Unfortunately, this approach cannot explain how we can be acutely sensitive to the precise spatial relationships between high-level parts such as a nose and a mouth.
From an engineering perspective, the natural way to deal with spatial relationships is to associate a vector of pose parameters with each recognized part and to make use of the fact that spatial relationships can then be modeled very efficiently using linear operations. This is what is done in computer graphics and it is the reason why computer graphics can deal with changes in viewpoint so easily. Despite the long history of generative models of perception, computational neuroscientists have not taken this aspect of computer graphics seriously, possibly because they do not believe that the brain can do linear algebra. I shall show how a neural net can learn to extract parts with explicit pose parameters from an image and how this makes it very easy to recognize spatial configurations of parts under a very wide range of viewpoints. I shall then sketch a way in which the brain could use spike timing to implement the required linear algebra very efficiently.
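As a rough illustration of the graphics-style idea in the abstract (not the speaker's model), the sketch below represents each part's pose as a 3x3 homogeneous matrix and treats the part-to-whole relationship as a fixed matrix. Each part then "votes" for the pose of the whole face with a single matrix product, and the votes agree under any viewpoint only if the parts are in the right spatial configuration. The helper names (pose, face_votes, parts_agree) and the nose and mouth offsets are assumptions chosen for the example, and the code uses 2-D poses and hand-set matrices rather than anything learned.

```python
import numpy as np

def pose(angle, tx, ty, scale=1.0):
    """3x3 homogeneous matrix: 2-D rotation, uniform scale, then translation."""
    c, s = np.cos(angle) * scale, np.sin(angle) * scale
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])

# Hypothetical, hand-set part-to-whole relationships (face frame <- part frame).
FACE_FROM_NOSE = pose(0.0, 0.0, 0.0)    # nose at the origin of the face frame
FACE_FROM_MOUTH = pose(0.0, 0.0, -1.0)  # mouth one unit below the nose

def face_votes(image_from_nose, image_from_mouth):
    """Each part predicts the pose of the whole face.

    image_from_face = image_from_part @ inverse(face_from_part)
    """
    vote_nose = image_from_nose @ np.linalg.inv(FACE_FROM_NOSE)
    vote_mouth = image_from_mouth @ np.linalg.inv(FACE_FROM_MOUTH)
    return vote_nose, vote_mouth

def parts_agree(image_from_nose, image_from_mouth, tol=1e-6):
    """True when both parts vote for the same face pose."""
    vote_nose, vote_mouth = face_votes(image_from_nose, image_from_mouth)
    return bool(np.allclose(vote_nose, vote_mouth, atol=tol))

# An arbitrary viewpoint: rotate, scale, and shift the whole face.
view = pose(0.7, 3.0, -2.0, scale=1.5)
nose = view @ FACE_FROM_NOSE
mouth = view @ FACE_FROM_MOUTH
mouth_misplaced = view @ pose(0.0, 0.4, -1.0)  # mouth shifted sideways

print(parts_agree(nose, mouth))            # correct configuration
print(parts_agree(nose, mouth_misplaced))  # wrong configuration
```

Because the viewpoint enters only as a left-multiplied matrix, it cancels out of the comparison: the same test works for any rotation, scale, or translation of the face, while remaining sensitive to small displacements of one part relative to another.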
About the speaker:
Geoffrey Hinton designs machine learning algorithms. His aim is to discover a learning procedure that is efficient at finding complex structure in large, high-dimensional datasets and to show that this is how the brain learns to see. His contributions include back-propagation, Boltzmann machines, distributed representations, mixtures of experts, variational learning, and deep belief nets. His current research uses local “capsules” composed of a few hundred neurons that perform complicated internal computations on their inputs and encapsulate the results into a small vector of highly informative outputs that reveal the hidden structure in the data.