Progress in Neurobiology

Volume 51, Issue 2, February 1997, Pages 167-194

INVARIANT FACE AND OBJECT RECOGNITION IN THE VISUAL SYSTEM

https://doi.org/10.1016/S0301-0082(96)00054-8

Abstract

Neurophysiological evidence is described, showing that some neurons in the macaque temporal cortical visual areas have responses that are invariant with respect to the position, size and view of faces and objects, and that these neurons show rapid processing and rapid learning. A theory is then described of how such invariant representations may be produced in a hierarchically organized set of visual cortical areas with convergent connectivity. The theory proposes that neurons in these visual areas use a modified Hebb synaptic modification rule with a short-term memory trace to capture whatever can be captured at each stage that is invariant about objects as the object changes in retinal position, size, rotation and view. Simulations are then described which explore the operation of the architecture. The simulations show that such a processing system can build invariant representations of objects. © 1997 Elsevier Science Ltd. All Rights Reserved.

INTRODUCTION

This paper draws together evidence on how information about visual stimuli is represented in the temporal cortical visual areas, and on how representations that are invariant with respect to the position, size and even view of objects are formed. The evidence comes from neurophysiological studies of single neuron activity in primates. It also comes from closely related theoretical studies which consider how the representations may be set up by learning in a multistage cortical architecture. The

Visual Cortical Areas in the Temporal Lobes

Visual pathways project via a number of cortico-cortical stages from the primary visual cortex until they reach the temporal lobe visual cortical areas (Seltzer and Pandya, 1978; Maunsell and Newsome, 1987; Baizer et al., 1991). The inferior temporal visual cortex, area TE, is divided into a set of subareas, and in addition there is a set of different areas in the cortex in the superior temporal sulcus (Seltzer and Pandya, 1978; Baylis et al., 1987) (see Fig. 1). Of these latter areas, TPO

A NETWORK MODEL OF INVARIANT VISUAL OBJECT RECOGNITION

To test and clarify the hypotheses just described about how the visual system may operate to learn invariant object recognition, Wallis and Rolls developed a simulation which implements many of the ideas just described, and which is consistent with and based on much of the neurophysiology summarized above. The network simulated, VisNet, can perform object (including face) recognition in a biologically plausible way, and after training shows, for example, translation and view invariance (Wallis et al.,

COMPARISON OF DIFFERENT APPROACHES TO INVARIANT OBJECT RECOGNITION

The findings described in Section 3 show that the proposed trace learning mechanism and neural architecture can produce cells with responses selective for stimulus type with considerable position, view and size invariance. We now compare with other approaches the proposal, made here and by Rolls (1992b, 1994, 1995, 1996a) and investigated by simulation using VisNet, about how the visual cortical areas may solve the problem of forming invariant representations.

The trace rule is local
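
As an illustration only, a trace rule of the general kind described here can be sketched in a few lines of Python. The sketch is not the VisNet implementation: the single neuron, the one-hot "views", the parameter names `eta` and `alpha`, the weight normalization and the resetting of the trace between sequences are all assumptions made for the example. It shows that each weight update uses only the presynaptic input at that synapse and a short-term trace of the neuron's own activity, and that this trace lets the neuron extend its response to views that follow one it already responds to, which a plain Hebb rule (`eta = 0`) does not do in this toy setting.

```python
from math import sqrt


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def train(views, weights, eta, alpha=0.5, epochs=20):
    """Train one neuron on a repeated sequence of views of one object.

    eta   -- assumed trace parameter: 0 gives a plain Hebb rule,
             larger values give a longer short-term memory trace.
    alpha -- assumed learning rate.
    """
    w = list(weights)
    for _ in range(epochs):
        y_trace = 0.0  # assumption: trace is reset before each sequence
        for x in views:
            y = dot(w, x)  # current postsynaptic activation
            # Short-term memory trace of the postsynaptic activity.
            y_trace = (1.0 - eta) * y + eta * y_trace
            # Local update: synapse i sees only x[i] and the neuron's trace.
            w = [wi + alpha * y_trace * xi for wi, xi in zip(w, x)]
            # Assumed normalization to keep the weight vector bounded.
            norm = sqrt(dot(w, w))
            w = [wi / norm for wi in w]
    return w


# Hypothetical stimulus: one object seen at three successive positions.
views = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
initial = [1.0, 0.0, 0.0]  # neuron initially responds to position 0 only

w_hebb = train(views, initial, eta=0.0)   # no trace
w_trace = train(views, initial, eta=0.8)  # with trace

resp_hebb = [dot(w_hebb, x) for x in views]
resp_trace = [dot(w_trace, x) for x in views]
print("Hebb responses: ", resp_hebb)
print("Trace responses:", resp_trace)
```

With the trace switched off, activity at the second and third positions is zero, so those synapses never change; with the trace, residual activity from the first position is associated with the inputs that follow it in time.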

Acknowledgements

The authors have worked on some of the investigations described here with P. Azzopardi, G.C. Baylis, M. Booth, M. Elliffe, P. Foldiak, M. Hasselmo, C.M. Leonard, G. Littlewort, T.J. Milward, D.I. Perrett, M.J. Tovee and A. Treves, and their collaboration is sincerely acknowledged. The authors are grateful to Dr Peter Foldiak for help and advice in preparing this manuscript, and to Dr Roland Baddeley of the MRC Interdisciplinary Research Centre in Brain and Behaviour at Oxford, and Dr L. Abbott,

References (113)

  • Rolls, E.T. (1995) Learning mechanisms in the temporal lobe visual cortex. Behav. Brain Res.
  • Rolls, E.T. et al. (1985) Role of low and high spatial frequencies in the face-selective responses of neurons in the cortex in the superior temporal sulcus. Vision Res.
  • Rolls, E.T. et al. (1987) The responses of neurons in the cortex in the superior temporal sulcus of the monkey to band-pass spatial frequency filtered faces. Vision Res.
  • Seltzer, B. et al. (1978) Afferent cortical connections and architectonics of the superior temporal sulcus and surrounding cortex in the rhesus monkey. Brain Res.
  • Tarr, M.J. et al. (1989) Mental rotation and orientation-dependence in shape recognition. Cognit. Psychol.
  • Abbott, L.F. et al. (1996) Representational capacity of face coding in monkeys. Cerebral Cortex.
  • Baddeley, R.J., Wakeman, E., Booth, M., Rolls, E.T. and Abbott, L.F. (1997) The distribution of firing rates of primate...
  • Baizer, J.S. et al. (1991) Organization of visual inputs to the inferior temporal and posterior parietal cortex in macaques. J. Neurosci.
  • Ballard, D.H. (1990) Animate vision uses object-centred reference frames. In: Advanced Neural Computers, pp. 229–236....
  • Ballard, D.H. (1993) Subsymbolic modelling of hand-eye co-ordination. In: The Simulation of Human Intelligence, Ch. 3,...
  • Barlow, H.B. (1972) Single units and sensation: a neuron doctrine for perceptual psychology? Perception.
  • Barlow, H.B. (1985) Cerebral cortex as model builder. In: Models of the Visual Cortex, pp. 37–46. Eds D. Rose and V.G....
  • Barlow, H.B. et al. (1989) Finding minimum entropy codes. Neural Computat.
  • Baylis, G.C. et al. (1987) Functional subdivisions of temporal lobe neocortex. J. Neurosci.
  • Baylis, G.C. et al. (1987) Responses of neurons in the inferior temporal cortex in short term and serial recognition memory tasks. Expl Brain Res.
  • Bennett, A. (1990) Large competitive networks. Network.
  • Boussaoud, D. et al. (1991) Visual topography of area TEO in the macaque. J. Comp. Neurol.
  • Breitmeyer, B.G. (1980) Unmasking visual masking: a look at the “why” behind the veil of the “how”. Psychol. Rev.
  • Brown, T.H. et al. (1990) Hebbian synapses: biological mechanisms and algorithms. Ann. Rev. Neurosci.
  • Buhmann, J., Lades, M. and von der Malsburg, C. (1990) Size and distortion invariant object recognition by hierarchical...
  • Buhmann, J., Lange, J., von der Malsburg, C., Vorbrüggen, J.C. and Würtz, R.P. (1991) Object recognition in the dynamic...
  • Bülthoff, H. et al. (1992) Psychophysical support for a two-dimensional view interpolation theory of object recognition. Proc. natn. Acad. Sci. U.S.A.
  • Cavanagh, P. (1978) Size and location invariance in the visual system. Perception.
  • Chakravarty, I. (1979) A generalized line and junction labelling scheme with applications to scene analysis. IEEE...
  • Feldman, J.A. (1985) Four frames suffice: a provisional model of vision and space [see p. 279]. Behav. Brain Sci.
  • Foldiak, P. (1991) Learning invariance from transformation sequences. Neural Computat.
  • Fukushima, K. (1980) Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybernet.
  • Gross, C.G. (1973) Visual functions of the inferotemporal cortex. In: Handbook of Sensory Physiology, pp. 451–482....
  • Gross, C.G. et al. (1985) Inferior temporal cortex and pattern recognition. Expl Brain Res. Suppl.
  • Hasselmo, M.E. et al. (1989) Object-centered encoding by face-selective neurons in the cortex in the superior temporal sulcus of the monkey. Expl Brain Res.
  • Hawken, M.J. and Parker, A.J. (1987) Spatial properties of the monkey striate cortex. Proc. R. Soc. London [B] 231,...
  • Hertz, J., Krogh, A. and Palmer, R.G. (1991) Introduction to the Theory of Neural Computation. Addison-Wesley:...
  • Hinton, G.E. (1981) A parallel computation that assigns canonical object based frames of reference. In: Proceedings of...
  • Hummel, J.E. et al. (1992) Dynamic binding in a neural network for shape recognition. Psychol. Rev.
  • Humphreys, G.W. and Bruce, V. (1989) Visual Cognition. Erlbaum: Hove,...
  • Koenderink, J.J. et al. (1979) The internal representation of solid shape with respect to vision. Biol. Cybernet.
  • Kovacs, G. et al. (1995) Cortical correlate of pattern backward masking. Proc. natn. Acad. Sci. U.S.A.
  • Linsker, R. (1986) From basic network principles to neural architecture. Proc. natn. Acad. Sci. U.S.A., 83, 7508–7512,...
  • Marr, D. (1982) Vision. W.H. Freeman: San...
  • Maunsell, J.H.R. et al. (1987) Visual processing in monkey extrastriate cortex. Ann. Rev. Neurosci.
    *

    Present address: Max-Planck Institut für biologische Kybernetik, Spemannstrasse 38, 72076 Tübingen, Germany.