Review
Evidence from auditory and visual event-related potential (ERP) studies of deviance detection (MMN and vMMN) linking predictive coding theories and perceptual object representations

https://doi.org/10.1016/j.ijpsycho.2011.10.001

Abstract

Predictive coding theories posit that the perceptual system is structured as a hierarchically organized set of generative models with increasingly general models at higher levels. The difference between model predictions and the actual input (prediction error) drives model selection and adaptation processes minimizing the prediction error. Event-related brain potentials elicited by sensory deviance are thought to reflect the processing of prediction error at an intermediate level in the hierarchy. We review evidence from auditory and visual studies of deviance detection suggesting that the memory representations inferred from these studies meet the criteria set for perceptual object representations. Based on this evidence we then argue that these perceptual object representations are closely related to the generative models assumed by predictive coding theories.

Highlights

► A joint review of electric brain potentials of auditory and visual deviance detection is provided.
► Electric brain signals elicited by sensory deviance are suggested to reflect prediction errors.
► Generative models of sensory regularities are proposed to serve as perceptual object representations.
► Sensory deviance detection results are interpreted in terms of predictive coding theories.

Introduction

Helmholtz's (1860/1962) notion of unconscious inference engendered arguably the most fruitful line of perceptual research throughout the relatively short history of psychology, the empiricist tradition. In one of its contemporary variants, Gregory (1980) suggests that perception is akin to scientific hypotheses: it is the brain's best-fitting model for the information entering the senses. But together with Gordon (1997) we can ask how these models are formed, what evidence they are tested against, and how they adapt to an ever-changing environment. To answer these questions, some of the theories of predictive coding (Creutzig and Sprekeler, 2008, Dayan et al., 1995, Friston, 2005, Friston, 2010, Hohwy, 2007, Mumford, 1992, Rao and Ballard, 1999, Schütz-Bosbach and Prinz, 2007) invoke the principle of free-energy minimization (e.g., Friston, 2005, Friston, 2010).

Predictive coding theories suggest that the perceptual system's primary objective is to minimize the discrepancy between predictions from its internal generative models of the environment and the actual sensory input. Structured as a hierarchy of models of increasing levels of abstraction, predictions from each level are tested on data emerging one level lower, with the difference (termed the “error signal” or “prediction error”) being passed upwards in the hierarchy (for non-mathematical descriptions, see Baldeweg, 2007, Hohwy et al., 2008). The error signal then governs model selection/adjustment in order to minimize prediction error throughout the system. Thus predictive coding theories implement the analysis-by-synthesis principle (Neisser, 1967, Yuille and Kersten, 2006) and conform to the notion of gist-first processing suggested by some recent theories of perception (Ahissar and Hochstein, 2004, Bar, 2004, Bar, 2007), whereby higher-level (more general) models govern the interpretation (model selection) at lower levels. Predictive coding theories acknowledge the stochastic nature of the information entering the senses, a notion long advocated by an early theorist of perception, Egon Brunswik (1956). Dealing with probability distributions instead of discrete values, predictive coding theories assume that the brain follows Bayesian inference rules in model selection (Kersten et al., 2004, Knill and Pouget, 2004, Yuille and Kersten, 2006). Models based on hierarchical Bayesian inference using hierarchical generative models represent a recent development in the field (Friston and Kiebel, 2009, Lee and Mumford, 2003). In a hierarchical setting, the predictions from higher levels play the role of empirical priors on representations in lower levels. This resolves concerns about where priors come from and makes (empirical) priors accountable to sensory data.
Thus sensory data is used to update the evaluation (the probability of the correctness) of existing models. In the end, the model with the highest probability of being correct determines the (conscious) percept.
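The updating scheme described above can be illustrated with a minimal sketch. It assumes two hypothetical candidate models (`model_A`, `model_B`), each predicting a single stimulus feature with Gaussian uncertainty, and a handful of invented observations; none of these names or values come from the reviewed studies. Each observation yields a prediction error per model and re-weights the models by Bayes' rule, and the model with the highest posterior probability is taken as the "percept".

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def update_posteriors(priors, models, observation):
    """One Bayesian update: posterior is proportional to prior times likelihood."""
    unnormalized = {
        name: priors[name] * gaussian_pdf(observation, mean, sd)
        for name, (mean, sd) in models.items()
    }
    total = sum(unnormalized.values())
    return {name: value / total for name, value in unnormalized.items()}

# Hypothetical generative models: (predicted feature value, uncertainty).
models = {"model_A": (1000.0, 50.0), "model_B": (1200.0, 50.0)}

# Both models start out equally probable (an assumption of this sketch).
beliefs = {"model_A": 0.5, "model_B": 0.5}

# Invented sensory samples, e.g. tone frequencies in Hz.
observations = [1010.0, 990.0, 1030.0]

error_history = []
for obs in observations:
    # Prediction error of each model: observed value minus predicted value.
    error_history.append({name: obs - mean for name, (mean, _sd) in models.items()})
    # The same discrepancy, via the likelihood, re-weights the models.
    beliefs = update_posteriors(beliefs, models, obs)

# The model with the highest posterior probability determines the "percept".
percept = max(beliefs, key=beliefs.get)
print(percept, beliefs)
```

In this toy setting the model whose predictions produce the smaller errors accumulates almost all of the probability mass. The sketch covers only model selection at a single level; it does not implement a hierarchy or the adjustment of the models themselves.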

Thus, according to these theories, the general makeup of the afferent system is divided into 1) neuronal circuits implementing the generative models and setting up lower levels in the hierarchy and 2) circuits determining prediction errors and passing them on to higher levels (Friston, 2005). However, whereas the determination of prediction errors is quite clear, the make-up of the corresponding generative models is rather unspecified beyond the principles of Bayesian inference processing. A consequence of this imbalance of detail between the two assumed functional units of predictive coding theories is that most neuroscience evidence interpreted in favor of predictive coding comes from observing neuronal activity that shows effects expected of processing prediction errors. The major sources of such evidence are single-cell data and simulations (Grill-Spector et al., 2006, Hosoya et al., 2005, Jehee and Ballard, 2009, Lee and Mumford, 2003, Wang et al., 2006), local field potentials (Kumar et al., in press), and large-scale brain responses (Alink et al., 2010, Aoyama et al., 2005, den Ouden et al., 2010, Murray et al., 2002), each showing reduced activity for predicted as compared with unpredicted sensory input. There are also behavioral data compatible with what is expected from a system working on Bayesian principles (den Ouden et al., 2010, Ernst and Banks, 2002, Hohwy et al., 2008, Weiss et al., 2002, Yu, 2007). However, the representation and maintenance of the generative models have received less elaboration so far.

Psychological theories agree that the overall function of perception is to discover the sources of the information entering the senses, because knowledge about these objects and events can be utilized to reach survival and reproduction goals (e.g., Brunswik, 1956). Thus behavior is influenced by the distal objects and events. Even when behavior is apparently controlled by a single feature (e.g., we pick up a cherry by its color), the feature belongs to an object. Therefore, psychological theories have for a long time assumed the existence of brain representations for objects and suggested that incoming sensory information is stored and manipulated in such units in the brain. The question addressed here is how these representations relate to the multi-leveled generative models of predictive coding theories.

The representations inferred from studies measuring the mismatch negativity (MMN: Näätänen et al., 1978, for a recent review, see Näätänen et al., 2011) event-related brain potential (ERP) and its visual counterpart (vMMN: Tales et al., 1999, Heslenfeld, 2003, for recent reviews, see Czigler, 2007, Czigler, 2010, Kimura, 2012) may provide a useful link between these two views of perception. MMN and vMMN are elicited when the incoming stimulus violates some regular feature detected from the preceding sequence. MMN was discovered in the context of the auditory oddball paradigm. Occasionally exchanging a repetitive sound (termed the “standard”) for a different one (termed the “deviant”) elicited a fronto-centrally negative ERP response (MMN) peaking between 100 and 200 ms from the onset of the deviance (typically the sound onset). MMN was initially described as an ERP correlate of detecting a mismatch between the memory trace of the repeating sound and that of the incoming one (Näätänen et al., 1978). Research in the past thirty years has demonstrated that MMN is also elicited by violations of regularities which are more complex than stimulus repetition, including regularities in which each sound is specified by the immediately preceding one (Paavilainen et al., 2007, Horváth et al., 2001). This and similar evidence, as well as a detailed analysis of the alternative interpretations (see Winkler, 2007), led to the hypothesis that 1) memory representations of the detected regularities are generative models providing predictions about upcoming sensory events and 2) MMN is elicited when the current stimulus does not match these predictions (Baldeweg, 2006, Baldeweg, 2007, Garrido et al., 2009c, Näätänen et al., 2011, Sinkkonen, 1999, Winkler, 2007, Winkler et al., 1996; see also Bendixen et al., 2012-this issue).
Winkler and Czigler (1998) further argued that the function of the MMN signal is to update the regularity representations violated by the deviant stimulus (see also Winkler, 2007). Thus, in terms of predictive coding theories, MMN can be regarded as a signal carrying the prediction error (Garrido et al., 2009c).
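The logic of the oddball paradigm under this interpretation can be sketched in a few lines. The example below is only an illustration under stated assumptions, not a model of MMN generation proposed in the reviewed literature: a hypothetical learner tracks how often each stimulus type has occurred in the preceding sequence (with add-one smoothing), predicts the next stimulus from those counts, and the surprise (negative log predicted probability) stands in for the prediction error.

```python
import math

def surprise_per_stimulus(sequence):
    """Return the surprise (-log predicted probability) for each stimulus.

    The prediction for each stimulus is based only on the stimuli that
    preceded it; counts start at 1 (add-one smoothing), so that no stimulus
    type ever has a predicted probability of zero.
    """
    counts = {"standard": 1, "deviant": 1}
    surprises = []
    for stimulus in sequence:
        total = counts["standard"] + counts["deviant"]
        predicted_probability = counts[stimulus] / total
        surprises.append(-math.log(predicted_probability))
        # Update the regularity representation after the stimulus is seen.
        counts[stimulus] += 1
    return surprises

# A short, invented oddball sequence: nine standards followed by one deviant.
sequence = ["standard"] * 9 + ["deviant"]
surprises = surprise_per_stimulus(sequence)

for stimulus, value in zip(sequence, surprises):
    print(f"{stimulus:8s} surprise = {value:.3f}")
```

With this sequence, surprise shrinks as the standard repeats and jumps at the deviant, which mirrors the qualitative pattern of a small response to predicted input and a large response to a regularity violation. The count update after each stimulus loosely corresponds to the updating function attributed to the MMN signal, although only in this simplified, frequency-based sense.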

Based on the above interpretation of MMN, the memory representations reflected in the MMN ERP component may be compatible with the generative models assumed in predictive coding descriptions of perception. We previously suggested (Winkler, 2010, Winkler et al., 2009) that the representations inferred from MMN studies meet the criteria set for auditory object representations. Thus results obtained in studies of the auditory and visual MMN may provide a link between the predictive coding view of perception and the psychological literature of perceptual object representations.

Here we review results of studies measuring the auditory and visual MMN offering (indirect) evidence about the nature of perceptual object representations. The aims of the review are 1) to compare characteristics of the object representations in the two modalities and 2) to assess how well they fit into a generalized predictive coding account of perception.

Section snippets

Object representations and (v)MMN

Objects serve as perceptual units, as was first emphasized by Gestalt psychologists (Köhler, 1947) and they are also the units of attentional selection (e.g., Duncan, 1984). The first difference between the two (auditory and visual) modalities lies in what constitutes this unit of representation. That is, what is a perceptual object? Whereas in vision, object representations unequivocally refer to physical objects in the environment, in the auditory modality, two different perceptual units can

Similarities and differences between auditory and visual object representations

In the previous section, we showed that both auditory and visual memory representations, as inferred from studies of deviance detection, possess the characteristics expected of perceptual object representations. The evidence described above painted a picture of similar representations across the two modalities. Here we briefly review the possible differences between the representations in the two modalities.

The vast majority of vMMN studies adapted designs developed for auditory research. This

Are perceptual object representations compatible with the generative models postulated by predictive coding theories?

In terms of predictive coding theories, the organism's knowledge about the world is encoded in generative models. In hierarchical predictive coding models, the system comprises nested levels with error signals propagating upwards and predictions propagating downwards. This recurrent or reciprocal message passing among levels of the hierarchical model enables the model to be optimized or adjusted; thereby selecting the best explanation for the current sensory input. No level has special

Summary

We reviewed evidence suggesting that the memory representations involved in auditory and visual deviance detection meet the criteria set for perceptual object representations. We discussed the similarities and differences between these representations in the two modalities. Finally, we hypothesized that the memory representations involved in deviance detection are closely related to the generative models assumed by predictive coding theories.

Acknowledgments

This work was supported by the European Community's Seventh Framework Programme FP7 (Challenge 2 — Cognitive Systems, Interaction, Robotics) under grant agreement 231168-SCANDLE (to I.W.) and the Hungarian National Research Fund (OTKA) under grant agreement 71600 (to I.C.).

References (204)

  • K. Friston et al.

    Cortical circuits for perceptual inference

    Neur. Net.

    (2009)
  • M.I. Garrido et al.

    The functional anatomy of the MMN: a DCM study of the roving paradigm

    NeuroImage

    (2008)
  • M.I. Garrido et al.

    Repetition suppression and plasticity in the human brain

    NeuroImage

    (2009)
  • M.I. Garrido et al.

    The mismatch negativity: a review of underlying mechanisms

    Clinical Neurophysiology

    (2009)
  • K. Grill-Spector et al.

    Repetition and the brain: neural models of stimulus-specific effects

    Trends in Cognitive Sciences

    (2006)
  • S. Harnad

    The symbol grounding problem

    Physica D

    (1990)
  • J. Hohwy et al.

    Predictive coding explains binocular rivalry: an epistemological review

    Cognition

    (2008)
  • J. Horváth et al.

    Simultaneously active pre-attentive representations of local and global rules for sound sequences

    Cognitive Brain Research

    (2001)
  • J. Horváth et al.

    The temporal window of integration in elderly and young adults

    Neurobiology of Aging

    (2007)
  • J. Horváth et al.

    Do N1/MMN, P3a, and RON form a strongly coupled chain reflecting the three stages of auditory distraction?

    Biological Psychology

    (2008)
  • M. Huotilainen et al.

    Long-term memory traces facilitate short-term memory trace formation in audition in humans

    Neuroscience Letters

    (2001)
  • T. Jacobsen et al.

    Pre-attentive auditory processing of lexicality

    Brain and Language

    (2004)
  • D. Kahneman et al.

    The reviewing of object files: object-specific integration of information

    Cognitive Psychology

    (1992)
  • M. Kimura

    Visual mismatch negativity and unintentional temporal-context-based prediction in vision

    Int. J. Psychophysiol.

    (2012)
  • D. Knill et al.

    The Bayesian brain: the role of uncertainty in neural coding and computation

    Trends in Neurosciences

    (2004)
  • O.A. Korzyukov et al.

    Processing abstract auditory features in the human auditory cortex

    NeuroImage

    (2003)
  • S.A. Kotz et al.

    Cortical speech processing unplugged: a timely subcortico-cortical framework

    Trends in Cognitive Sciences

    (2010)
  • J. Kremlácek et al.

    Visual mismatch negativity elicited by magnocellular system activation

    Vision Research

    (2006)
  • M. Kubovy et al.

    Auditory and visual objects

    Cognition

    (2001)
  • O. Aaltonen et al.

    Perceptual magnet effect in the light of behavioral and psychophysiological data

    Journal of the Acoustical Society of America

    (1997)
  • Alain, C., Winkler, I., in press. Auditory scene analysis in the human brain: Evidence from neuroelectric recording. In...
  • A. Alink et al.

    Stimulus predictability reduces responses in primary visual cortex

    Journal of Neuroscience

    (2010)
  • L.A. Anderson et al.

    Stimulus-specific adaptation occurs in the auditory thalamus

    Journal of Neuroscience

    (2009)
  • P. Astikainen et al.

    Event-related potentials to task-irrelevant changes in facial expressions

    Behavioral and Brain Functions

    (2009)
  • T. Baldeweg

    ERP repetition effects and mismatch negativity generation: a predictive coding perspective

    Journal of Psychophysiology

    (2007)
  • M. Bar

    Visual objects in context

    Nature Reviews Neuroscience

    (2004)
  • A. Bendixen et al.

    Regularity extraction and application in dynamic auditory stimulus sequences

    Journal of Cognitive Neuroscience

    (2007)
  • S. Berti

    The attentional blink demonstrates automatic deviance processing in vision

    NeuroReport

    (2011)
  • A. Boemio et al.

    Hierarchical and asymmetric temporal sensitivity in human auditory cortices

    Nature Neuroscience

    (2005)
  • A.S. Bregman

    Auditory streaming: competition among alternative organizations

    Perception & Psychophysics

    (1978)
  • A.S. Bregman

    Auditory Scene Analysis: The Perceptual Organization of Sound

    (1990)
  • A.S. Bregman et al.

    Primary auditory stream segregation and perception of order in rapid sequences of tones

    Journal of Experimental Psychology

    (1971)
  • E. Brunswik

    Perception and the Representative Design of Psychological Experiments

    (1956)
  • A. Bubic et al.

    Prediction, cognition and the brain

    Front. Human Neurosci.

    (2010)
  • V. Carral et al.

    A kind of auditory ‘primitive intelligence’ already present at birth

    European Journal of Neuroscience

    (2005)
  • X. Chang et al.

    Dysfunction of processing task-irrelevant emotional faces in major depressive disorder patients revealed by expression-related visual MMN

    Neuroscience Letters

    (2010)
  • M. Cheour et al.

    Development of language specific phoneme representations in the infant brain

    Nature Neuroscience

    (1998)
  • M. Coltheart

    Sensory memory — a tutorial review

  • N. Cowan

    On short and long auditory stores

    Psychological Bulletin

    (1984)
  • N. Cowan et al.

    Short- and long-term prerequisites of the mismatch negativity in the auditory event related potential (ERP)

    Journal of Experimental Psychology: Learning, Memory, and Cognition

    (1993)
Contribution to the Special Issue titled “Predictive information processing in the brain: Principles, neural mechanisms and models” edited by J. Todd, E. Schröger, and I. Winkler.