Review
Modeling the auditory scene: predictive regularity representations and perceptual objects

https://doi.org/10.1016/j.tics.2009.09.003

Predictive processing of information is essential for goal-directed behavior. We offer an account of auditory perception suggesting that representations of predictable patterns, or ‘regularities’, extracted from the incoming sounds serve as auditory perceptual objects. The auditory system continuously searches for regularities within the acoustic signal. Primitive regularities may be encoded by neurons adapting their response to specific sounds. Such neurons have been observed in many parts of the auditory system. Representations of the detected regularities produce predictions of upcoming sounds as well as alternative solutions for parsing the composite input into coherent sequences potentially emitted by putative sound sources. Accuracy of the predictions can be utilized for selecting the most likely interpretation of the auditory input. Thus in our view, perception generates hypotheses about the causal structure of the world.

Section snippets

Prediction underlies adaptive behavior

Achieving one's goals in constantly changing environments requires actions directed at future states of the world. For example, when crossing a street, one has to anticipate the location of cars at the moment when one is likely to intersect their trajectories. Predicting future events is essential for everything we do, from taking into account the immediate sensory consequences of our own actions to signing up to a pension plan. The realization that we constantly interact with the future led to

Predictive representations in analyzing the auditory scene

Orderly perception of complex auditory scenes requires them to be broken down into internally coherent constituents. According to Bregman's theory [6] (see Box 1), auditory scene analysis (ASA) consists of two phases; the first phase is concerned with the formation of alternative sound organizations, while the second is concerned with selecting one of the alternatives to be perceived. Although perceptually it is difficult to separate these processes, the existence of the two phases was

Maintaining the representation of the auditory scene

Once possible object representations are formed, inconsistencies between them need to be resolved while preferably maintaining the continuity of perception. Figure 1 shows a conceptualization of ASA. First-phase grouping processes are represented on the left with simultaneous and sequential grouping processes separately marked (bottom left box). Sequential grouping is based on predictions produced by representations encoding the previously detected acoustic regularities (upper left box).

Neural bases for detecting change and deviance

Possible neural correlates of the processes reviewed in the previous sections may be found at various stations of the auditory system. The ‘core’ auditory pathway (Figure 2) seems to keep a high-fidelity representation of sounds at least up to the level of the primary auditory cortex, although contributions to the buildup of streaming could occur as early as the cochlear nucleus [21]. In the primary auditory cortex itself, a number of response features may already encode information

Predictive regularity representations as perceptual objects

We have argued that auditory regularity representations, supported by the stimulus-specific adaptation (SSA) mechanism observable in many parts of the auditory system, play an essential role in parsing complex auditory scenes. Here we examine whether regularity representations may form the core of auditory object representations. Recent theories of auditory object representation [34,35] emphasize the requirement of common characteristics for object representations across different modalities. So, what do we expect of perceptual

Auditory object representations and attention

The hypothesis that auditory object representations are representations of the regularities linking together sounds forming a coherent sequence allows us to reexamine the long-standing debate in psychology regarding whether object formation requires focused attention [61,62]. Within the present framework, we should ask whether forming regularity representations requires attention. Several studies suggest that deviations from auditory regularities are detected even when attention is not focused

Conclusions

We have argued that predictive representations of temporal regularities constitute the core of auditory objects in the brain. This notion of auditory object formation is compatible with recent accounts of perception in other modalities [3,70], with theories of motor control [74], and the interaction between motor control and perception [75]. Although there are several outstanding questions regarding the mechanisms underlying the proposed model (Box 3), it appears that predictive processing

Acknowledgements

Supported by the European Community's Seventh Framework Programme (grant no. 231168 – SCANDLE; I.W. and S.D.) and by a grant of the Israeli Science Foundation (ISF) to I.N.

Glossary

Auditory Scene Analysis (ASA)
The process of analyzing a complex mixture of sounds to isolate the information relating to different sound sources.
Auditory streaming
A perceptual phenomenon in which a sequence of sounds is perceived as consisting of two or more auditory streams. When streaming occurs, perceivers experience difficulty in extracting inter-sound relationships across streams, such as the order between two sounds belonging to different streams.
Build-up of auditory streams
The perception

References (83)

  • R. Näätänen

    The mismatch negativity (MMN) in basic research of central auditory processing: A review

    Clin. Neurophysiol.

    (2007)
  • R. Takegata

    Pre-attentive representation of feature conjunctions for simultaneous, spatially distributed auditory objects

    Brain Res. Cogn. Brain Res.

    (2005)
  • O.A. Korzyukov

    Processing abstract auditory features in the human auditory cortex

    NeuroImage

    (2003)
  • R. Näätänen

    “Primitive intelligence” in the auditory cortex

    Trends Neurosci.

    (2001)
  • S. Pakarinen

    Measurement of extensive auditory discrimination profiles using mismatch negativity (MMN) of the auditory event-related potential

    Clin. Neurophysiol.

    (2007)
  • T. Baldeweg

    Repetition effects to sounds: Evidence for predictive coding in the auditory system

    Trends Cogn. Sci.

    (2006)
  • A. Bendixen

    Rapid extraction of auditory feature contingencies

    NeuroImage

    (2008)
  • K. Akatsuka

    Objective examination for two-point stimulation using a somatosensory oddball paradigm: an MEG study

    Clin. Neurophysiol.

    (2007)
  • C. Summerfield et al.

    Expectation (and attention) in visual cognition

    Trends Cogn. Sci.

    (2009)
  • M. Kawato

    Internal models for motor control and trajectory planning

    Curr. Opin. Neurobiol.

    (1999)
  • R.P. Carlyon

    How the brain separates sounds

    Trends Cogn. Sci.

    (2004)
  • T. Kujala

    The mismatch negativity in cognitive and clinical neuroscience: theoretical and methodological considerations

    Biol. Psychol.

    (2007)
  • K. Friston et al.

    Cortical circuits for perceptual inference

    Neural Networks

    (2009)
  • R.L. Gregory

    Perceptions as hypotheses

    Philos. Trans. R. Soc. Lond. B Biol. Sci.

    (1980)
  • M. Bar

    Visual objects in context

    Nat. Rev. Neurosci.

    (2004)
  • K. Friston

    A theory of cortical responses

    Philos. Trans. R. Soc. Lond. B Biol. Sci.

    (2005)
  • A.S. Bregman

    Auditory Scene Analysis

    (1990)
  • J.S. Snyder

    Effects of attention on neuroelectric correlates of auditory stream segregation

    J. Cogn. Neurosci.

    (2006)
  • C. Alain

    Neural activity associated with distinguishing concurrent auditory objects

    J. Acoust. Soc. Am.

    (2002)
  • W. Köhler

    Gestalt Psychology

    (1947)
  • I. Winkler

    Interpreting the mismatch negativity (MMN)

    J. Psychophysiol.

    (2007)
  • A. Bendixen

    I heard that coming: ERP evidence for stimulus driven prediction in the auditory system

    J. Neurosci.

    (2009)
  • T. Rahne et al.

    Neural representations of auditory input accommodate to the context in a dynamically changing acoustic environment

    Eur. J. Neurosci.

    (2009)
  • R. Näätänen

    Development of a memory trace for a complex sound in the human brain

    NeuroReport

    (1993)
  • I. Winkler

    Brain responses reveal the learning of foreign language phonemes

    Psychophysiol.

    (1999)
  • B.C.J. Moore et al.

    Factors influencing sequential stream segregation

    Acta Acust - Acust.

    (2002)
  • Y.I. Fishman

    Auditory stream segregation in monkey auditory cortex: effects of frequency separation, presentation rate, and tone duration

    J. Acoust. Soc. Am.

    (2004)
  • J.S. Snyder et al.

    Toward a neurophysiological theory of auditory stream segregation

    Psychol. Bull.

    (2007)
  • R. Cusack

    Effects of location, frequency region, and time course of selective attention on auditory scene analysis

    J. Exp. Psychol. Hum. Percept. Perform.

    (2004)
  • R. Näätänen et al.

    The N1 wave of the human electric and magnetic response to sound: A review and an analysis of the component structure

    Psychophysiol.

    (1987)
  • A. Fishbach

    Auditory edge detection: a neural model for physiological and psychoacoustical responses to amplitude transients

    J. Neurophysiol.

    (2001)