NeuroImage

Volume 106, 1 February 2015, Pages 222-237
A computational analysis of the neural bases of Bayesian inference

https://doi.org/10.1016/j.neuroimage.2014.11.007

Highlights

  • We apply Bayesian model comparison to neural correlates of Bayesian inference.

  • We analyze single-trial EEG dynamics during performance on an urn–ball task.

  • We find dissociable anterior (P3a) and posterior (P3b, Slow Wave) correlates.

  • Our data are consistent with nonlinear probability weighting.

  • Our results are consistent with the Bayesian brain hypothesis.

Abstract

Empirical support for the Bayesian brain hypothesis, although of major theoretical importance for cognitive neuroscience, is surprisingly scarce. This hypothesis posits simply that neural activities code and compute Bayesian probabilities. Here, we introduce an urn–ball paradigm to relate event-related potentials (ERPs) such as the P300 wave to Bayesian inference. Bayesian model comparison is conducted to compare various models in terms of their ability to explain trial-by-trial variation in ERP responses at different points in time and over different regions of the scalp. Specifically, we are interested in dissociating specific ERP responses in terms of Bayesian updating and predictive surprise. Bayesian updating refers to changes in probability distributions given new observations, while predictive surprise equals the surprise about observations under current probability distributions. Components of the late positive complex (P3a, P3b, Slow Wave) provide dissociable measures of Bayesian updating and predictive surprise. Specifically, the updating of beliefs about hidden states yields the best fit for the anteriorly distributed P3a, whereas the updating of predictions of observations accounts best for the posteriorly distributed Slow Wave. In addition, parietally distributed P3b responses are best fit by predictive surprise. These results indicate that the three components of the late positive complex reflect distinct neural computations. As such they are consistent with the Bayesian brain hypothesis, but these neural computations seem to be subject to nonlinear probability weighting. We integrate these findings with the free-energy principle that instantiates the Bayesian brain hypothesis.

Introduction

How can the brain make reliable and valid inferences about the external world based on variable sensory information? Bayesian decision theory offers a useful theoretical framework for explaining this inference process (Jaynes, 2003; Robert, 2007). Bayesian inference constantly updates prior beliefs to posterior beliefs in light of observed data according to probability rules (Bayes' theorem; Baldi and Itti, 2010). It is therefore hardly surprising that the brain has been hypothesized to code and compute Bayesian probabilities (Knill and Pouget, 2004; Friston, 2005; Doya et al., 2007; Gold and Shadlen, 2007; Kopp, 2008). While earlier research provided results that are consistent with the Bayesian brain hypothesis (Hampton et al., 2006; Ostwald et al., 2012; Vilares et al., 2012; Lieder et al., 2013), the field has not reached an agreed-upon conclusion about the utility of the Bayesian brain hypothesis as a theoretical framework for explaining cognitive functions of the brain (Clark, 2013).

To test the Bayesian brain hypothesis, we tried to explain event-related brain potentials (ERPs) during successive trials in an urn–ball task (Phillips and Edwards, 1966) in terms of underlying probability distributions. Specifically, we were interested in dissociating temporally and regionally specific cortical responses in terms of Bayesian updating and predictive surprise. To introduce experimental variance in terms of surprise-related responses, we manipulated the task's probabilistic contingencies at two levels. First, we introduced uncertainty about the sort of urn containing balls (by sampling urns from two distributions). Second, we manipulated the proportion of ball colors within each urn. We will refer to these probabilistic contingencies as prior probabilities and likelihoods, respectively. Note that the subject inferred the nature of the urn based on small samples of balls that were drawn from the sampled urn.

Cognitive processes related to Bayesian updating can be split into two sub-processes: (1) Bayesian surprise represents the change in beliefs about hidden states given current observations. Bayesian surprise is computationally expressed as the divergence between the prior probability distribution over states and the posterior probability distribution over states given current observations; we henceforth label probability distributions over states as belief distributions. (2) Postdictive surprise represents the change in predictions of future observations given current observations. Postdictive surprise is computationally expressed as the divergence between the prior prediction distribution over observations and the posterior prediction distribution over observations given current observations. Note that the probability distributions over observations (or prediction distributions) depend on the belief distributions, so that both the belief distributions and the prediction distributions depend on Bayes' theorem. Thus, postdictive surprise represents an aspect of Bayesian updating which is dependent on, yet computationally distinguishable from, Bayesian surprise. Bayesian surprise and postdictive surprise are both fundamentally different from predictive surprise, which is simply the surprise about current observations under the current prediction distribution. Another way to grasp these distinctions is that Bayesian surprise refers to probability distributions (belief distributions) over unobservable random variables (here hidden states or urns), whereas postdictive surprise and predictive surprise refer, in different ways and as explained above, to probability distributions (prediction distributions) over observed random variables (here observed balls).
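The three quantities defined above can be illustrated with a minimal sketch for a two-urn, two-color task. The prior, the likelihoods, the ball sequence, and all function names below are hypothetical illustration values, not the parameters of the study. The direction of the Kullback-Leibler divergence (prior relative to posterior) is also an assumption, since the text only speaks of a "divergence" between the two distributions.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions given as lists.
    Assumes q has no zero entries where p is positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def predict(belief, likelihood):
    """Prediction distribution over ball colors: sum_u P(u) * P(color | u)."""
    n_colors = len(likelihood[0])
    return [sum(belief[u] * likelihood[u][o] for u in range(len(belief)))
            for o in range(n_colors)]

def update(belief, likelihood, obs):
    """Posterior belief over urns after observing ball color `obs` (Bayes' theorem)."""
    joint = [belief[u] * likelihood[u][obs] for u in range(len(belief))]
    total = sum(joint)
    return [j / total for j in joint]

def surprises(belief, likelihood, obs):
    """Return (predictive, Bayesian, postdictive) surprise and the posterior belief."""
    prior_pred = predict(belief, likelihood)
    posterior = update(belief, likelihood, obs)
    post_pred = predict(posterior, likelihood)
    predictive = -math.log(prior_pred[obs])             # surprise about the observation
    bayesian = kl_divergence(belief, posterior)         # change in belief distribution
    postdictive = kl_divergence(prior_pred, post_pred)  # change in prediction distribution
    return predictive, bayesian, postdictive, posterior

# Hypothetical task parameters (assumptions for illustration only).
prior = [0.5, 0.5]                      # P(urn)
likelihood = [[0.7, 0.3], [0.3, 0.7]]   # likelihood[urn][color]

belief = prior
for ball in [0, 0, 1, 0]:               # one hypothetical episode of four draws
    pred_s, bayes_s, post_s, belief = surprises(belief, likelihood, ball)
    print(f"ball={ball} predictive={pred_s:.3f} "
          f"bayesian={bayes_s:.3f} postdictive={post_s:.3f}")
```

The sketch makes the dependency described above explicit: the postdictive term is computed from prediction distributions that are themselves derived from the prior and posterior belief distributions, yet it yields a different number than the Bayesian surprise.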

We also examined the possibility that electrophysiological substrates of Bayesian inference may be better explained when nonlinear weighting of probabilities is taken into account. Nonlinear probability weighting was originally conjectured by prospect theory, a successful descriptive theory of economic decision behavior (Kahneman and Tversky, 1979; Tversky and Kahneman, 1992; Fox and Poldrack, 2009). In addition, nonlinear likelihood weighting has been applied in technical systems for audio–visual speech recognition (Neti et al., 2000). We addressed the issue of nonlinear probability weighting by repeating the above comparisons (of Bayesian updating and predictive surprise) using Bayesian observer models with and without nonlinear weighting.
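To give a concrete sense of what nonlinear probability weighting means, the following sketch implements the one-parameter weighting function from cumulative prospect theory (Tversky and Kahneman, 1992). It is shown only as a familiar example of such a function; this passage does not state which weighting function the observer models use or how the weights enter the Bayesian update. The function name `tk_weight` and the value 0.61 for gamma (the estimate Tversky and Kahneman reported for gains) are chosen here for illustration.

```python
def tk_weight(p, gamma):
    """Tversky-Kahneman (1992) weighting: w(p) = p^g / (p^g + (1-p)^g)^(1/g).
    gamma = 1 leaves probabilities unchanged; gamma < 1 overweights small
    probabilities and underweights large ones."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if gamma <= 0.0:
        raise ValueError("gamma must be positive")
    numerator = p ** gamma
    denominator = (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)
    return numerator / denominator

# Compare objective and weighted probabilities for an illustrative gamma.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p={p:.1f}  w(p)={tk_weight(p, 0.61):.3f}")
```

With gamma below 1 the function has the inverse-S shape typically described in the prospect theory literature; with gamma equal to 1 it reduces to the unweighted case.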

We recorded 20-channel electroencephalographic data from human participants who performed our urn–ball paradigm. Our analyses focused on the late positive complex, which is known to be decomposable into three separable ERP components (Sutton and Ruchkin, 1984; Dien et al., 2004). First, the late positive complex incorporates the anteriorly distributed P3a component that usually occurs at latencies around 340 ms post-stimulus (Kopp and Lange, 2013). Second, the late positive complex comprises the parietally distributed P3b component, which occurs at latencies that are variably delayed relative to the P3a latency (Kolossa et al., 2012). Note that the functional significance of these two ERP components seems to be related to uncertainty (Sutton et al., 1965; Kopp and Lange, 2013), surprise (Donchin, 1981; Kolossa et al., 2012), decision-making (O'Connell et al., 2012; Kelly and O'Connell, 2013) and, putatively, Bayesian inference (Friston, 2005; Kopp, 2008). Third, a posteriorly distributed Slow Wave (SW) completes the late positive complex; its functional significance is, however, comparatively less well understood (but see Ruchkin et al., 1988; García-Larrea and Cézanne-Bert, 1998; Spencer et al., 2001; Matsuda and Nittono, 2014).

The present study aims to contribute to the literature by applying Bayesian model comparison to evaluate Bayesian updating and predictive surprise in terms of their ability to explain trial-by-trial ERP variability at different points in time and over different regions of the scalp. Note that this is a meta-Bayesian analysis, in the sense that we are using Bayesian model comparison to select among various regressors that are all based on the assumption of an ideal Bayesian observer (Daunizeau et al., 2010; Lieder et al., 2013). Thus, the present study should be regarded as a challenge to the validity of the Bayesian brain hypothesis as a theoretical framework for cognitive functions of the brain. One of us had proposed, on purely theoretical grounds, close relationships between cortical P3 responses and Bayesian inference several years ago (Kopp, 2008). We were further interested in whether a Bayesian observer model that incorporates nonlinear probability weighting would outperform an unweighted model when explaining the measured cortical responses.
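The logic of selecting among competing single-trial regressors can be sketched as follows. This is not the procedure of the study: the sketch substitutes the Bayesian information criterion (BIC) as a simple approximation to the log model evidence, fits a one-regressor linear model per candidate, and uses synthetic data generated inside the script. All names (`ols_bic`, `regressor_a`, `regressor_b`) and numerical settings are hypothetical.

```python
import math
import random

def ols_bic(x, y):
    """Fit y = b0 + b1 * x by least squares and return the BIC under
    Gaussian noise. Lower BIC means higher approximate model evidence."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    # Maximized Gaussian log-likelihood with noise variance rss / n.
    log_lik = -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)
    k = 3  # free parameters: intercept, slope, noise variance
    return k * math.log(n) - 2.0 * log_lik

# Synthetic single-trial data (illustration only, not measured ERPs):
# the simulated amplitude is driven by regressor_a, not by regressor_b.
rng = random.Random(0)
n_trials = 200
regressor_a = [rng.random() for _ in range(n_trials)]
regressor_b = [rng.random() for _ in range(n_trials)]
amplitude = [2.0 * a + rng.gauss(0.0, 0.5) for a in regressor_a]

bic_a = ols_bic(regressor_a, amplitude)
bic_b = ols_bic(regressor_b, amplitude)

# -BIC / 2 approximates the log evidence, so half the BIC difference
# approximates the log Bayes factor in favor of model A.
log_bf = 0.5 * (bic_b - bic_a)
print(f"BIC(A)={bic_a:.1f}  BIC(B)={bic_b:.1f}  approx. log BF (A vs B)={log_bf:.1f}")
```

In an analysis of the kind described above, each candidate regressor (for example Bayesian, postdictive, or predictive surprise) would take the place of `regressor_a` or `regressor_b`, and the comparison would be repeated per channel and time point.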

Our earlier empirical research forms the basis of more specific hypotheses that could be examined in the context of the urn–ball task, which was designed specifically for this purpose. The first hypothesis proposes that ERP responses at fronto-central channels (P3a) are best explained by Bayesian surprise. This suggestion was made by Kopp and Lange (2013) with reference to Sokolov's (1966) Bayesian model of the orienting response. Note that the P3a component of the ERP is usually considered to indicate the brain's orienting response (e.g., Friedman et al., 2001; Barceló et al., 2002; Nieuwenhuis et al., 2011), and Kopp and Lange's (2013) P3a data were consistent with Sokolov's (1966) model of the orienting response. The second hypothesis postulates that ERP responses at centro-parietal channels (P3b) are best explained by predictive surprise. This suggestion was made by Kolossa et al. (2012), who showed that trial-by-trial P3b amplitude fluctuations in a simple two-choice response time task could be best explained by a computational model of predictive surprise. Assuming that these two hypotheses possess some validity, a novel hypothesis arises according to which ERP responses at occipito-parietal channels (SW; Matsuda and Nittono, 2014) would be best explained by postdictive surprise.

Participants

Sixteen undergraduate psychology students (15 females, 1 male) participated in exchange for course credit. Their age ranged from 19 to 50 years (M = 24.7; SD = 9.3 years of age). Handedness was examined with the Edinburgh Handedness Inventory (Oldfield, 1971), revealing that one participant was left-handed and two were ambidextrous. All participants reported normal or corrected-to-normal vision. The procedure was approved by the local Ethics Committee.

Experimental design

Our urn–ball task represents a modification of

Behavioral results and hyper-parameter fitting

Fig. 6 shows the likelihoods (ratios of choices) $P(c = u = 1 \mid \bar{P}_{u=1}(4))$ across participants for choice c being urn type u = 1 depending on the mean posterior probability $\bar{P}_{u=1}(4)$ after observing an episode of sampling. The mean posterior probabilities were calculated over all sequences containing identical ratios of types of ball colors. The general form of the likelihood function in a binary decision task is $P_{\mathrm{opt}}(c = 1 \mid \bar{P}_{u=1}(4)) = \frac{1}{1 + e^{-a(\bar{P}_{u=1}(4) - b)}}$ (Daunizeau et al., 2014). We assume that the decision depends purely on
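A minimal sketch of a logistic choice function of this kind is given below. It assumes the common slope-and-offset parameterization, in which a sets the steepness and b is the posterior probability at which both choices are equally likely; the exact parameterization and the fitted hyper-parameter values of the study should be taken from the full text. The function name and the example values are hypothetical.

```python
import math

def choice_probability(posterior_u1, a, b):
    """Probability of choosing urn type 1 given the posterior probability of
    urn type 1, under an assumed logistic form 1 / (1 + exp(-a * (p - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (posterior_u1 - b)))

# Illustrative hyper-parameters: steep slope, unbiased indifference point.
a, b = 10.0, 0.5
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"posterior={p:.1f}  P(choose urn 1)={choice_probability(p, a, b):.3f}")
```

Under this assumed form, a large a approximates a deterministic observer who always picks the more probable urn, while a small a describes choices that depend only weakly on the posterior.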

Discussion

This study explored neural correlates of Bayesian inference by combining an urn–ball paradigm (Fig. 1) with computational modeling of trial-by-trial electrophysiological signals. Our approach led to the discovery that dissociable cortical signals seem to code and compute distinguishable aspects of Bayes-optimal probabilistic inference. Thus, we isolated discrete ERP components which could be dissociated with regard to their putative function in accomplishing Bayesian inference (cf. Fig. 8,

Acknowledgments

This work was supported by the Deutsche Forschungsgemeinschaft (DFG) via a Future Fund of Technische Universität Braunschweig (725 508 02). Thanks are due to Dirk Ostwald, Max-Planck-Institute for Human Development, Berlin, Germany, for providing a software implementation of his model.

References (97)

  • R. Gonzalez et al. On the shape of the probability weighting function. Cogn. Psychol. (1999)
  • G. Gratton et al. A new method for off-line removal of ocular artifact. Electroencephalogr. Clin. Neurophysiol. (1983)
  • D.M. Grether. Testing Bayes rule and the representativeness heuristic: some experimental evidence. J. Econ. Behav. Organ. (1992)
  • L. Itti et al. Bayesian surprise attracts human attention. Vis. Res. (2009)
  • D.C. Knill et al. The Bayesian brain: the role of uncertainty in neural coding and computation for perception and action. Trends Neurosci. (2004)
  • R.C. Oldfield. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia (1971)
  • D. Ostwald et al. Evidence for neural encoding of Bayesian surprise in human somatosensation. NeuroImage (2012)
  • W.D. Penny. Comparing dynamic causal models using AIC, BIC and free energy. NeuroImage (2012)
  • W.D. Penny et al. Comparing dynamic causal models. NeuroImage (2004)
  • J. Polich. Updating P300: an integrative theory of P3a and P3b. Clin. Neurophysiol. (2007)
  • K. Preuschoff et al. Neural differentiation of expected reward and risk in human subcortical structures. Neuron (2006)
  • K.E. Stephan et al. Comparing hemodynamic models with DCM. NeuroImage (2007)
  • K.E. Stephan et al. Bayesian model selection for group studies. NeuroImage (2009)
  • B.A. Strange et al. Information theory, novelty and hippocampal responses: unpredicted or unpredictable? Neural Netw. (2005)
  • C. Summerfield et al. A neural representation of prior information during perceptual inference. Neuron (2008)
  • C. Trepel et al. Prospect theory on the brain? Toward a cognitive neuroscience of decision under risk. Cogn. Brain Res. (2005)
  • I. Vilares et al. Differential representations of prior and likelihood uncertainty in the human brain. Curr. Biol. (2012)
  • H. Zhang et al. Ubiquitous log odds: a common representation of probability and frequency distortion in perception, action, and cognition. Front. Neurosci. (2012)
  • A. Achtziger et al. The neural basis of belief updating and rational decision making. Soc. Cogn. Affect. Neurosci. (2014)
  • D.R. Bach et al. Knowing how much you don't know: a neural organization of uncertainty estimates. Nat. Rev. Neurosci. (2012)
  • F. Barceló et al. Think differently: a brain orienting response to task novelty. NeuroReport (2002)
  • G.A. Barnard. Statistical inference. J. R. Stat. Soc. Ser. B (1949)
  • R.J. Barry et al. An orienting reflex perspective on anteriorisation of the P3 of the event-related potential. Exp. Brain Res. (2006)
  • P. Bossaerts. Risk and risk prediction error signals in anterior insula. Brain Struct. Funct. (2010)
  • D.R. Cavagnaro et al. Discriminating among probability weighting functions using adaptive design optimization. J. Risk Uncertain. (2013)
  • A. Clark. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav. Brain Sci. (2013)
  • M. d'Acremont et al. The human brain encodes event frequencies while forming subjective beliefs. J. Neurosci. (2013)
  • J. Daunizeau et al. Observing the observer (II): deciding when to decide. PLoS One (2010)
  • J. Daunizeau et al. VBA: a probabilistic treatment of nonlinear models for neurobiological and behavioural data. PLoS Comput. Biol. (2014)
  • P. Dayan et al. The Helmholtz machine. Neural Comput. (1995)
  • F.P. de Lange et al. Accumulation of evidence during sequential decision making: the importance of top-down factors. J. Neurosci. (2010)
  • J. Dien et al. Parsing the late positive complex: mental chronometry and the ERP components that inhabit the neighborhood of the P300. Psychophysiology (2004)
  • E. Donchin. Surprise! Surprise? Psychophysiology (1981)
  • E. Donchin et al. Is the P300 component a manifestation of context updating? Behav. Brain Sci. (1988)
  • K. Doya et al. Bayesian Brain: Probabilistic Approaches to Neural Coding. (2007)
  • C.D. Fiorillo. Beyond Bayes: on the need for a unified and Jaynesian definition of probability and information within neuroscience. Information (2012)
  • J.R. Folstein et al. Influence of cognitive control and mismatch on the N2 component of the ERP: a review. Psychophysiology (2008)
  • K.J. Friston. A theory of cortical responses. Philos. Trans. R. Soc. B-Biol. Sci. (2005)