Cognition

Volume 187, June 2019, Pages 10-20
Original Articles
Neural evidence for Bayesian trial-by-trial adaptation on the N400 during semantic priming

https://doi.org/10.1016/j.cognition.2019.01.001

Abstract

When semantic information is activated by a context prior to new bottom-up input (i.e. when a word is predicted), semantic processing of that incoming word is typically facilitated, attenuating the amplitude of the N400 event-related potential (ERP), a direct neural measure of semantic processing. N400 modulation is observed even when the context is a single semantically related “prime” word. This so-called “N400 semantic priming effect” is sensitive to the probability of encountering a related prime-target pair within an experimental block, suggesting that participants may be adapting the strength of their predictions to the predictive validity of their broader experimental environment. We formalize this adaptation using a Bayesian learning model that estimates and updates the probability of encountering a related versus an unrelated prime-target pair on each successive trial. We found that our model’s trial-by-trial estimates of target word probability accounted for significant variance in trial-by-trial N400 amplitude. These findings suggest that Bayesian principles contribute to how comprehenders adapt their semantic predictions to the statistical structure of their broader environment, with implications for the functional significance of the N400 component and the predictive nature of language processing.

Introduction

It has long been established that more predictable words are processed faster than less predictable words (e.g. Ehrlich and Rayner, 1981, Fischler and Bloom, 1979; see Staub, 2015 for a recent review). Rather than being all-or-nothing or strategic in nature, these effects of contextual predictability are graded, probabilistic and implicit (Luke and Christianson, 2016, Smith and Levy, 2013; see Kuperberg & Jaeger, 2016 for a review). Probabilistic prediction can aid language processing by alleviating the resource bottleneck that could otherwise occur at word onset (because some of the “work” of comprehension can be accomplished ahead of time, given the information provided in the context). Such benefits, however, require that prediction is based on probabilistic knowledge that approximates the statistical structure of the input. This presents a challenge for communication in the real world where our linguistic and non-linguistic environments often change. Each person we talk to and every book we read has its own unique set of syntactic and semantic preferences. Thus, in order for language comprehension to remain efficient, we must be able to adapt to these different environments so that our predictions continue to mirror their statistical structures. In the present study, we explore the close relationship between probabilistic prediction and adaptation in the brain by modeling a classic effect of adaptation on lexico-semantic processing: the influence of the predictive validity of the experimental environment on the N400 semantic priming effect.

The fundamental link between prediction and adaptation has been widely discussed in cognitive science, dating back to early models of animal learning (Pearce and Hall, 1980, Rescorla and Wagner, 1972). One way of formalizing this link is within a probabilistic generative framework (Griffiths, Chater, Kemp, Perfors, & Tenenbaum, 2010; see Perfors, Tenenbaum, Griffiths, & Xu, 2011, for an excellent introduction). Here, the agent’s overarching goal is to infer an underlying latent cause that best explains the statistics of its environmental input. As the agent receives more input (evidence), she is able to incrementally update her probabilistic beliefs using Bayes’ rule — a process known as belief updating.
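As a reminder of the general form of this update, Bayes' rule can be written recursively, so that the posterior after one observation serves as the prior for the next. The notation below is illustrative and is not taken from the paper: $\theta$ stands for a candidate latent cause and $x_{1:t}$ for the inputs observed up to time $t$, and the observations are assumed to be conditionally independent given $\theta$.

```latex
P(\theta \mid x_{1:t})
  = \frac{P(x_t \mid \theta)\, P(\theta \mid x_{1:t-1})}
         {\sum_{\theta'} P(x_t \mid \theta')\, P(\theta' \mid x_{1:t-1})}
```

For continuous $\theta$, the sum in the denominator becomes an integral over $\theta'$.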

In the domain of language, this type of probabilistic framework has most commonly been used to model incremental syntactic parsing (e.g. Levy, 2008), as well as to describe sentence comprehension more generally (Kuperberg, 2016, Kuperberg and Jaeger, 2016). In addition, it has recently been used to explain how we adapt to the broader set of statistical contingencies that are associated with, and define, any given situational context (e.g. Fine et al., 2010, Jaeger and Snider, 2013, Kleinschmidt and Jaeger, 2015, Myslin and Levy, 2016), where it is referred to as “rational” adaptation (see Anderson, 1990).

Neural indices of online processing have shown effects of predictability similar to those seen in behavioral measures, suggesting that probabilistic prediction is instantiated in the brain during language comprehension. One well-established effect of contextual probability is on the N400, an event-related potential (ERP) component that peaks between 300 and 500 ms following the onset of an incoming word, and that is thought to reflect the ease of semantically processing that word (Federmeier, 2007, Kutas and Federmeier, 2011, Kutas and Hillyard, 1984). The N400 is highly sensitive to the semantic probability of incoming words (DeLong et al., 2005, Wlotko and Federmeier, 2012): its amplitude is less negative (“smaller”) to words that are semantically more (versus less) predictable. This is the case regardless of whether the context is a sentence stem (e.g. Kutas & Hillyard, 1984), a larger discourse or text (e.g. Van Berkum, Zwitserlood, Hagoort, & Brown, 2003), or a single ‘prime’ word (Bentin et al., 1985, Rugg, 1985).

There is also evidence that the amplitude of the N400 adapts to the statistics of its broader environment. A classic illustration of this is the effect of relatedness proportion on N400 modulation during a semantic priming paradigm (Brown et al., 2000, Holcomb, 1988, Lau et al., 2013). Behaviorally, the Relatedness Proportion effect on semantic priming was first described in the late 1970s by Tweedy, Lapinski, and Schvaneveldt (1977), and it has since been reported in numerous studies (reviewed by Neely, 1991). It refers to the finding that the semantic priming effect is larger in blocks that contain a higher (versus a lower) proportion of related (versus unrelated) prime-target pairs. The effect has long been linked to predictive mechanisms (Hutchison, 2007, Keefe and Neely, 1990, Neely and Keefe, 1989, Neely et al., 1989): in higher relatedness proportion blocks, participants are more likely to use the prime to generate stronger lexico-semantic predictions of the target.

Following these behavioral studies, as well as previous ERP experiments (Brown et al., 2000, Holcomb, 1988), we recently carried out an ERP study examining the effect of Relatedness Proportion on the N400 semantic priming effect (Lau et al., 2013). We measured ERPs as the same participants viewed the same core set of prime-target pairs, which were counterbalanced across two blocks. These blocks differed in the proportion of semantically related and unrelated word-pairs. In Block 1 (the lower relatedness proportion block), only 10% of the prime-target pairs were semantically related, and in Block 2 (the higher relatedness proportion block), 50% of the prime-target pairs were semantically related. Short breaks were given within both blocks as well as between blocks, and participants were not explicitly told that there would be any change between the blocks. We showed that the magnitude of the N400 semantic priming effect was significantly larger in Block 2 (the higher relatedness proportion block) than in Block 1 (the lower relatedness proportion block). In follow-up studies using MEG and fMRI, we also showed that the higher relatedness proportion block was associated with enhanced modulation of neuroanatomical regions sensitive to both lexico-semantic processing and learning (Lau et al., 2016, Weber et al., 2016).

These findings provide strong evidence that participants were able to implicitly adapt to the change in predictive validity across the two blocks (see Tweedy & Lapinski, 1981, for an early discussion of adaptation in relation to this effect). What remains unclear, however, is the time course of this adaptation and the computational principles that link it to prediction. In this investigation, we sought to address these questions by building a computational model based on principles of rational (Bayesian) adaptation. This model computed and updated the probability of encountering target words on individual trials throughout Block 2 (the higher proportion block), with the assumption that participants had already seen Block 1 (the lower proportion block). We then used linear mixed-effects regression to ask whether the model’s trial-by-trial outputs for each participant could explain trial-by-trial changes in that participant’s N400 amplitudes throughout Block 2, using the dataset collected by Lau et al. (2013).
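To make the idea of trial-by-trial updating concrete, the following is a minimal sketch that assumes a simple Beta-Bernoulli learner; it is not the model reported in this paper. The prior mean of 0.10 reflects the proportion of related pairs in Block 1. The prior weight (`prior_weight`), the function name, and the alternating Block 2 sequence are hypothetical choices made only for illustration.

```python
from typing import List, Sequence


def trial_by_trial_estimates(
    related: Sequence[bool],
    prior_mean: float = 0.10,    # proportion of related pairs in Block 1
    prior_weight: float = 50.0,  # ASSUMPTION: prior strength in pseudo-trials
) -> List[float]:
    """Return the learner's estimate of P(related) held *before* each trial."""
    a = prior_mean * prior_weight          # pseudo-count of related pairs
    b = (1.0 - prior_mean) * prior_weight  # pseudo-count of unrelated pairs
    estimates = []
    for is_related in related:
        # Posterior mean of a Beta(a, b) belief, given all earlier trials.
        estimates.append(a / (a + b))
        # Conjugate update: add one count for the outcome just observed.
        if is_related:
            a += 1.0
        else:
            b += 1.0
    return estimates


# Hypothetical Block 2 sequence in which 50% of the pairs are related.
block2 = [i % 2 == 0 for i in range(400)]
estimates = trial_by_trial_estimates(block2)
```

Under these assumptions the estimate starts at the Block 1 value and climbs toward 0.5 as evidence from Block 2 accumulates. A larger `prior_weight` slows this climb, and a smaller one speeds it up.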

In the remainder of this paper, we describe the theory and mathematical computation of our model. We then give a brief overview of the experimental methods previously described in detail by Lau et al. (2013). We evaluate our model’s trial-by-trial output in each participant against the empirical trial-by-trial ERP data in each participant, and we then discuss our findings in the context of the broader literature on prediction, adaptation, and language processing.

Section snippets

Development of a rational probabilistic model of trial-by-trial adaptation

Our rational adaptor model considers how a comprehender makes probabilistic predictions during a semantic priming paradigm as she adapts to a higher relatedness proportion block (Block 2), following a lower relatedness proportion block (Block 1). By probabilistic prediction, we simply refer to the existence of a probability distribution over possible target words after seeing a prime on each trial.
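One simple way to picture such a distribution is as a mixture. With probability equal to the current estimate that the pair is related, the target is drawn from the prime's associates; otherwise it is drawn from a context-independent baseline. The sketch below illustrates that idea only, and the mixture form is an assumption rather than the paper's stated computation. The function name, the toy vocabulary, and all probabilities are invented.

```python
from typing import Dict


def target_distribution(
    p_related: float,
    assoc: Dict[str, float],     # P(word | prime) if the pair is related
    baseline: Dict[str, float],  # P(word) if the pair is unrelated
) -> Dict[str, float]:
    """Mix the two component distributions, weighted by p_related."""
    words = set(assoc) | set(baseline)
    return {
        w: p_related * assoc.get(w, 0.0) + (1.0 - p_related) * baseline.get(w, 0.0)
        for w in words
    }


# Invented numbers for a hypothetical prime; each dictionary sums to 1.
assoc = {"pepper": 0.65, "sugar": 0.20, "water": 0.15}
baseline = {"pepper": 0.01, "sugar": 0.02, "water": 0.05, "chair": 0.92}

low_block = target_distribution(0.10, assoc, baseline)   # few related pairs
high_block = target_distribution(0.50, assoc, baseline)  # many related pairs
```

In this toy case, raising the estimated probability of a related pair shifts probability mass toward the prime's associates, so a related target becomes more predictable and an unrelated target less so.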

To compute these probabilistic predictions on each trial, we assume that the agent is potentially

Experimental design

The experiment by Lau et al. (2013) crossed Relatedness (semantically related versus semantically unrelated word-pairs) and Relatedness Proportion (higher relatedness proportion versus lower relatedness proportion block). The related word-pairs had an FAS of 0.5 or higher (mean FAS: 0.65) as estimated using the University of South Florida Free Association Norms (Nelson et al., 2004), and the unrelated word-pairs were created by randomly redistributing the primes across the target items and

Participants and ERP data collection

Participants and ERP data collection have been described in detail by Lau et al. (2013), and are summarized below.

Participants were all right-handed native speakers of American English recruited from Tufts University. All gave written informed consent to participate. Data were originally collected from 33 participants (19 women; mean age = 20.5 years) and two were omitted due to artifacts. All participants saw the lower relatedness proportion block first (Block 1),

Visualization of trial-by-trial ERP data and model predictions

To visualize the changes in N400 amplitude over target items in Block 2, without assuming any particular parameters of the adaptation, we conducted a loess local regression over N400 amplitudes for related and unrelated words across the ordinal position of critical items in the experiment. The N400 amplitudes evoked by related and unrelated critical targets in Block 2 are shown in Fig. 2. As can be seen, the amplitudes of the N400 evoked by related and unrelated targets were initially
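For readers unfamiliar with loess, the sketch below shows the core idea in simplified form: at each ordinal position, a straight line is fit to nearby points, with tricube weights that decrease with distance. This is an illustration under stated assumptions, not the analysis code used for Fig. 2. The function name, the `span` value, the local-linear (rather than local-quadratic) fit, and the example amplitudes are all hypothetical.

```python
from typing import List, Sequence


def local_linear_smooth(
    x: Sequence[float], y: Sequence[float], span: float = 0.5
) -> List[float]:
    """Loess-style smoother: tricube-weighted local linear fit at each x."""
    n = len(x)
    k = min(n, max(2, round(span * n)))  # points in each local window
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = max(sorted(abs(xi - x0) for xi in x)[k - 1], 1e-12)
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Evaluate the weighted least-squares line at x0.
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical amplitudes (microvolts) by ordinal position of related targets.
positions = [float(p) for p in range(1, 41)]
amplitudes = [-3.0 + 0.04 * p + (0.3 if p % 3 == 0 else -0.2) for p in positions]
smoothed = local_linear_smooth(positions, amplitudes, span=0.5)
```

Applying the same smoother separately to related and unrelated targets would give two curves whose vertical separation at each position approximates the local size of the priming effect.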

Discussion

It is well established that the magnitude of the behavioral semantic priming effect is sensitive to the predictive validity of the broader experimental environment (Neely, 1991, Tweedy et al., 1977). This effect of predictive validity also influences the modulation of the N400 ERP component — a direct neural index of semantic processing (Kutas & Federmeier, 2011): when the proportion of related word-pairs within an experimental block increases, the N400 priming effect increases (Brown et al.,

Conclusion

In conclusion, our quantitative model of trial-by-trial adaptation on the N400 ERP component provides evidence that (1) the brain combines immediate contextual constraints with global probabilistic constraints to influence semantic processing of incoming words, and (2) the brain has some prior expectation that the broad statistical structure of its environment might change and is able to rationally adapt its probabilistic semantic predictions of incoming words in response to this new environment.

Of

Acknowledgements

The authors thank Eric Fields and Sorabh Kothari for their assistance with data collection, and Trevor Brothers for his helpful comments on the manuscript. This work was funded by the National Institute of Mental Health (R01MH071635 to GRK), National Institute of Child Health and Human Development (F32HD063221 to EFL and R01HD082527 to GRK), National Science Foundation (SPRF-FR 1715072 to EM), and the Sidney R. Baer Jr. Foundation (to GRK).

References (91)

  • T.L. Griffiths et al. Probabilistic models of cognition: Exploring representations and inductive biases. Trends in Cognitive Sciences (2010)
  • P.J. Holcomb. Automatic and attentional processing: An event-related brain potential analysis of semantic priming. Brain and Language (1988)
  • T.F. Jaeger et al. Alignment as a consequence of expectation adaptation: Syntactic priming is affected by the prime’s prediction error given both prior and recent experience. Cognition (2013)
  • Y. Kamide. Learning individual talkers' structural preferences. Cognition (2012)
  • D.C. Knill et al. Do humans optimally integrate stereo and texture information for judgments of surface slant? Vision Research (2003)
  • R. Levy. Expectation-based syntactic comprehension. Cognition (2008)
  • S.G. Luke et al. Limits on lexical prediction during reading. Cognitive Psychology (2016)
  • M. Myslin et al. Comprehension priming as rational expectation for repetition: Evidence from syntactic processing. Cognition (2016)
  • J.H. Neely et al. Semantic context effects on visual word processing: A hybrid prospective-retrospective processing theory
  • M.S. Nieuwland et al. On the incrementality of pragmatic processing: An ERP investigation of informativeness and pragmatic abilities. Journal of Memory and Language (2010)
  • D. Norris et al. Perceptual learning in speech. Cognitive Psychology (2003)
  • G. Orbán et al. Neural variability and sampling-based probabilistic representations in the visual cortex. Neuron (2016)
  • A. Perfors et al. A tutorial introduction to Bayesian models of cognitive development. Cognition (2011)
  • T. Qian et al. Incremental implicit learning of bundles of statistical patterns. Cognition (2016)
  • N.J. Smith et al. The effect of word predictability on reading time is logarithmic. Cognition (2013)
  • J.J.A. Van Berkum et al. When and how do listeners relate a sentence to the wider discourse? Evidence from the N400 effect. Cognitive Brain Research (2003)
  • J. Vroomen et al. Selective adaptation and recalibration of auditory speech by lipread information: Dissipation. Speech Communication (2004)
  • E.W. Wlotko et al. So that's what you meant! Event-related potentials reveal multiple aspects of context use during construction of message-level meaning. NeuroImage (2012)
  • A.J. Yu et al. Uncertainty, neuromodulation, and attention. Neuron (2005)
  • J.R. Anderson. The adaptive character of thought (1990)
  • D.M. Bates et al. Fitting linear mixed-effects models using lme4. Journal of Statistical Software (2015)
  • T.E.J. Behrens et al. Learning the value of information in an uncertain world. Nature Neuroscience (2007)
  • Brothers, T., Hoversten, L., Dave, S., Traxler, M. J., & Swaab, T. (under review). Flexible predictions during...
  • M. Brysbaert et al. Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods (2009)
  • F. Chang et al. Becoming syntactic. Psychological Review (2006)
  • A. Clark. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences (2013)
  • S. Coulson et al. Right hemisphere sensitivity to word- and sentence-level context: Evidence from event-related brain potentials. Journal of Experimental Psychology: Learning, Memory, and Cognition (2005)
  • G.S. Dell et al. The P-chain: Relating sentence production and its disorders to comprehension and acquisition. Philosophical Transactions of the Royal Society B: Biological Sciences (2014)
  • K.A. DeLong et al. Probabilistic word pre-activation during language comprehension inferred from electrical brain activity. Nature Neuroscience (2005)
  • M.O. Ernst et al. Humans integrate visual and haptic information in a statistically optimal fashion. Nature (2002)
  • K.D. Federmeier. Thinking ahead: The role and roots of prediction in language comprehension. Psychophysiology (2007)
  • H. Feldman et al. Attention, uncertainty, and free-energy. Frontiers in Human Neuroscience (2010)
  • Fine, A. B., Qian, T., Jaeger, T. F., & Jacobs, R. A. (2010). Is there syntactic adaptation in language comprehension?...
  • D.J. Foss et al. Great expectations: Context effects during sentence processing
  • S.L. Frank et al. Word predictability and semantic similarity show distinct patterns of brain activity during language comprehension. Language, Cognition and Neuroscience (2017)