Elsevier

Brain and Language

Volume 129, February 2014, Pages 14-23
The role of morphology in phoneme prediction: Evidence from MEG

https://doi.org/10.1016/j.bandl.2013.11.004

Highlights

  • We investigate the role of prediction error and competition in spoken word recognition.

  • We explore how predictive processes interact with morphological structure.

  • Information-theoretic measures of prediction correlated with MEG signal in auditory cortex.

  • Morphologically complex words showed a stronger effect of surprisal.

  • This result argues for an active role for morphology in spoken word recognition.

Abstract

There is substantial neural evidence for the role of morphology (word-internal structure) in visual word recognition. We extend this work to auditory word recognition, drawing on recent evidence that phoneme prediction is central to this process. In a magnetoencephalography (MEG) study, we crossed morphological complexity (bruis-er vs. bourbon) with the predictability of the word ending (bourbon vs. burble). High prediction error (surprisal) led to increased auditory cortex activity. This effect was enhanced for morphologically complex words. Additionally, we calculated for each timepoint the surprisal corresponding to the phoneme perceived at that timepoint, as well as the cohort entropy, which quantifies the competition among words compatible with the string prefix up to that timepoint. Higher surprisal increased neural activity at the end of the word, and higher entropy decreased neural activity shortly after word onset. These results reinforce the role of morphology and phoneme prediction in spoken word recognition.

Introduction

Recent work has illuminated the role of morphology in visual word recognition. Evidence from both behavioral and brain-based studies strongly indicates that visually presented words are decomposed into morphemes based on their visual forms, and that this visual decomposition feeds lexical access for the lexical information associated with morphemes (Fiorentino and Poeppel, 2007a, 2007b; Rastle et al., 2004; Solomyak and Marantz, 2010, inter alia). Recognition of visually presented complex words, then, follows the decomposition, look-up, and recomposition model championed by Taft and others based on behavioral data (Taft, 2004; Taft and Forster, 1975).

The role of morphological structure in auditory word recognition is less studied and less well understood, though it has been known for some time that morphological structure plays a role there as well (Marslen-Wilson, Tyler, Waksler, & Older, 1994). Recent studies by Baayen, Wurm, and colleagues point to a predictive role for morpheme recognition during auditory word recognition (Balling and Baayen, 2008, 2012; Wurm, 1997; Wurm et al., 2006). In particular, Balling and Baayen (2008, 2012) contrast two general models of recognition. On the first, all of the full words consistent with the auditory input (the full-word cohort of the word eventually recognized) are activated during recognition, with their competing representations compared against the incoming acoustic signal (Marslen-Wilson, 1987; Marslen-Wilson and Welsh, 1978). On this class of models, the morphological decomposition of members of the cohort would not be relevant to cohort competition and would not interact with prediction of upcoming phonemes from the cohort consistent with the already processed phonemes. A second model would leave a role for recognition of component morphemes of the word being processed (Balling and Baayen, 2008, 2012). Recognition of a morpheme, for example a morphological prefix or a stem, would yield predictions for upcoming morphemes, and for upcoming phonemes as part of these morphemes. We test the hypothesis that this additional prediction associated with morphological structure might enhance the prediction of upcoming phonemes based on the cohort of full words consistent with the input.

In addition to questions associated with the role of morphology in auditory word recognition, the actual mechanisms, both cognitive and neural, whereby a cohort of possible words influences prediction and processing of the incoming speech stream have also been addressed in the recent literature. On the one hand, one might endorse a competition model in which all members of a cohort are activated to an extent proportional to their frequency of occurrence, with lateral competition between activated cohort members (Marslen-Wilson, 1987; Marslen-Wilson and Welsh, 1978). On this view, the more members of a cohort and the more evenly distributed their frequency, the more cognitive and neural activity associated with activation and active inhibition. In a recent paper, Gagnepain, Henson, and Davis (2012) suggest that cohort competition might not be a main driver of brain activity associated with auditory word recognition. Rather, they suggest, the unexpectedness of the incoming input given the probability distribution over possible continuations from what has already been processed is the factor that drives neural activity and affects response times. They suggest that activity from areas around auditory cortex reflects the mismatch between predicted and incoming phonemic material, that is, a prediction error signal. Their results are also consistent with the activity reflecting surprisal, which quantifies the change in the probability distribution over the members of the active cohort based on the incoming stimulus (Hale, 2001).

For the role of morphology in auditory word recognition, Balling and Baayen (2012) propose that online morphological decomposition leads to prediction, for stems from morphological prefixes and for suffixes from stems, that is observable in reaction times in lexical decision. All other things being equal, morphologically complex words are recognized more easily in the auditory modality than suitably matched monomorphemic words (Balling and Baayen, 2008; Ji et al., 2011). From Gagnepain et al. (2012), we derive the hypothesis that cohorts of words consistent with auditory input manifest neurally not through competition, where higher entropy among members of a cohort would lead to more activity, but through surprisal, where changes in the probability distribution over cohort members, with new input, might drive activity, or through prediction error, where observed lower probability continuations conflict with their higher probability cohort members. From neural and behavioral work, we might be led to an opposite conclusion about cohort competition at the beginning of the recognition of auditorily presented words: the higher the entropy over the cohort, the less activation we will observe (Baayen et al., 2007; Linzen et al., 2013; Wurm et al., 2006). On this view, which we will refer to as the Low-Entropy Dependent Prediction model (LEDP), higher entropy prevents commitment to prediction for upcoming input, leading to less predictive work and less neural activity, while low entropy allows more commitment to continuations, thus more work and more activity.

The present study is a preliminary look at whether contemporary magnetoencephalography (MEG) measurement and analysis techniques are suitable for investigating the role of morphology and of cohort entropy online, as participants listen to spoken language. In particular, we ask whether morphological complexity enhances the prediction of upcoming phonemic material such that we see more neural activity associated with surprisal during auditory word processing for morphologically complex as opposed to monomorphemic words.

To approach these questions we manipulated morphological complexity and surprisal in a 2-by-2 factorial design. The morphological complexity manipulation consisted of two morpheme conditions: monomorphemic and bimorphemic words. The surprisal manipulation consisted of two continuation surprisal conditions: high- and low-surprisal continuations of a shared string prefix. For bimorphemic pairs (e.g., bruises/bruiser), this string prefix consisted of a morphological stem (bruis-), and the continuations consisted of differing suffixes (-es/-er). For monomorphemic pairs (e.g., bourbon/burble), this shared string prefix consisted of phonological material that did not constitute a morpheme ([bɹb]), and the continuations were the final phonemes of the words, which similarly were not morphemes (e.g., -ən/-əl).

Each participant heard both words in each pair. This introduced a third variable: order of presentation within pair. Given evidence of long-distance morphological priming in auditory word recognition (Kouider & Dupoux, 2009), this variable is of particular note for bimorphemic pairs. As such, the ordering position of a stimulus relative to its pair counterpart was also taken into account in the analysis.

In addition to the factorial design described above, we also investigate whether we can detect neural activity related to individual phoneme prediction on a trial-by-trial basis (Gagnepain et al., 2012). We used a frequency database to calculate the probability distribution of all English words compatible with the string prefix at each time point in each trial. This enabled us to derive millisecond-by-millisecond estimates of two information-theoretic quantities: phoneme surprisal and cohort entropy.

The surprisal of the phoneme that is currently being heard is the negative log of its conditional probability given the phonemes that preceded it. This probability can be calculated by dividing the total frequency of the present cohort by the total frequency of the cohort that was “alive” prior to hearing the current phoneme. Formally, if PrefFreq(x) is the summed frequency of all words that start with the phoneme sequence x, then the surprisal of the third phoneme u in bruiser ([bruzr]) would be given by $-\log_2\left(\mathrm{PrefFreq}(\text{bru})/\mathrm{PrefFreq}(\text{br})\right)$.
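The prefix-frequency calculation of surprisal can be illustrated with a minimal Python sketch. The lexicon, its one-character-per-phoneme transcriptions, and all frequency counts below are hypothetical toy values chosen for illustration; they are not taken from the frequency database used in the study, and the function names `pref_freq` and `phoneme_surprisal` are likewise assumptions.

```python
import math

# Hypothetical toy lexicon: schematic phoneme strings (one character per
# phoneme) mapped to made-up frequency counts. Not real corpus values.
TOY_LEXICON = {
    "bruzr": 10,   # "bruiser"
    "bruzIz": 30,  # "bruises"
    "brud": 20,    # "brood"
    "brIk": 40,    # "brick"
}


def pref_freq(prefix, lexicon):
    """Summed frequency of all words that start with `prefix`."""
    return sum(freq for word, freq in lexicon.items() if word.startswith(prefix))


def phoneme_surprisal(word, position, lexicon):
    """Surprisal in bits of the phoneme at 0-based `position` in `word`.

    The conditional probability is the frequency of the cohort that survives
    the current phoneme divided by the frequency of the cohort before it.
    """
    before = pref_freq(word[:position], lexicon)
    after = pref_freq(word[:position + 1], lexicon)
    return -math.log2(after / before)


# Third phoneme "u" of "bruzr": -log2(PrefFreq(bru) / PrefFreq(br))
surprisal_u = phoneme_surprisal("bruzr", 2, TOY_LEXICON)
# Final phoneme "r" of "bruzr", the less frequent continuation of "bruz"
surprisal_final = phoneme_surprisal("bruzr", 4, TOY_LEXICON)
```

In this toy lexicon the final phoneme of the less frequent continuation receives a higher surprisal than the third phoneme, which is the kind of contrast the continuation surprisal manipulation relies on.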

The cohort entropy is the entropy of the probability distribution over all words that are compatible with the string prefix heard thus far. If $C$ is the cohort, $f(w)$ is the frequency of a word $w$, and $F_C$ is the total frequency of the cohort, then the cohort entropy is given by:

$$-\sum_{w \in C} \frac{f(w)}{F_C} \log_2 \frac{f(w)}{F_C}$$
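As a rough illustration of the cohort entropy definition, the following Python sketch computes the entropy of a frequency-weighted cohort for a given prefix. The toy lexicon, its schematic transcriptions, and its frequency counts are hypothetical and serve only to make the arithmetic concrete; the helper names `cohort` and `cohort_entropy` are assumptions, not part of the study's analysis code.

```python
import math

# Hypothetical toy lexicon: schematic phoneme strings (one character per
# phoneme) mapped to made-up frequency counts. Not real corpus values.
TOY_LEXICON = {
    "bruzr": 10,   # "bruiser"
    "bruzIz": 30,  # "bruises"
    "brud": 20,    # "brood"
    "brIk": 40,    # "brick"
}


def cohort(prefix, lexicon):
    """All words (with their frequencies) compatible with `prefix`."""
    return {word: freq for word, freq in lexicon.items() if word.startswith(prefix)}


def cohort_entropy(prefix, lexicon):
    """Entropy in bits of the frequency distribution over the cohort."""
    members = cohort(prefix, lexicon)
    total = sum(members.values())  # total cohort frequency
    return -sum((f / total) * math.log2(f / total) for f in members.values())


# Entropy after hearing "br" (four candidates) and after "bruz" (two candidates)
entropy_br = cohort_entropy("br", TOY_LEXICON)
entropy_bruz = cohort_entropy("bruz", TOY_LEXICON)
```

With these toy values the entropy drops as the cohort narrows from four candidates to two, which matches the intuition that competition among candidates is greatest near word onset.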

We predict that higher phoneme surprisal should lead to increased neural activity. We also investigate whether we can observe an effect of cohort entropy, and, if so, whether the effect is such that neural activity increases with increased entropy, as predicted by competition models, or decreases with increased entropy, as predicted by the LEDP.

A third variable that we calculated was cohort frequency, which is the summed frequency of all the words in the cohort. The prior literature does not afford a specific hypothesis about the potential significance of this variable. However, in light of the pervasiveness of frequency effects in language processing, we examine the effect of this variable informally, and leave an in-depth investigation of its significance for future research.

In light of prior work suggesting a key role of error detection/surprisal in auditory processing, as well as work suggesting a predictive role for morpheme recognition in auditory processing, we expect to observe an effect of surprisal which is enhanced in morphologically complex words, relative to simple words. This interaction with morphological complexity should apply to the categorical variable of continuation surprisal, as well as to any effects of phoneme-by-phoneme surprisal.

We furthermore test for facilitatory effects of entropy early in the stimulus, as predicted by LEDP models. While cohort competition models predict an inhibitory effect of entropy during word comprehension, LEDP models would predict facilitation near word onset, when entropy is high.

In accordance with the results of Gagnepain et al. (2012), we expect to see effects of surprisal in auditory regions such as the superior temporal gyrus (STG) and transverse temporal gyrus (TTG). Given that morphological decomposition involves accessing lexical entries, we may also expect to see evidence of the predicted interaction with morphological complexity in the middle temporal gyrus (MTG). These will serve as our three regions of interest (ROIs) for the analysis.

Section snippets

Design and stimuli

The experiment consisted of an auditory lexical decision task, with simultaneous MEG recording of the magnetic fields induced by electrical activity in the brain. The factorial design included two two-level stimulus variables of interest: morphological complexity (bimorphemic and monomorphemic) and continuation surprisal (high- and low-surprisal continuations).

Stimulus words were chosen in pairs from the English Lexicon Project (ELP) (Balota et al., 2007). For bimorphemic words, continuations

Regions of interest analysis

Analysis was conducted on three ROIs in the temporal lobe: the transverse temporal gyrus (auditory cortex), superior temporal gyrus, and middle temporal gyrus. The anatomical FreeSurfer labels (Desikan et al., 2006) corresponding to these regions served as the ROIs for the analysis. The inverse solution over all trials was calculated within the target label of each individual subject. The transverse temporal, superior temporal, and middle temporal ROIs, and grand average activation (at around

Behavioral

Both accuracy and reaction time (RT) showed significant effects of morphological complexity (RT, p = .002; accuracy, p < .001) and continuation surprisal (RT, p < .001; accuracy, p < .001), such that monomorphemic words were responded to more slowly and less accurately than bimorphemic words, and high surprisal items were responded to more slowly and less accurately than low surprisal items. No significant interactions emerged between morphological complexity and continuation surprisal (RT, p = .124;

Discussion

This study attempted to address a number of questions, both theoretical and methodological. On the methodological level, the study sought to explore the effectiveness of MEG for investigating the role of morphology in phoneme prediction during auditory word recognition, as well as the effectiveness of millisecond-by-millisecond correlation of MEG data with information-theoretic variables time-locked to phoneme boundaries within stimuli.

Our results suggest positive answers to both of these

Conclusion

Recent work in auditory word recognition has provided evidence for phoneme-level prediction occurring during the processing of a spoken word. Additional work points to an interaction of these predictive processes with a word’s morphological structure. Drawing on these two lines of research, we investigated the role of morphological structure in phoneme prediction, while additionally exploring the precise manner in which the cohort of words consistent with the input affects the neural processes

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No. BCS-0843969 and by the NYU Abu Dhabi Research Council under Grant G1001 from the NYUAD Institute, New York University Abu Dhabi. We thank Jeff Walker for technical assistance.

References (42)

  • W.D. Marslen-Wilson et al. Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology (1978)
  • J.L. McClelland et al. The TRACE model of speech perception. Cognitive Psychology (1986)
  • R.C. Oldfield. The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia (1971)
  • O. Solomyak et al. Lexical access in early stages of visual word processing: A single-trial correlational MEG study of heteronym recognition. Brain and Language (2009)
  • M. Taft et al. Lexical storage and retrieval of prefixed words. Journal of Verbal Learning and Verbal Behavior (1975)
  • L.H. Wurm. Auditory processing of prefixed English words is both continuous and decompositional. Journal of Memory and Language (1997)
  • Y. Adachi et al. Reduction of non-periodic environmental magnetic noise in MEG measurement by continuously adjusted least squares method. IEEE Transactions on Applied Superconductivity (2001)
  • R.H. Baayen et al. Lexical dynamics for low-frequency complex words: A regression study across tasks and modalities. The Mental Lexicon (2007)
  • L.W. Balling et al. Morphological effects in auditory word recognition: Evidence from Danish. Language and Cognitive Processes (2008)
  • L.W. Balling et al. Probability and surprisal in auditory comprehension of morphologically complex words. Cognition (2012)
  • D.A. Balota et al. The English lexicon project. Behavior Research Methods (2007)