The role of morphology in phoneme prediction: Evidence from MEG
Introduction
Recent work has illuminated the role of morphology in visual word recognition. Evidence from both behavioral and brain-based studies strongly indicates that visually presented words are decomposed into morphemes based on their visual forms, and that this visual decomposition feeds lexical access for the lexical information associated with morphemes (Fiorentino and Poeppel, 2007a, Fiorentino and Poeppel, 2007b, Rastle et al., 2004, Solomyak and Marantz, 2010, inter alia). Recognition of visually complex words, then, follows the decomposition, look-up, and recomposition model championed by Taft and others based on behavioral data (Taft, 2004, Taft and Forster, 1975).
The role of morphological structure in auditory word recognition is less studied and less well understood, though it has been known for some time that morphological structure plays a role in auditory word recognition as well (Marslen-Wilson, Tyler, Waksler, & Older, 1994). Recent studies by Baayen, Wurm, and colleagues point to a predictive role for morpheme recognition during auditory word recognition (Balling and Baayen, 2008, Balling and Baayen, 2012, Wurm, 1997, Wurm et al., 2006). In particular, Balling and Baayen (2008, 2012) contrast two general models of recognition. On the first, all of the full words consistent with the auditory input, the full word cohort of the word eventually recognized, are activated during recognition, with their competing representations compared against the incoming acoustic signal (Marslen-Wilson, 1987, Marslen-Wilson and Welsh, 1978). On this class of models, the morphological decomposition of members of the cohort would not be relevant to cohort competition and would not interact with prediction of upcoming phonemes from the cohort consistent with the already processed phonemes. A second model would leave a role for recognition of component morphemes of the word being processed (Balling and Baayen, 2008, Balling and Baayen, 2012). Recognition of a morpheme, for example a morphological prefix or a stem, would yield predictions for upcoming morphemes, and upcoming phonemes as part of these morphemes. We test the hypothesis that this additional prediction associated with morphological structure might enhance the prediction of upcoming phonemes based on the cohort of full words consistent with the input.
In addition to questions associated with the role of morphology in auditory word recognition, the actual mechanisms, both cognitive and neural, whereby a cohort of possible words influences prediction and processing of the incoming speech stream have also been addressed in the recent literature. On the one hand, one might endorse a competition model in which all members of a cohort are activated to an extent proportional to their frequency of occurrence, with lateral competition between activated cohort members (Marslen-Wilson, 1987, Marslen-Wilson and Welsh, 1978). On this view, the more members of a cohort and the more evenly distributed their frequency, the more cognitive and neural activity associated with activation and active inhibition. In a recent paper, Gagnepain, Henson, and Davis (2012) suggest that cohort competition might not be a main driver of brain activity associated with auditory word recognition. Rather, they suggest, the unexpectedness of the incoming input given the probability distribution over possible continuations from what has already been processed is the factor that drives neural activity and affects response times. They suggest that activity from areas around auditory cortex reflects the mismatch between predicted and incoming phonemic material: a prediction error signal. Their results are also consistent with the activity reflecting surprisal, which quantifies the change in the probability distribution over the members of the active cohort based on the incoming stimulus (Hale, 2001).
For the role of morphology in auditory word recognition, Balling and Baayen (2012) propose that online morphological decomposition leads to prediction, for stems from morphological prefixes and for suffixes from stems, that is observable in reaction times in lexical decision. All other things being equal, morphologically complex words are recognized more easily in the auditory modality than suitably matched monomorphemic words (Balling and Baayen, 2008, Ji et al., 2011). From Gagnepain et al. (2012), we derive the hypothesis that cohorts of words consistent with auditory input manifest neurally not through competition, where higher entropy among members of a cohort would lead to more activity, but through surprisal, where changes in the probability distribution over cohort members, with new input, might drive activity, or through prediction error, where observed lower probability continuations conflict with their higher probability cohort members. From neural and behavioral work, we might be led to an opposite conclusion about cohort competition at the beginning of the recognition of auditorily presented words: the higher the entropy over the cohort, the less activation we will observe (Baayen et al., 2007, Linzen et al., 2013, Wurm et al., 2006). On this view, which we will refer to as the Low-Entropy Dependent Prediction model (LEDP), higher entropy prevents commitment to prediction for upcoming input, leading to less predictive work and less neural activity, while low entropy allows more commitment to continuations, thus more work and more activity.
The present study is a preliminary look at whether contemporary magnetoencephalography (MEG) measurement and analysis techniques are suitable for investigating the role of morphology and of cohort entropy online, as participants listen to spoken language. In particular, we ask whether morphological complexity enhances the prediction of upcoming phonemic material such that we see more neural activity associated with surprisal during auditory word processing for morphologically complex as opposed to monomorphemic words.
To approach these questions we manipulated morphological complexity and surprisal in a 2-by-2 factorial design. The morphological complexity manipulation consisted of two morpheme conditions: monomorphemic and bimorphemic words. The surprisal manipulation consisted of two continuation surprisal conditions: high- and low-surprisal continuations of a shared string prefix. For bimorphemic pairs (e.g., bruises/bruiser), this string prefix consisted of a morphological stem (bruis-), and the continuations consisted of differing suffixes (-es/-er). For monomorphemic pairs (e.g., bourbon/burble), this shared string prefix consisted of phonological material that did not constitute a morpheme ([bɹb]), and the continuations were the final phonemes of the words, which similarly were not morphemes (e.g., -ən/-əl).
Each participant heard both words in each pair. This introduced a third variable: order of presentation within pair. Given evidence of long-distance morphological priming in auditory word recognition (Kouider & Dupoux, 2009), this variable is of particular note for bimorphemic pairs. As such, the ordering position of a stimulus relative to its pair counterpart was also taken into account in the analysis.
In addition to the factorial design described above, we also investigate whether we can detect neural activity related to individual phoneme prediction on a trial-by-trial basis (Gagnepain et al., 2012). We used a frequency database to calculate the probability distribution of all English words compatible with the string prefix at each time point in each trial. This enabled us to derive millisecond-by-millisecond estimates of two information-theoretic quantities: phoneme surprisal and cohort entropy.
The surprisal of the phoneme that is currently being heard is the negative log of its conditional probability given the phonemes that preceded it. This probability can be calculated by dividing the total frequency of the present cohort by the total frequency of the cohort that was “alive” prior to hearing the current phoneme. Formally, if PrefFreq(x) is the summed frequency of all words that start with the phoneme sequence x, then the surprisal of the third phoneme u in bruiser ([bruzr]) would be given by

$$\mathrm{Surprisal}(u \mid br) = -\log \frac{\mathrm{PrefFreq}(bru)}{\mathrm{PrefFreq}(br)}$$
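As a minimal sketch of the surprisal calculation described above, the following Python snippet computes PrefFreq and phoneme surprisal over a toy lexicon. The lexicon, its frequency counts, the one-character-per-phoneme transcriptions (other than [bruzr]), the function names, and the choice of base-2 logarithms are all assumptions made for illustration; they are not taken from the study's frequency database.

```python
import math

# Hypothetical toy lexicon: each key is a phoneme string (one character per
# phoneme) and each value is a made-up frequency count. Illustrative only.
TOY_LEXICON = {
    "bruzr": 10,   # "bruiser", transcription taken from the text
    "bruzIz": 30,  # "bruises" (assumed transcription)
    "brIk": 40,    # "brick" (assumed transcription)
    "brEd": 20,    # "bread" (assumed transcription)
    "kat": 100,    # "cat" (assumed transcription)
}


def pref_freq(lexicon, prefix):
    """Summed frequency of all words that start with `prefix` (PrefFreq)."""
    return sum(freq for word, freq in lexicon.items() if word.startswith(prefix))


def phoneme_surprisal(lexicon, word, index):
    """Surprisal, in bits, of the phoneme at `index` given the preceding phonemes."""
    before = pref_freq(lexicon, word[:index])      # cohort alive before the phoneme
    after = pref_freq(lexicon, word[:index + 1])   # cohort still alive after it
    if before == 0 or after == 0:
        raise ValueError("prefix is not attested in the lexicon")
    return -math.log2(after / before)


# Third phoneme "u" of "bruzr": PrefFreq("bru") = 40 and PrefFreq("br") = 100,
# so the conditional probability is 0.4 and the surprisal is -log2(0.4).
print(phoneme_surprisal(TOY_LEXICON, "bruzr", 2))
```

In this toy lexicon the second phoneme of "bruzr" has zero surprisal, because every word beginning with "b" continues with "r"; the third phoneme is more surprising because it eliminates more than half of the cohort's frequency mass.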
The cohort entropy is the entropy of the probability distribution over all words that are compatible with the string prefix heard thus far. If C is the cohort, f(w) is the frequency of a word w, and F_C is the total frequency of the cohort, then the cohort entropy is given by:

$$H(C) = -\sum_{w \in C} \frac{f(w)}{F_C} \log \frac{f(w)}{F_C}$$
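The cohort entropy defined above can be sketched in Python as follows. The toy lexicon, its frequencies, the transcriptions, the function names, and the use of base-2 logarithms are assumptions for illustration only and do not come from the study.

```python
import math

# Hypothetical toy lexicon (made-up frequencies, one character per phoneme).
TOY_LEXICON = {
    "bruzr": 10,   # "bruiser", transcription taken from the text
    "bruzIz": 30,  # "bruises" (assumed transcription)
    "brIk": 40,    # "brick" (assumed transcription)
    "brEd": 20,    # "bread" (assumed transcription)
}


def cohort(lexicon, prefix):
    """All words compatible with the prefix heard so far, with their frequencies."""
    return {word: freq for word, freq in lexicon.items() if word.startswith(prefix)}


def cohort_entropy(lexicon, prefix):
    """Entropy, in bits, of the frequency-based distribution over the cohort."""
    members = cohort(lexicon, prefix)
    total = sum(members.values())  # F_C, the total frequency of the cohort
    if total == 0:
        raise ValueError("prefix is not attested in the lexicon")
    return -sum(
        (freq / total) * math.log2(freq / total)
        for freq in members.values()
        if freq > 0  # zero-frequency words contribute nothing to the sum
    )


# Entropy falls as the prefix grows and the cohort narrows.
for prefix in ["b", "br", "bru", "bruzr"]:
    print(prefix, cohort_entropy(TOY_LEXICON, prefix))
```

Once the prefix picks out a single word, the distribution is degenerate and the entropy is zero; a cohort of two equally frequent words would have an entropy of exactly one bit.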
We predict that higher phoneme surprisal should lead to increased neural activity. We also investigate whether we can observe an effect of cohort entropy, and, if so, whether the effect is such that neural activity increases with increased entropy, as predicted by competition models, or decreases with increased entropy, as predicted by the LEDP.
A third variable that we calculated was cohort frequency, which is the summed frequency of all the words in the cohort. The prior literature does not afford a specific hypothesis about the potential significance of this variable. However, in light of the pervasiveness of frequency effects in language processing, we examine the effect of this variable informally, and leave an in-depth investigation of its significance for future research.
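To illustrate how cohort frequency evolves as a word unfolds, here is a short Python sketch that returns the summed frequency of the surviving cohort after each phoneme. The toy lexicon, its counts, the transcriptions, and the function name are hypothetical and serve only to illustrate the definition.

```python
# Hypothetical toy lexicon (made-up frequencies, one character per phoneme).
TOY_LEXICON = {
    "bruzr": 10,   # "bruiser", transcription taken from the text
    "bruzIz": 30,  # "bruises" (assumed transcription)
    "brIk": 40,    # "brick" (assumed transcription)
    "brEd": 20,    # "bread" (assumed transcription)
    "kat": 100,    # "cat" (assumed transcription)
}


def cohort_frequency_by_phoneme(lexicon, word):
    """Summed frequency of the surviving cohort after each phoneme of `word`."""
    trajectory = []
    for end in range(1, len(word) + 1):
        prefix = word[:end]
        # Cohort frequency: total frequency of all words compatible with the prefix.
        trajectory.append(
            sum(freq for w, freq in lexicon.items() if w.startswith(prefix))
        )
    return trajectory


print(cohort_frequency_by_phoneme(TOY_LEXICON, "bruzr"))
```

By construction the trajectory can only stay flat or decrease, since each new phoneme can remove words from the cohort but never add them.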
In light of prior work suggesting a key role of error detection/surprisal in auditory processing, as well as work suggesting a predictive role for morpheme recognition in auditory processing, we expect to observe an effect of surprisal which is enhanced in morphologically complex words, relative to simple words. This interaction with morphological complexity should apply to the categorical variable of continuation surprisal, as well as to any effects of phoneme-by-phoneme surprisal.
We furthermore test for facilitatory effects of entropy early in the stimulus, as predicted by LEDP models. While cohort competition models predict an inhibitory effect of entropy during word comprehension, LEDP models would predict facilitation near word onset, when entropy is high.
In accordance with the results of Gagnepain et al. (2012), we expect to see effects of surprisal in auditory regions such as the superior temporal gyrus (STG) and transverse temporal gyrus (TTG). Given that morphological decomposition involves accessing lexical entries, we may also expect to see evidence of the predicted interaction with morphological complexity in the middle temporal gyrus (MTG). These will serve as our three regions of interest (ROIs) for the analysis.
Design and stimuli
The experiment consisted of an auditory lexical decision task, with simultaneous MEG recording of the magnetic fields induced by electrical activity in the brain. The factorial design included two two-level stimulus variables of interest: morphological complexity (bimorphemic and monomorphemic) and continuation surprisal (high- and low-surprisal continuations).
Stimulus words were chosen in pairs from the English Lexicon Project (ELP) (Balota et al., 2007). For bimorphemic words, continuations
Regions of interest analysis
Analysis was conducted on three ROIs in the temporal lobe: the transverse temporal gyrus (auditory cortex), superior temporal gyrus, and middle temporal gyrus. The anatomical FreeSurfer labels (Desikan et al., 2006) corresponding to these regions served as the ROIs for the analysis. The inverse solution over all trials was calculated within the target label of each individual subject. The transverse temporal, superior temporal, and middle temporal ROIs, and grand average activation (at around
Behavioral
Both accuracy and reaction time (RT) showed significant effects of morphological complexity (RT, p = .002; accuracy, p < .001) and continuation surprisal (RT, p < .001; accuracy, p < .001), such that monomorphemic words were responded to more slowly and less accurately than bimorphemic words, and high surprisal items were responded to more slowly and less accurately than low surprisal items. No significant interactions emerged between morphological complexity and continuation surprisal (RT, p = .124;
Discussion
This study attempted to address a number of questions, both theoretical and methodological. On the methodological level, the study sought to explore the effectiveness of MEG for investigating the role of morphology in phoneme prediction during auditory word recognition, as well as the effectiveness of millisecond-by-millisecond correlation of MEG data with information-theoretic variables time-locked to phoneme boundaries within stimuli.
Our results suggest positive answers to both of these
Conclusion
Recent work in auditory word recognition has provided evidence for phoneme-level prediction occurring during the processing of a spoken word. Additional work points to an interaction of these predictive processes with a word’s morphological structure. Drawing on these two lines of research, we investigated the role of morphological structure in phoneme prediction, while additionally exploring the precise manner in which the cohort of words consistent with the input affects the neural processes
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grant No. BCS-0843969 and by the NYU Abu Dhabi Research Council under Grant G1001 from the NYUAD Institute, New York University Abu Dhabi. We thank Jeff Walker for technical assistance.
References (42)
- et al. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language (2008).
- et al. Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language (2013).
- et al. Dynamic statistical parametric mapping: Combining fMRI and MEG for high-resolution imaging of cortical activity. Neuron (2000).
- et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage (2006).
- et al. Processing of compound words: An MEG study. Brain and Language (2007).
- et al. Temporal predictive codes for spoken words in auditory cortex. Current Biology (2012).
- et al. Benefits and costs of lexical decomposition and semantic integration during the processing of transparent and opaque English compounds. Journal of Memory and Language (2011).
- et al. Episodic accessibility and morphological processing: Evidence from long-term auditory priming. Acta Psychologica (2009).
- et al. Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods (2007).
- Functional parallelism in spoken word-recognition. Cognition (1987).
- Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology.
- The TRACE model of speech perception. Cognitive Psychology.
- The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia.
- Lexical access in early stages of visual word processing: A single-trial correlational MEG study of heteronym recognition. Brain and Language.
- Lexical storage and retrieval of prefixed words. Journal of Verbal Learning and Verbal Behavior.
- Auditory processing of prefixed English words is both continuous and decompositional. Journal of Memory and Language.
- Reduction of non-periodic environmental magnetic noise in MEG measurement by continuously adjusted least squares method. IEEE Transactions on Applied Superconductivity.
- Lexical dynamics for low-frequency complex words: A regression study across tasks and modalities. The Mental Lexicon.
- Morphological effects in auditory word recognition: Evidence from Danish. Language and Cognitive Processes.
- Probability and surprisal in auditory comprehension of morphologically complex words. Cognition.
- The English lexicon project. Behavior Research Methods.