Impaired multisensory processing in schizophrenia: Deficits in the visual enhancement of speech comprehension under noisy environmental conditions
Introduction
The integration of heard speech signals with the seen articulatory movements of a speaker's face and mouth is essential for everyday communication, as seeing a speaker's face substantially facilitates the recognition of spoken words, especially under noisy listening conditions (e.g. Erber, 1969, Grant and Seitz, 2000, Munhall et al., 2004a, O'Neill, 1954, Ross et al., 2007, Sumby and Pollack, 1954). The brain processes underlying this multisensory speech integration are presently under intense investigation (Bernstein et al., 2004, Callan et al., 2003, Calvert, 2001, Calvert and Campbell, 2003, Campbell and MacSweeney, 2004, Munhall et al., 2002, Munhall et al., 2004b) and investigators have now begun to explore whether there is a specific role for multisensory processes in some of the perceptual deficits seen in disorders such as autism (Iarocci and McDonald, 2006, Kern, 2002) and schizophrenia (de Gelder et al., 2003). In schizophrenia, past research has established the existence of robust within-modality deficits, with early auditory and visual sensory processing both shown to be impaired (e.g. Butler et al., 2006, Foxe et al., 2001, Foxe et al., 2005, Schwartz et al., 2001). Given these early unisensory deficits, there is good reason to predict that multisensory processes, which clearly rely on the fidelity of early sensory inputs from the respective unisensory systems, will show similar, if not greater impairment. Given extensive physiological evidence that multisensory integration can act as a non-linear gain mechanism (Foxe and Schroeder, 2005, Meredith and Stein, 1986, Molholm et al., 2004, Molholm et al., 2006, Stein et al., 2001, Stein et al., 2002, Schroeder and Foxe, 2005), it seems a reasonable prediction that multisensory processing might well be especially impaired in this population.
Indeed, recent evidence does suggest multisensory processing deficits in schizophrenia. In a cleverly constructed study, de Gelder and colleagues (de Gelder et al., 2003) used a variant of the so-called “McGurk illusion” (McGurk and MacDonald, 1976, Saint-Amour et al., 2006) to assess whether patients have deficits in integrating auditory and visual speech. For the reader unfamiliar with the McGurk illusion, the following example will be helpful. When participants attend to a video of a speaker articulating the syllable /ga/ while listening to the incongruent auditory syllable /ba/, they typically report perceiving the fused syllable /da/, and this occurs despite the fact that the /da/ syllable was neither heard nor seen. There are numerous other examples of these phonemic fusions, and the illusion is so strong that even when the listener is fully apprised of the ‘trick,’ it is difficult or even impossible to suppress it (Massaro, 1998). In de Gelder's experiment, patients were much less susceptible to these illusory fusions than healthy participants, whereas performance in an audiovisual control task involving spatial localization of sounds remained unimpaired. The authors hypothesized that if there was a general deficit in multisensory integration, patients would show decrements in both tasks. The results favored the notion of an isolated deficit related to the integration of phonetic information. However, somewhat contradictory evidence comes from a study by Surguladze et al., where schizophrenia patients and controls showed similar susceptibility to fusions in a McGurk-type experiment (Surguladze et al., 2001).
In both of these previous studies, the premise was that susceptibility to McGurk fusions would index an intact audiovisual integration system, although it should also be pointed out that a small proportion of healthy control subjects do not experience McGurk fusions. Nonetheless, the vast majority of normal observers do in fact perceive these fusions, and so these studies took advantage of this fact to assess whether, on average, patients would experience lower levels of fusion. The McGurk-task, however, where mostly simple syllables are used, could be considered a rather indirect and non-ecological means of assessing multisensory performance. Due in part to the rather artificial nature of the McGurk-task, we reasoned that testing patients with schizophrenia on an audiovisual task using real words as opposed to syllables would provide a better test of their abilities for audiovisual integration of speech in real-life situations. More importantly, it has also been suggested that an impairment in auditory speech recognition in general (Hoffman et al., 1999, Lebib et al., 2003), and the integration of auditory and visual speech in particular (Surguladze et al., 2001), is most likely to manifest itself in situations where the auditory signal is degraded. We would therefore expect a deficit in speech processing to predominate when patients are asked to identify speech under noisy environmental conditions that are more typical of normal everyday social situations.
Furthermore, we expected to find the most robust deficit in multisensory speech perception under environmental conditions where healthy control subjects usually experience the most benefit from seeing the speaker's articulations. In a recent experiment from our laboratory (Ross et al., 2007), we showed that the gain derived from viewing visual articulations is maximal at intermediate signal-to-noise ratios (SNRs) in healthy volunteers. Here, we investigated the ability of patients with schizophrenia to integrate visual and auditory speech. Our objective was to determine to what extent they experience benefit from visual articulation and to detail under what listening conditions (SNRs) they might show the greatest impairments. To that end, we assessed their ability to recognize auditory and audiovisual speech in different levels of noise and compared their performance with that of healthy volunteers. We used a large, normed set of monosyllabic words as our stimuli in order to more closely approximate performance in everyday situations without delivering semantic, grammatical or prosodic context.
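One simple way to quantify the visual benefit described above is the difference between audiovisual (AV) and auditory-alone (A) recognition scores at each SNR level. The sketch below illustrates that idea only: the function names are hypothetical, the percent-correct values are invented for illustration and are not data from this study, and the SNR levels are labeled 1 to 7 because the actual dB values are not given in this excerpt.

```python
from typing import List, Sequence


def av_gain(a_scores: Sequence[float], av_scores: Sequence[float]) -> List[float]:
    """Return the AV-minus-A difference in percent correct at each SNR level."""
    if len(a_scores) != len(av_scores):
        raise ValueError("A and AV score lists must have the same length")
    return [av - a for a, av in zip(a_scores, av_scores)]


def peak_gain_level(levels: Sequence[int], gains: Sequence[float]) -> int:
    """Return the SNR level label at which the gain is largest."""
    best_index = max(range(len(gains)), key=lambda i: gains[i])
    return levels[best_index]


# Hypothetical scores (percent of words recognized); NOT data from the study.
snr_levels = [1, 2, 3, 4, 5, 6, 7]  # 1 = noisiest, 7 = least noisy (assumed labels)
a_pct = [5.0, 10.0, 20.0, 35.0, 55.0, 75.0, 90.0]
av_pct = [15.0, 30.0, 55.0, 75.0, 85.0, 92.0, 96.0]

gains = av_gain(a_pct, av_pct)
peak = peak_gain_level(snr_levels, gains)
print("Gain per SNR level:", gains)
print("Largest gain at level:", peak)
```

With these made-up numbers the gain peaks at an intermediate level, which is the pattern the text attributes to healthy volunteers. Comparing such gain curves between groups is one way to ask where along the SNR range a deficit is largest.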
Subjects
Informed consent was obtained from 18 patients (1 woman, mean age: 39, SD: 10.6) meeting the DSM-IV criteria for schizophrenia (n = 15) or schizoaffective disorder (n = 3) and 18 healthy volunteers (7 women, mean age: 35, SD: 11.6) at the Nathan Kline Institute (NKI) for Psychiatric Research (Orangeburg, NY). NKI's Institutional Review Board approved all procedures. Please refer to Table 1 for the sample characteristics of the patients with schizophrenia. All patients and controls had normal or
Results
A 2 × 7 × 2 repeated measures analysis of variance (RM-ANOVA) with the within-subject factors of condition (A and AV) and SNR level (1–7) and the between-groups factor of group (patients, P, vs. controls, C) was employed to analyze the data. Overall, the level of noise affected recognition performance significantly in both conditions, F(1, 34) = 1871.4, p < 0.001, η² = 0.98; the lower the SNR, the fewer words were recognized (see Fig. 1). In the auditory-alone (A) condition we can see a monotonic increase ranging from a
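As a reading aid for the effect size reported above, the sketch below recovers η² from the F statistic and its degrees of freedom. It assumes the reported value is partial eta squared, which the excerpt does not state explicitly, and uses the standard identity partial η² = F·df_effect / (F·df_effect + df_error). The function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Values as reported in the text for the effect of noise level.
eta = partial_eta_squared(f_value=1871.4, df_effect=1, df_error=34)
print(round(eta, 2))
```

Under that assumption the result rounds to the 0.98 given in the text, so the reported F, degrees of freedom and effect size are mutually consistent.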
Discussion
Here, we set out to assess the integrity of multisensory audiovisual processing in patients with schizophrenia, with an emphasis on determining whether patients would benefit as much as healthy controls from seeing speakers' articulations while trying to recognize spoken words embedded in various levels of background noise. A rather surprising finding was that despite very well-characterized deficits in early unisensory auditory processing (e.g. Javitt et al., 1993, Javitt et al., 1995, Javitt
Conclusions
In conclusion, patients with schizophrenia showed deficits in their ability to derive benefit from visual articulatory motion while unisensory auditory speech perception remained fully intact. It is possible that dysfunction in audio–visual speech integration is related to a well-characterized dysfunction of the dorsal visual processing stream but this remains to be explicitly examined.
Role of funding source
Support for this work was provided by grants to Professor Foxe from the National Institute of Mental Health (MH65350) and the National Institute on Aging (AG22696) and to Dr. Javitt from the National Institute of Mental Health (MH49334 and MH01439). Ms. Leavitt was supported by a Ruth L. Kirschstein pre-doctoral fellowship (NRSA - MH074284) from the National Institute of Mental Health (NIMH). Dr. Molholm was supported by a Ruth L. Kirschstein post-doctoral fellowship (NRSA - MH068174) from the
Contributors
Mr. Ross designed the stimulus sequences, programmed all paradigms, analyzed all data and wrote the first draft of the manuscript. Dr. Saint-Amour aided in the design and setup of the experimental paradigm, provided statistical help and commented critically on multiple drafts of the manuscript. Professor Foxe designed the experimental protocol and edited multiple drafts of the manuscript. Ms. Leavitt helped in the collection of data, the editing and preparation of the video and audio clip
Conflict of interest
All authors declare no conflicts of interest, financial or otherwise.
Acknowledgements
We are deeply indebted to the team at the Cognitive Neurophysiology Laboratory for their dedication and hard work. Thanks also go to Ms. Gail Silipo for her assistance in recruiting subjects and her enduring dedication to the patients. The principal investigator, Dr. Foxe, takes responsibility for the integrity of the data and the accuracy of the data analysis, and attests that all authors had full access to all the data in the study.
References (94)
- et al. Social perception from visual cues: role of the STS region. Trends Cogn. Sci. (2000)
- et al. Visual motion integration in schizophrenia patients, their first-degree relatives, and patients with bipolar disorder. Schizophr. Res. (2005)
- Language disorder in schizophrenia as a developmental learning disorder. Schizophr. Res. (2005)
- et al. Audio-visual integration in schizophrenia. Schizophr. Res. (2003)
- et al. Impairment of early cortical processing in schizophrenia: an event-related potential confirmation study. Biol. Psychiatry (1993)
- et al. Impaired mismatch negativity (MMN) generation in schizophrenia as a function of stimulus deviance, probability, and interstimulus/interdeviant interval. Electroencephalogr. Clin. Neurophysiol. (1998)
- et al. Associated deficits in mismatch negativity generation and tone matching in schizophrenia. Clin. Neurophysiol. (2000)
- et al. New atypical antipsychotic medications. J. Psychiatr. Res. (1998)
- The possible role of the cerebellum in autism/PDD: disruption of a multisensory feedback loop. Med. Hypotheses (2002)
- et al. Impaired visual recognition of biological motion in schizophrenia. Schizophr. Res. (2005)
- Magnocellular contributions to impaired motion processing in schizophrenia. Schizophr. Res.
- Evidence of a visual-to-auditory cross-modal sensory gating phenomenon as reflected by the human P50 event-related brain potential modulation. Neurosci. Lett.
- Sensory contributions to impaired prosodic processing in schizophrenia. Biol. Psychiatry
- Prefrontal cortex dysfunction during working memory performance in schizophrenia: reconciling discrepant findings. Schizophr. Res.
- Schizophrenic subjects activate dorsolateral prefrontal cortex during a working memory task, as measured by fMRI. Biol. Psychiatry
- What has MMN revealed about the auditory system in schizophrenia? Int. J. Psychophysiol.
- Duration mismatch negativity in biological relatives of patients with schizophrenia spectrum disorders. Biol. Psychiatry
- Semantic and syntactic processes during sentence comprehension in patients with schizophrenia: evidence from event-related potentials. Schizophr. Res.
- Multisensory contributions to low-level, “Unisensory” processing. Curr. Opin. Neurobiol.
- A review of MRI findings in schizophrenia. Schizophr. Res.
- Nonvisual influences on visual-information processing in the superior colliculus. Prog. Brain Res.
- Audio–visual speech perception in schizophrenia: an fMRI study. Psychiatry Res.
- Lexical competition and spoken word identification in schizophrenia. Schizophr. Res.
- Do you hear what I hear? Neural correlates of thought disorder during listening to speech in schizophrenia. Schizophr. Res.
- Deficits in automatically detecting changes in conjunction of auditory features in patients with schizophrenia. Psychophysiology
- Speech and language disorders in children and adolescents with schizophrenia. Schizophr. Bull.
- Overview of speech intelligibility. Proc. IOA
- Audiovisual Speech Binding: Convergence or Association?
- Spatial statistics of gaze fixations during dynamic face processing. Social Neurosci.
- Speech perception in schizophrenia. Br. J. Psychiatry
- Visual white matter integrity in schizophrenia. Am. J. Psychiatry
- Subcortical visual dysfunction in schizophrenia drives secondary cortical impairments. Brain
- Neural processes underlying perceptual enhancement by visual speech gestures. NeuroReport
- Crossmodal processing in the human brain: insights from functional neuroimaging studies. Cereb. Cortex
- Reading speech from still and moving faces: the neural substrates of visible speech. J. Cogn. Neurosci.
- Neuroimaging Studies of Cross-Modal Plasticity and Language Processing in Deaf People
- Evidence for early-childhood, pan-developmental impairment specific to schizophreniform disorder: results from a longitudinal birth cohort. Arch. Gen. Psychiatry
- Speech perception aids for hearing-impaired people: current status and needed research. J. Acoust. Soc. Am.
- Auditory–visual speech perception and aging. Ear Hear.
- Audiovisual integration in perception of real words. Percept. Psychophys.
- Impaired visual object recognition and dorsal/ventral stream interaction in schizophrenia. Arch. Gen. Psychiatry
- The automatic synthesis of speech. Proc. Natl. Acad. Sci. U.S.A.
- Interaction of audition and vision in the recognition of oral speech stimuli. J. Speech Hear. Res.
- Structured Clinical Interview for DSM-IV
- The case for a feedforward component in multisensory integration mechanisms. NeuroReport
- Early visual processing deficits in schizophrenia: impaired P1 generation revealed by high-density electrical mapping. NeuroReport
- Filling-in in schizophrenia: a high-density electrical mapping and source-analysis investigation of illusory contour processing. Cereb. Cortex