Elsevier

Schizophrenia Research

Volume 97, Issues 1–3, December 2007, Pages 173-183

Impaired multisensory processing in schizophrenia: Deficits in the visual enhancement of speech comprehension under noisy environmental conditions

https://doi.org/10.1016/j.schres.2007.08.008

Abstract

Background

Viewing a speaker's articulatory movements substantially improves a listener's ability to understand spoken words, especially under noisy environmental conditions. In this study we investigated the ability of patients with schizophrenia to integrate visual and auditory speech. Our objective was to determine to what extent they experience benefit from visual articulation and to detail under what listening conditions they might show the greatest impairments.

Methods

We assessed the ability to recognize auditory and audiovisual speech in different levels of noise in 18 patients with schizophrenia and compared their performance with that of 18 healthy volunteers. We used a large set of monosyllabic words as our stimuli in order to more closely approximate performance in everyday situations.

Results

Patients with schizophrenia showed deficits in their ability to derive benefit from visual articulatory motion. This impairment was most pronounced at signal-to-noise levels where multisensory gain is known to be maximal in healthy control subjects. A surprising finding was that despite known early auditory sensory processing deficits and reports of impairments in speech processing in schizophrenia, patients' performance in unisensory auditory speech perception remained fully intact.

Conclusions

Thus, the results showed a specific deficit in multisensory speech processing in the absence of any measurable deficit in unisensory speech processing and suggest that sensory integration dysfunction may be an important and, to date, rather overlooked aspect of schizophrenia.

Introduction

The integration of heard speech signals with the seen articulatory movements of a speaker's face and mouth is essential for everyday communication, as seeing a speaker's face substantially facilitates the recognition of spoken words, especially under noisy listening conditions (e.g. Erber, 1969, Grant and Seitz, 2000, Munhall et al., 2004a, O'Neill, 1954, Ross et al., 2007, Sumby and Pollack, 1954). The brain processes underlying this multisensory speech integration are presently under intense investigation (Bernstein et al., 2004, Callan et al., 2003, Calvert, 2001, Calvert and Campbell, 2003, Campbell and MacSweeney, 2004, Munhall et al., 2002, Munhall et al., 2004b) and investigators have now begun to explore whether there is a specific role for multisensory processes in some of the perceptual deficits seen in disorders such as autism (Iarocci and McDonald, 2006, Kern, 2002) and schizophrenia (de Gelder et al., 2003). In schizophrenia, past research has established the existence of robust within-modality deficits, with early auditory and visual sensory processing shown to be impaired (e.g. Butler et al., 2006, Foxe et al., 2001, Foxe et al., 2005, Schwartz et al., 2001). Given these early unisensory deficits, there is good reason to predict that multisensory processes, which clearly rely on the fidelity of early sensory inputs from the respective unisensory systems, will show similar, if not greater, impairment. Given extensive physiological evidence that multisensory integration can act as a non-linear gain mechanism (Foxe and Schroeder, 2005, Meredith and Stein, 1986, Molholm et al., 2004, Molholm et al., 2006, Stein et al., 2001, Stein et al., 2002, Schroeder and Foxe, 2005), it seems a reasonable prediction that multisensory processing might well be especially impaired in this population.

Indeed, recent evidence does suggest multisensory processing deficits in schizophrenia. In a cleverly constructed study, de Gelder and colleagues (de Gelder et al., 2003) used a variant of the so-called “McGurk illusion” (McGurk and MacDonald, 1976, Saint-Amour et al., 2006) to assess whether patients have deficits in integrating auditory and visual speech. For the reader unfamiliar with the McGurk illusion, the following example will be helpful. When participants attend to a video of a speaker articulating the syllable /ga/ while listening to the incongruent auditory syllable /ba/, the listener typically reports the perception of the fused syllable /da/, and this occurs despite the fact that the /da/ syllable was neither heard nor seen. There are numerous other examples of these phonemic fusions, and the illusion is so strong that even when the listener is fully apprised of the ‘trick,’ it is difficult or even impossible to suppress it (Massaro, 1998). In de Gelder's experiment, patients were much less susceptible to these illusory fusions than healthy participants, whereas performance in an audiovisual control task involving spatial localization of sounds remained unimpaired. The authors hypothesized that if there was a general deficit in multisensory integration, patients would show decrements in both tasks. The results favored the notion of an isolated deficit related to the integration of phonetic information. However, somewhat contradictory evidence comes from a study by Surguladze et al., where schizophrenia patients and controls showed similar susceptibility to fusions in a McGurk-type experiment (Surguladze et al., 2001).

In both of these previous studies, the premise was that susceptibility to McGurk fusions would index an intact audiovisual integration system, although it should also be pointed out that a small proportion of healthy control subjects do not experience McGurk fusions. Nonetheless, the vast majority of normal observers do in fact perceive these fusions, and so these studies took advantage of this fact to assess whether, on average, patients would experience lower levels of fusion. The McGurk task, however, where mostly simple syllables are used, could be considered a rather indirect and non-ecological means of assessing multisensory performance. Due in part to the rather artificial nature of the McGurk task, we reasoned that testing patients with schizophrenia on an audiovisual task using real words as opposed to syllables would provide a better test of their abilities for audiovisual integration of speech in real-life situations. More importantly, it has also been suggested that an impairment in auditory speech recognition in general (Hoffman et al., 1999, Lebib et al., 2003), and the integration of auditory and visual speech in particular (Surguladze et al., 2001), is most likely to manifest itself in situations where the auditory signal is degraded. We would therefore expect a deficit in speech processing to predominate when patients are asked to identify speech under noisy environmental conditions that are more typical of normal everyday social situations.

Furthermore, we expected to find the most robust deficit in multisensory speech perception under environmental conditions where healthy control subjects usually experience the most benefit from seeing the speaker's articulations. In a recent experiment from our laboratory (Ross et al., 2007), we showed that the gain derived from viewing visual articulations is maximal at intermediate signal-to-noise ratios (SNRs) in healthy volunteers. Here, we investigated the ability of patients with schizophrenia to integrate visual and auditory speech. Our objective was to determine to what extent they experience benefit from visual articulation and to detail under what listening conditions (SNRs) they might show the greatest impairments. For that, we assessed their ability to recognize auditory and audiovisual speech in different levels of noise and compared their performance with that of healthy volunteers. We used a large, normed set of monosyllabic words as our stimuli in order to more closely approximate performance in everyday situations without delivering semantic, grammatical or prosodic context.
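As a rough illustration of how the benefit from visual articulation could be quantified across listening conditions, the following is a minimal sketch. It assumes that gain is taken as the simple difference between audiovisual (AV) and auditory-alone (A) percent-correct scores at each SNR level; the function names and all score values are hypothetical placeholders chosen for illustration and are not data from this study.

```python
def multisensory_gain(av_percent, a_percent):
    """Return AV minus A percent-correct for each SNR level (assumed gain measure)."""
    if len(av_percent) != len(a_percent):
        raise ValueError("AV and A score lists must cover the same SNR levels")
    return [av - a for av, a in zip(av_percent, a_percent)]


def peak_gain_level(gains):
    """Return the 1-based SNR level index at which gain is largest."""
    best_index = max(range(len(gains)), key=lambda i: gains[i])
    return best_index + 1


# Hypothetical percent-correct scores for seven SNR levels,
# ordered from the noisiest (level 1) to the cleanest (level 7).
a_scores = [5, 10, 20, 40, 60, 75, 85]    # auditory-alone (placeholder values)
av_scores = [25, 40, 60, 75, 85, 90, 95]  # audiovisual (placeholder values)

gains = multisensory_gain(av_scores, a_scores)
peak = peak_gain_level(gains)
print(gains, peak)
```

With these placeholder values the largest gain falls at an intermediate level rather than at either extreme, which is the pattern the paragraph above describes for healthy volunteers.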

Section snippets

Subjects

Informed consent was obtained from 18 patients (1 woman, mean age: 39, SD: 10.6) meeting the DSM-IV criteria for schizophrenia (n = 15) or schizoaffective disorder (n = 3) and 18 healthy volunteers (7 women, mean age: 35, SD: 11.6) at the Nathan Kline Institute (NKI) for Psychiatric Research (Orangeburg, NY). NKI's Institutional Review Board approved all procedures. Please refer to Table 1 for the sample characteristics of the patients with schizophrenia. All patients and controls had normal or

Results

A 2 × 7 × 2 repeated measures analysis of variance (RM-ANOVA) with the factors of condition (A and AV) and SNR level (1–7) and the between groups factor patients (P) vs. controls (C) was employed to analyze the data. Overall, the level of noise affected recognition performance significantly in both conditions, F(1, 34) = 1871.4, p < 0.001, η² = 0.98; the lower the SNR, the fewer words that were recognized (see Fig. 1). In the auditory-alone (A) condition we can see a monotonic increase ranging from a
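The reported effect size for the noise effect can be checked against the F statistic with a short calculation. The sketch below assumes the reported value is partial eta squared, which can be recovered from F and its degrees of freedom as (F × df_effect) / (F × df_effect + df_error); that assumption and the function name are ours, not stated in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values as reported for the effect of noise level: F(1, 34) = 1871.4
eta_p2 = partial_eta_squared(1871.4, 1, 34)
print(round(eta_p2, 2))  # 0.98
```

Under this assumption the computed value agrees with the reported effect size of 0.98.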

Discussion

Here, we set out to assess the integrity of multisensory audiovisual processing in patients with schizophrenia, with an emphasis on determining whether patients would benefit similarly to healthy controls by seeing speakers' articulations while trying to recognize spoken words embedded in various levels of background noise. A rather surprising finding was that despite very well-characterized deficits in early unisensory auditory processing (e.g. Javitt et al., 1993, Javitt et al., 1995, Javitt

Conclusions

In conclusion, patients with schizophrenia showed deficits in their ability to derive benefit from visual articulatory motion while unisensory auditory speech perception remained fully intact. It is possible that dysfunction in audio–visual speech integration is related to a well-characterized dysfunction of the dorsal visual processing stream but this remains to be explicitly examined.

Role of funding source

Support for this work was provided by grants to Professor Foxe from the National Institute of Mental Health (MH65350) and the National Institute on Aging (AG22696) and to Dr. Javitt from the National Institute of Mental Health (MH49334 and MH01439). Ms. Leavitt was supported by a Ruth L. Kirschstein pre-doctoral fellowship (NRSA - MH074284) from the National Institute of Mental Health (NIMH). Dr. Molholm was supported by a Ruth L. Kirschstein post-doctoral fellowship (NRSA - MH068174) from the

Contributors

Mr. Ross designed the stimulus sequences, programmed all paradigms, analyzed all data and wrote the first draft of the manuscript. Dr. Saint-Amour aided in the design and setup of the experimental paradigm, provided statistical help and commented critically on multiple drafts of the manuscript. Professor Foxe designed the experimental protocol and edited multiple drafts of the manuscript. Ms. Leavitt helped in the collection of data, the editing and preparation of the video and audio clip

Conflict of interest

All authors declare no conflicts of interest, financial or otherwise.

Acknowledgements

We are deeply indebted to the team at the Cognitive Neurophysiology Laboratory for their dedication and hard work. Thanks also go to Ms. Gail Silipo for her assistance in recruiting subjects and her enduring dedication to the patients. The principal investigator, Dr. Foxe, takes responsibility for the integrity of the data and the accuracy of the data analysis, and attests that all authors had full access to all the data in the study.

References (94)

  • D. Kim et al. Magnocellular contributions to impaired motion processing in schizophrenia. Schizophr. Res. (2006)
  • R. Lebib et al. Evidence of a visual-to-auditory cross-modal sensory gating phenomenon as reflected by the human P50 event-related brain potential modulation. Neurosci. Lett. (2003)
  • D.I. Leitman et al. Sensory contributions to impaired prosodic processing in schizophrenia. Biol. Psychiatry (2005)
  • D.S. Manoach. Prefrontal cortex dysfunction during working memory performance in schizophrenia: reconciling discrepant findings. Schizophr. Res. (2003)
  • D.S. Manoach et al. Schizophrenic subjects activate dorsolateral prefrontal cortex during a working memory task, as measured by fMRI. Biol. Psychiatry (1999)
  • P.T. Michie. What has MMN revealed about the auditory system in schizophrenia? Int. J. Psychophysiol. (2001)
  • P.T. Michie et al. Duration mismatch negativity in biological relatives of patients with schizophrenia spectrum disorders. Biol. Psychiatry (2002)
  • M. Ruchsow et al. Semantic and syntactic processes during sentence comprehension in patients with schizophrenia: evidence from event-related potentials. Schizophr. Res. (2003)
  • C.E. Schroeder et al. Multisensory contributions to low-level, “Unisensory” processing. Curr. Opin. Neurobiol. (2005)
  • M.E. Shenton et al. A review of MRI findings in schizophrenia. Schizophr. Res. (2001)
  • B.E. Stein et al. Nonvisual influences on visual-information processing in the superior colliculus. Prog. Brain Res. (2001)
  • S.A. Surguladze et al. Audio–visual speech perception in schizophrenia: an fMRI study. Psychiatry Res. (2001)
  • D. Titone et al. Lexical competition and spoken word identification in schizophrenia. Schizophr. Res. (2004)
  • S. Weinstein et al. Do you hear what I hear? Neural correlates of thought disorder during listening to speech in schizophrenia. Schizophr. Res. (2006)
  • C. Alain et al. Deficits in automatically detecting changes in conjunction of auditory features in patients with schizophrenia. Psychophysiology (2002)
  • C.A. Baltaxe et al. Speech and language disorders in children and adolescents with schizophrenia. Schizophr. Bull. (1995)
  • Barnett. Overview of speech intelligibility. Proc. IOA (1999)
  • L.J. Bernstein et al. Audiovisual Speech Binding: Convergence or Association? (2004)
  • J.N. Buchan et al. Spatial statistics of gaze fixations during dynamic face processing. Social Neurosci. (2007)
  • H.C. Bull et al. Speech perception in schizophrenia. Br. J. Psychiatry (1974)
  • P.D. Butler et al. Visual white matter integrity in schizophrenia. Am. J. Psychiatry (2006)
  • P.D. Butler et al. Subcortical visual dysfunction in schizophrenia drives secondary cortical impairments. Brain (2007)
  • D.E. Callan et al. Neural processes underlying perceptual enhancement by visual speech gestures. NeuroReport (2003)
  • G.A. Calvert. Crossmodal processing in the human brain: insights from functional neuroimaging studies. Cereb. Cortex (2001)
  • G.A. Calvert et al. Reading speech from still and moving faces: the neural substrates of visible speech. J. Cogn. Neurosci. (2003)
  • R. Campbell et al. Neuroimaging Studies of Cross-Modal Plasticity and Language Processing in Deaf People (2004)
  • M. Cannon et al. Evidence for early-childhood, pan-developmental impairment specific to schizophreniform disorder: results from a longitudinal birth cohort. Arch. Gen. Psychiatry (2002)
  • CHABA, Working Group on Communication Aids for the Hearing-Impaired. Speech perception aids for hearing-impaired people: current status and needed research. J. Acoust. Soc. Am. (1991)
  • K.M. Cienkowski et al. Auditory–visual speech perception and aging. Ear Hear. (2002)
  • D.J. Dekle et al. Audiovisual integration in perception of real words. Percept. Psychophys. (1992)
  • G.M. Doniger et al. Impaired visual object recognition and dorsal/ventral stream interaction in schizophrenia. Arch. Gen. Psychiatry (2002)
  • H. Dudley. The automatic synthesis of speech. Proc. Natl. Acad. Sci. U.S.A. (1939)
  • N.P. Erber. Interaction of audition and vision in the recognition of oral speech stimuli. J. Speech Hear. Res. (1969)
  • M.B. First et al. Structured Clinical Interview for DSM-IV (1997)
  • J.J. Foxe et al. The case for a feedforward component in multisensory integration mechanisms. NeuroReport (2005)
  • J.J. Foxe et al. Early visual processing deficits in schizophrenia: impaired P1 generation revealed by high-density electrical mapping. NeuroReport (2001)
  • J.J. Foxe et al. Filling-in in schizophrenia: a high-density electrical mapping and source-analysis investigation of illusory contour processing. Cereb. Cortex (2005)