Neuropsychologia, Volume 88, 29 July 2016, Pages 113–122

A matter of attention: Crossmodal congruence enhances and impairs performance in a novel trimodal matching paradigm

https://doi.org/10.1016/j.neuropsychologia.2015.07.022

Highlights

  • A novel trimodal paradigm for probing crossmodal matching is proposed.

  • Intensity changes in two out of three stimulated modalities have to be matched.

  • Crossmodal congruence between attended modalities facilitates matching.

  • Unattended crossmodal congruence impairs matching.

  • Visual–tactile matching outperforms audio–visual and audio–tactile matching.

Abstract

A novel crossmodal matching paradigm including vision, audition, and somatosensation was developed in order to investigate the interaction between attention and crossmodal congruence in multisensory integration. To that end, all three modalities were stimulated concurrently while a bimodal focus was defined blockwise, and congruence between stimulus intensity changes in the attended modalities had to be evaluated. We found that crossmodal congruence improved performance if both the attended modalities and the task-irrelevant distractor were congruent. If the attended modalities were incongruent, the distractor impaired performance due to its congruence with one of the attended modalities. The magnitude of crossmodal enhancement or impairment differed between attentional conditions: the largest crossmodal effects were seen for visual–tactile matching, intermediate effects for audio–visual matching, and the smallest effects for audio–tactile matching. We conclude that these differences in crossmodal matching likely reflect characteristics of the underlying multisensory neural network architecture. We discuss our results with respect to the timing of perceptual processing and state hypotheses for future physiological studies. Finally, etiological questions are addressed.

Introduction

Any instant of conscious perception is shaped by the differential contributions of all of our sensory organs. The integration of these different sensory inputs is not merely additive but involves crossmodal interactions, whose mechanisms are still far from being completely understood. One ongoing challenge in the field of multisensory research is the question of how crossmodal interactions can be identified and quantified (Gondan and Röder, 2006, Stevenson et al., 2014).

In many cases, crossmodal interactions have been investigated by means of redundant signal detection paradigms in which performance in unimodal trials is compared to performance in redundant multimodal trials (Diederich and Colonius, 2004). The race model inequality introduced by Miller (1982) is commonly used to decide if performance increments are actually indicative of crossmodal interactions. These are inferred if the multimodal cumulative distribution function (CDF) of response times is larger than the sum of the unimodal CDFs. Otherwise, performance increments are deemed to be due to statistical facilitation (Raab, 1962). Stimulus design and presentation in redundant signal detection paradigms were typically chosen such that multisensory principles formulated by Stein and Meredith (1993) could be tested. These principles were deduced from the observation that multisensory neurons showed strongest crossmodal effects when stimuli to distinct modalities shared temporal and spatial characteristics, and crossmodal effects increased when unimodal stimulus intensities decreased (Sarko et al., 2012).
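
Formally, if $F_A(t)$ and $F_V(t)$ denote the cumulative distribution functions of response times to unimodal auditory and visual stimuli, and $F_{AV}(t)$ the CDF for redundant audio–visual stimuli, the race model inequality of Miller (1982) reads

$$F_{AV}(t) \le F_A(t) + F_V(t) \quad \text{for all } t.$$

Crossmodal interaction beyond statistical facilitation is inferred whenever this bound is violated at some time $t$; the audio–visual labels are merely illustrative, as the same bound applies to any pair of modalities.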

Although supporting evidence for the validity of these principles in human behavior exists (e.g., Bolognini et al., 2004; Senkowski et al., 2011), an increasing number of empirical null results and methodological issues question their general applicability (Holmes, 2007, Otto et al., 2013, Pannunzi et al., 2015, Spence, 2013, Sperdin et al., 2010). Additionally, it has been demonstrated that crossmodal interactions can also have competitive effects, leading to crossmodal inhibition rather than enhancement (Sinnett et al., 2008). In their audio–visual redundant signal detection paradigm, auditory detection was delayed by redundant visual presentation while visual detection was speeded by redundant auditory presentation. Similarly, Wang et al. (2012) presented results suggesting the coexistence of crossmodal inhibition and enhancement in a trimodal study. In a target detection task, participants were presented with visual, auditory, or somatosensory targets in the presence or absence of perceptual competition from the respective other modalities. Overall, visual detection was fastest whereas auditory and somatosensory detection were comparable. Interestingly, the authors observed that the detection of auditory targets was impaired by perceptual competition while visual detection was unaffected and tactile detection was facilitated. These crossmodal effects can be understood in the context of perceptual gain adjustments due to multisensory interactions (Jiang and Han, 2007).

The inconsistency of results concerning crossmodal interactions might relate to aspects of multisensory processing that are typically not addressed by redundant signal detection paradigms. An important aspect of crossmodal interactions is the evaluation of crossmodal matching or conflict. As a result, matching sensory inputs will be bound into a coherent percept of an event, whereas sensory information will be processed separately if it is unlikely that a common origin is emitting these signals (Engel and Singer, 2001, Senkowski et al., 2008, Treisman, 1996). Accordingly, congruence between stimulus features in distinct sensory modalities leads to crossmodal enhancement of sensory processing, resulting in effective binding of corresponding neural representations. By contrasting congruent to incongruent multisensory stimulus presentation, the extent of crossmodal congruence enhancement can be probed. Congruence in this context can be related to low-level spatio-temporal characteristics but also to more complex stimulus features such as, for example, semantic aspects of the stimuli or crossmodal correspondences. Concerning the former, detection of natural objects was improved if auditory and visual information matched semantically (Schneider et al., 2008, Yuval-Greenberg and Deouell, 2007). Crossmodal correspondences, on the other hand, relate to stimulus features that are consistently associated crossmodally and are, thus, described as “natural cross-modal mappings” (Evans and Treisman, 2010). A very robust correspondence, for instance, could be established between auditory pitch and visual brightness (Marks, 1987). That is, detection of visual and auditory stimuli is improved when jointly presented in a congruent fashion (e.g. high pitch tone and bright visual stimulus) compared to incongruent presentation (e.g. low pitch tone and bright visual stimulus). Relations between the mechanisms of crossmodal correspondences and synesthetic experiences are being discussed (Spence, 2011).

In addition to stimulus-driven aspects of multisensory integration, there is growing awareness that attention plays a central role in how multisensory inputs are integrated (Talsma, 2015). One possibility is that attention-related top-down mechanisms generally enhance sensory processing of an attended event at the expense of other sensory input (Desimone and Duncan, 1995, Kastner and Ungerleider, 2001, Wascher and Beste, 2010). The interplay between attention and mechanisms of crossmodal integration, however, is more complex, and the influence appears to be mutual (Talsma et al., 2010). On the one hand, it was shown that spatial attention directly interfered with mechanisms of crossmodal integration in an audio–visual redundant target detection paradigm (Talsma and Woldorff, 2005). In this study, multisensory effects on event-related potentials at fronto-central sites were larger for attended than for unattended stimuli. Available data suggest that attention might not modulate crossmodal interactions in an all-or-nothing manner, but rather shape the nature or magnitude of crossmodal integration (De Meo et al., 2015). Supporting evidence comes from studies in which congruence-related crossmodal effects were larger when attention was divided between modalities than when it was focused on one modality (Göschl et al., 2014, Mozolic et al., 2008). On the other hand, a visual search paradigm demonstrated that attentional allocation can be driven by mechanisms of multisensory integration (Van der Burg et al., 2008). Reaching similar conclusions, van Ee et al. (2009) showed that crossmodal congruence enhanced attentional control in perceptual selection.

Here we propose a novel trimodal matching paradigm to investigate interactions between attention and crossmodal congruence. To that end, participants receive simultaneous visual, auditory, and somatosensory stimulation on each trial. All three stimuli undergo a brief, simultaneous change in intensity (either an increase or a decrease), resulting in varying patterns of crossmodal congruence across trials. Per block, an attentional focus comprising two relevant modalities is defined, for which congruence has to be evaluated irrespective of the third, task-irrelevant modality. Four different congruence patterns can be discerned (see Fig. 1 for an example; a schematic sketch follows the list):

  • (I.a) All stimuli are congruent.

  • (I.b) The attended modalities are congruent, and the task-irrelevant modality is incongruent with both attended modalities.

  • (II.a) The attended modalities are incongruent, and the task-irrelevant modality is congruent with attended modality 1.

  • (II.b) The attended modalities are incongruent, and the task-irrelevant modality is congruent with attended modality 2.
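
To make this taxonomy concrete, here is a minimal Python sketch (our illustration, not code from the study; the modality names and the +1/−1 coding of intensity changes are assumptions) that enumerates all eight combinations of intensity changes and classifies each trial for a given bimodal focus:

```python
# Minimal sketch: classify trials of the trimodal matching paradigm by
# congruence pattern. Intensity changes are coded +1 (increase) or -1
# (decrease); all names and the coding are illustrative assumptions.
from itertools import product

MODALITIES = ("visual", "auditory", "tactile")

def classify(changes, attended):
    """Return the congruence pattern (I.a, I.b, II.a, II.b) of one trial.

    changes  -- dict mapping each modality to +1 or -1
    attended -- pair of attended modalities, e.g. ("visual", "tactile")
    """
    m1, m2 = attended
    (distractor,) = set(MODALITIES) - set(attended)
    if changes[m1] == changes[m2]:  # attended pair is congruent
        return "I.a" if changes[distractor] == changes[m1] else "I.b"
    # Attended pair is incongruent; the distractor necessarily matches
    # exactly one of the two attended modalities.
    return "II.a" if changes[distractor] == changes[m1] else "II.b"

# Enumerate all 2^3 = 8 change combinations for a visual-tactile focus.
for pattern in product((+1, -1), repeat=3):
    changes = dict(zip(MODALITIES, pattern))
    print(changes, "->", classify(changes, ("visual", "tactile")))
```

Since each of the three modalities changes in one of two directions, each of the four patterns covers exactly two of the eight possible combinations.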

In contrast to redundant signal detection paradigms, this trimodal paradigm requires the participant to evaluate crossmodal congruence and thereby ensures that information from both attended modalities must be processed in order to reach a decision. In a study using a similar paradigm, crossmodal congruence significantly speeded responses for audio–visual matching (Friese et al., 2015). Employing a trimodal design allows us to compare crossmodal matching between different bimodal foci of attention and to study the influence of the respective distractor. Our hypotheses are threefold. First, in line with the literature on redundant target effects, we expect performance in fully congruent trials to be superior to performance in trials with a task-irrelevant deviant (Fig. 1, I.a vs. I.b). Second, we hypothesize that the level of performance will depend on how easily the distracting, task-irrelevant modality can be inhibited, which in turn depends on the attentional focus. Based on Wang et al. (2012), we assume that vision is hardest to ignore and audition easiest. Accordingly, performance should be best in the visual–tactile focus condition and worst in the audio–tactile focus condition. Third, with respect to incongruent task-relevant stimuli, we expect that congruence between one attended stimulus and the distractor impairs performance. Mirroring the expectations outlined above, this distracting effect of congruence should also depend on the focus of attention.

Participants

Forty-nine participants were recruited for the study and received monetary compensation for their participation. Due to the high demands of the task, fifteen candidates were not able to complete the training successfully and were excluded from further participation (their performance remained below an average of 70% correct answers after 30 min of training). The remaining 34 participants were on average 24 years old (±4 years); 20 were male and 14 female. All had normal or corrected-to-normal vision. […]

Results

The results of the thresholding procedure are outlined in Fig. 3. The smallest detection thresholds were found for auditory changes (mean percentage deviation from baseline intensity ± standard deviation; increase: 21.64±7.74, decrease: 24.93±9.22), medium-sized thresholds for visual changes (increase: 32.60±9.98, decrease: 32.28±6.99), and the largest thresholds for tactile changes (increase: 39.59±11.08, decrease: 41.28±9.03). Average response times and response accuracies are depicted in Table 1. […]

Discussion

We investigated differences in crossmodal matching between bimodal combinations of visual, auditory, and somatosensory stimuli. Crossmodal congruence was used to modulate stimulus-driven mechanisms of multisensory integration. In line with our expectations, we found better performance for attentional foci that included vision. Furthermore, we observed that congruence between attended stimuli was associated with better performance compared to incongruent conditions. The difference between attended […]

Conclusions

Employing a novel trimodal paradigm, we found that crossmodal congruence improved performance if both the attended modalities and the task-irrelevant distractor were congruent. On the other hand, if the attended modalities were incongruent, the distractor impaired performance because of its congruence with one of the attended modalities. Whether this effect is based on an automatic relocation of attention to the congruent stimulus pair or on a conflict with the required response, […]

Author contributions

A.K. Engel developed the study concept. All authors contributed to the study design. J. Misselhorn performed testing and data collection. J. Misselhorn conducted data analysis and interpretation with advice from U. Friese under supervision of A.K. Engel. J. Misselhorn drafted the manuscript, and all other authors provided critical revisions. All authors approved the final version of the manuscript for submission.

Acknowledgments

This research was supported by Grants from the German Research Foundation (SFB 936/A3) and the European Union (ERC-2010-AdG-269716).

References (44)

  • S. Sinnett et al., The co-occurrence of multisensory competition and facilitation, Acta Psychol. (2008)

  • H.F. Sperdin et al., Auditory–somatosensory multisensory interactions in humans: dissociating detection and spatial discrimination, Neuropsychologia (2010)

  • D. Talsma et al., The multifaceted interplay between attention and multisensory integration, Trends Cognit. Sci. (2010)

  • A. Treisman, The binding problem, Curr. Opin. Neurobiol. (1996)

  • N. Bolognini et al., “Acoustical vision” of below threshold stimuli: interaction among spatially converging audiovisual inputs, Exp. Brain Res. (2004)

  • J. Cohen, Statistical Power Analysis for the Behavioral Sciences (1988)

  • R. De Meo et al., Top-down control and early multisensory processes: chicken vs. egg, Front. Integr. Neurosci. (2015)

  • R. Desimone et al., Neural mechanisms of selective visual attention, Annu. Rev. Neurosci. (1995)

  • A. Diederich et al., Bimodal and trimodal multisensory enhancement: effects of stimulus onset and intensity on reaction time, Percept. Psychophys. (2004)

  • R. van Ee et al., Multisensory congruency as a mechanism for attentional control over perceptual selection, J. Neurosci. (2009)

  • K.K. Evans et al., Natural cross-modal mappings between visual and auditory features, J. Vis. (2010)

  • F. Faul et al., G⁎Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences, Behav. Res. Methods (2007)
¹ These authors contributed equally to this work.
