INTRODUCTION

Sound localization is important for orienting and focusing attention (Darwin and Hukin 2000) and helps to segregate sounds from different sources in the environment (Moore and Gockel 2002). In humans, horizontal sound localization is based on interaural differences in sound arrival time and sound level. While there have been substantial advances in understanding the neural processing of interaural time and level differences (ITDs and ILDs) separately (see Grothe et al. 2010 for a review), the question as to whether, or at which stage, ITDs and ILDs are integrated into a common code of horizontal sound location still remains unanswered. Single neuron recordings in animals indicate that the initial processing of ITDs and ILDs is performed independently, involving different nuclei of the superior olivary complex (SOC) in the auditory brainstem (Tollin 2003). However, it has been proposed that an integrated code of ITDs and ILDs might emerge already at the subsequent processing level. Riedel and Kollmeier (2002) measured the prominent wave V deflection in the human auditory brainstem response (ABR), which is thought to be generated at the level of the lateral lemniscus or inferior colliculus (Burkard et al. 2007), and found that stimuli with ITDs and ILDs favoring the same ear elicited smaller wave V amplitudes than diotic stimuli (i.e. stimuli with zero ITD and ILD) or stimuli with ITDs and ILDs favoring opposite ears. Given that some listeners perceive stimuli with opposing ITDs and ILDs as central, like diotic stimuli, whereas stimuli with consistent ITDs and ILDs are perceived as lateral, Riedel and Kollmeier interpreted their findings as evidence that wave V reflects an integrated code of stimulus laterality. However, the wave V latency seems to tell a different story: it was longer for stimuli with opposing ITDs and ILDs than for either diotic stimuli or stimuli with consistent ITDs and ILDs. 
Given that stimuli with opposing ITDs and ILDs often elicit incoherent percepts, with multiple sound images relating to the individual ITD and ILD cues (Whitworth and Jeffress 1961; Hafter and Jeffress 1968), this result suggests that, contrary to Riedel and Kollmeier's conclusion, wave V may reflect not only the lateralization, but also the coherence of the sound image. This would suggest that at least some degree of separation of ITDs and ILDs remains at the level of wave V.

Three other attempts to investigate whether ITDs and ILDs are coded independently or integrated have looked at cortical responses (Schröger 1996; Tardif et al. 2006; Ungan et al. 2001). In contrast to Riedel and Kollmeier (2002), all of these studies concluded that ITDs and ILDs are still coded separately even at the level of the auditory cortex. Schröger measured the mismatch negativity (MMN; see Näätänen et al. 2011) to stimuli with an ITD only, an ILD only or with a combined ITD and ILD (both favoring the same ear) and found that the response to the combined condition was similar to the sum of the responses to the ITD- and ILD-only conditions. Schröger concluded that the response to the combined ITD and ILD constituted a linear superposition of responses from independent ITD- and ILD-sensitive generators. However, the ITD and ILD used by Schröger were very disparate in terms of their perceived laterality (82° versus 36°, respectively). It is possible that the degree of integration of ITDs and ILDs depends on how consistent the cues are; it might be the case that disparate cues are processed independently, as if having arisen from different sound sources, while consistent cues are integrated. This might explain why stimuli with opposing ITDs and ILDs, like those used by Riedel and Kollmeier (2002), tend to elicit multiple sound images. Under this assumption, Schröger might have found a non-linear (super- or subadditive) superposition of the ITD- and ILD-related components in the combined ITD and ILD response, had the cues been perceptually more consistent. Super- or subadditive processing in neuroimaging data is commonly assumed to indicate functional integration (e.g. Calvert 2001).
In contrast to Schröger, Ungan et al. (2001) and Tardif et al. (2006) investigated whether the cortical processing of ITDs and ILDs involves spatially separated or overlapping generators by comparing the scalp topography and estimating the sources of ITD- and ILD-evoked cortical responses. Like Schröger, Tardif et al. measured MMNs to ITDs and ILDs, whereas Ungan et al. measured responses to abrupt ITD or ILD changes in ongoing sounds. Both studies found differences in the hemispheric distribution of the ITD- and ILD-evoked responses, and both interpreted this finding as evidence for ITDs and ILDs being processed by separated generators. However, the hemispheric distribution of sound-evoked cortical responses has been shown to depend on the stimulus lateralization (e.g. McEvoy et al. 1994; Krumbholz et al. 2007). Neither Ungan et al. nor Tardif et al. took care to match the lateralization of their stimuli. The observed differences in hemispheric distribution could thus have been due to differences in stimulus lateralization, rather than separated processing of ITDs and ILDs.

The aim of the current study was to revisit the question of whether ITDs and ILDs are represented by separated or integrated codes in the human auditory cortex. The general paradigm was very similar to that used by Schröger (1996), in that we compared responses to an ITD and an ILD, presented either separately or combined. However, rather than using Schröger's MMN paradigm, we used the change response paradigm of Ungan et al. to isolate ITD- and ILD-related activity, because it yields a better response-to-noise ratio. In order to maximize the chances of detecting any integrated processing of ITDs and ILDs, we conducted a precursory psychophysical experiment to find ITDs and ILDs that were matched in terms of perceived laterality for each participant. If ITDs and ILDs are represented by independent codes in the auditory cortex, we would expect the response to the combined ITD and ILD to constitute a linear superposition of its ITD- and ILD-related components. The combined response should thus resemble the sum of the responses to the ITD- and ILD-only conditions. If, on the other hand, ITDs and ILDs are represented by an integrated code, the response to the combined condition might be expected to be either larger (superadditive) or smaller (subadditive) than the sum of the ITD- and ILD-only responses. In addition to comparing the responses to the combined and ITD- and ILD-only conditions, we also measured the scalp topographies and estimated the source distributions of the ITD- and ILD-only responses.

METHODS

In order to isolate cortical responses related to ITD and ILD processing, we measured the late auditory-evoked cortical potentials to a short probe stimulus (150 ms), which was preceded without gap by a longer adapter stimulus (see Fig. 1A, B). The adapter was long enough (1,500 ms) to ensure that the response to its onset had subsided before the probe onset (Fig. 1C). In different conditions, the probe had an ITD only, an ILD only or both. The adapter was always diotic (zero ITD and ILD). The experiment consisted of five conditions. In the “small ITD” condition, the probe had a fixed ITD of −250 μs (lateralized a little less than halfway towards the left ear; Toole and Sayers 1965) and no (zero) ILD. In the “small ILD” condition, the probe had an ILD and no ITD. The ILD was set individually to match the lateralization of the small (−250 μs) ITD (see below). In the “combined” condition, the probe was presented with both the small ITD and small ILD. Informal listening revealed that the lateralization of the combined ITD and ILD was approximately double that of either cue alone. In order to investigate the effect of this lateralization difference, we included two further ITD- and ILD-only conditions, referred to as “large ITD” and “large ILD”, where the ITD and ILD were individually matched to elicit the same lateralization as in the combined condition.

FIG. 1

A, B Example waveforms of the small ITD (A) and small ILD (B) stimuli. The adapter is shown in black. In A, only the right-ear probe is shown (grey), because the left-ear probe was just a time-shifted version of the right-ear probe in this condition. In B, the left-ear probe is shown in blue and the right-ear probe is shown in red. C Grand average EEG response across all conditions and participants. The thin grey lines show the signals from the 64 recording channels. The red line highlights the vertex (Cz) channel and the black line shows the global field power (GFP) of the response. The dashed vertical lines mark the onsets of the adapter and probe, as well as the probe offset. EOR energy onset response, SR sustained response, CR change response, OffR offset response.

Stimuli

The stimuli consisted of noise, lowpass-filtered below 1 kHz to restrict the stimulus passband to the frequency range over which ITDs in the temporal fine structure can be perceived (Durlach and Colburn 1978). The adapter and probe had durations of 1,500 and 150 ms, respectively, and the silent gap between the end of the probe and the onset of the next adapter was 1,500 ms. They were gated on and off with 10-ms quarter-cosine ramps. At the transition between them, the ramps were cross-faded at their −3-dB points. ITDs were implemented by delaying the noise carrier in the right-ear stimulus and gating the left- and right-ear stimuli simultaneously to create ongoing ITDs. Onset ITDs could not be included, because the adapter and probe were presented without a gap. In the ITD-only conditions, the adapter and probe were presented at a constant level of 70 dB SPL (Fig. 1A). ILDs were implemented by decreasing the level of the right-ear stimulus by half of the relevant ILD relative to 70 dB SPL and increasing the level of the left-ear stimulus by the same amount. In the conditions that contained ILDs, the level of the adapter was switched randomly between the left- and right-ear levels of the probe. The time between the level switches was either 75 or 150 ms with equal probability. This was to minimize the confounding of any ILD-change response with a response to the monaural level change at the adapter–probe transition [Fig. 1B; see Ungan and Ozmen (1996) for a similar procedure]. Note that the average adapter level over time, which would be expected to determine the amount of adaptation (Seither-Preisler et al. 2004; Lanting et al. 2013), was still the same across conditions.
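
The level arithmetic behind the ILD implementation can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: the function names and the 10-dB example ILD are hypothetical, and only the 70 dB SPL reference and the "half up, half down" rule are taken from the text.

```python
BASE_LEVEL_DB = 70.0  # reference presentation level in dB SPL (from the text)

def ear_levels(ild_db, base_db=BASE_LEVEL_DB):
    """Return (left, right) levels in dB SPL for a left-favoring ILD.

    Half of the ILD is added to the left ear and half is subtracted
    from the right ear, so the mean level across ears stays at base_db.
    """
    half = ild_db / 2.0
    return base_db + half, base_db - half

def db_to_gain(delta_db):
    """Linear amplitude factor corresponding to a level change in dB."""
    return 10.0 ** (delta_db / 20.0)

# Hypothetical 10-dB ILD (the actual ILDs were set per participant).
left_db, right_db = ear_levels(10.0)
print(left_db, right_db)                               # 75.0 65.0
print(round(db_to_gain(left_db - BASE_LEVEL_DB), 3))   # 1.778
```

Because the two ear levels are symmetric about 70 dB SPL, an adapter that switches between them with equal probability has the same average level in dB as the fixed-level adapter of the ITD-only conditions.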

All stimuli were generated digitally on a trial-by-trial basis using Matlab (The Mathworks, Natick, MA, USA). They were digital-to-analogue converted with a 50-kHz sampling rate and 24-bit amplitude resolution and amplified using a Tucker Davis Technologies (Alachua, FL, USA) System 3 with an RP2.1 real-time processor and HB7 headphone amplifier. They were presented via K240 DF headphones (AKG, Vienna, Austria) in a double-walled sound-attenuated booth.

Electroencephalography (EEG) data acquisition

The EEG experiment consisted of four 20-min blocks. Within each block, each of the five stimulus conditions was presented 65 times in random order. This was a passive listening task; participants watched a silent, subtitled movie to stay alert.

Late auditory-evoked cortical potentials were recorded with an EEG amplifier system (BrainAmp DC, Brain Products, Gilching, Germany) and an “infracerebral” cap fitted with 64 Ag/AgCl ring electrodes in a quasi-equidistant arrangement (Easycap, Herrsching, Germany). The infracerebral electrode arrangement is designed to cover a larger proportion of the head surface than the standard 10–20 arrangement to facilitate localization of sources in the region of auditory cortex. One additional electrode was placed below the right eye to monitor electro-ocular activity. Data were recorded continuously with a sampling rate of 500 Hz. They were filtered online between 0.1 and 250 Hz. Skin-to-electrode impedances were maintained below 5 kΩ. The ground electrode was placed on the midline just above the Fpz position, and the recording reference was Cz.

EEG data analysis

The EEG data were preprocessed using the BrainVision Analyzer software (Brain Products). They were (1) highpass-filtered at 0.25 Hz (−48 dB/oct roll-off), (2) corrected for eye-blink artefacts using a regression-based procedure (Gratton et al. 1983), (3) re-referenced to the average of all 64 channels (“average reference”), (4) lowpass-filtered at 35 Hz (−48 dB/oct roll-off) and (5) segmented into epochs covering the period from −200 to 2,000 ms relative to the adapter onset. Epochs that contained voltages exceeding ±75 μV were considered artifactual and excluded from further analysis. The remaining epochs were averaged for each participant and condition, baseline-corrected to the 100-ms period before the probe onset and then averaged across participants to create a grand average response for each condition.
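
The epoch bookkeeping described above can be illustrated with a short sketch. The preprocessing itself was done in BrainVision Analyzer; the Python functions below are hypothetical stand-ins that only reproduce the sample arithmetic, the ±75 μV rejection criterion and the baseline subtraction, assuming the 500-Hz sampling rate given in the data acquisition section.

```python
FS_HZ = 500             # EEG sampling rate (from the text)
EPOCH_START_MS = -200   # epoch start relative to adapter onset
PROBE_ONSET_MS = 1500   # adapter duration, i.e. probe onset re adapter onset
BASELINE_MS = 100       # baseline window before probe onset
REJECT_UV = 75.0        # rejection criterion in microvolts

def ms_to_sample(t_ms):
    """Sample index within an epoch of a time given re adapter onset."""
    return round((t_ms - EPOCH_START_MS) * FS_HZ / 1000)

def is_clean(epoch, limit=REJECT_UV):
    """epoch: list of channels, each a list of voltages in microvolts."""
    return all(abs(v) <= limit for channel in epoch for v in channel)

def baseline_correct(channel, start, stop):
    """Subtract the mean over samples [start, stop) from one channel."""
    mean = sum(channel[start:stop]) / (stop - start)
    return [v - mean for v in channel]

# Baseline window: the 100 ms preceding the probe onset.
b_start = ms_to_sample(PROBE_ONSET_MS - BASELINE_MS)
b_stop = ms_to_sample(PROBE_ONSET_MS)
print(b_start, b_stop)  # 800 850
```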

Differences between conditions were tested by computing the bootstrap standard error (based on 1,000 bootstrap resamples; see Efron and Tibshirani 1994) of the root-mean-square (RMS) amplitude of the difference between the respective responses at each time point. The RMS amplitude of average-referenced EEG data is commonly referred to as global field power (GFP; e.g. Murray et al. 2008). A difference response was considered significant where the 99 % bootstrap confidence interval of its GFP did not contain zero for more than seven consecutive time points [corresponding to half the period of the highest frequency contained in the responses (35 Hz)]. The higher significance threshold level (p = 0.01) and the cluster threshold (seven time points) were used to minimize false positives as a result of conducting multiple tests over many time points.
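
As a rough sketch of two ingredients of this test, the code below computes the GFP of a single scalp map and finds runs of significant time points that exceed the cluster threshold. The function names are hypothetical, the bootstrap resampling itself is not shown, and the input to `long_runs` is assumed to be a per-time-point flag indicating whether the confidence interval excluded zero.

```python
import math

def gfp(scalp_map):
    """Global field power: RMS across channels of the average-referenced map."""
    mean = sum(scalp_map) / len(scalp_map)
    return math.sqrt(sum((v - mean) ** 2 for v in scalp_map) / len(scalp_map))

def long_runs(significant, min_exceed=7):
    """Return (start, stop) index pairs of runs of True values that are
    longer than min_exceed consecutive time points (stop is exclusive)."""
    runs, start = [], None
    # A trailing False closes any run that extends to the last sample.
    for i, flag in enumerate(list(significant) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start > min_exceed:
                runs.append((start, i))
            start = None
    return runs

# Arithmetic behind the cluster threshold: half a period of 35 Hz,
# expressed in samples at the 500-Hz EEG sampling rate.
half_period_ms = 1000.0 / 35.0 / 2.0          # about 14.3 ms
print(round(half_period_ms / (1000.0 / 500.0), 2))  # 7.14
```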

The scalp topographies of selected responses were compared using a global dissimilarity (DISS) analysis (Lehman and Skrandies 1980). The DISS between two scalp maps is defined as the RMS difference between the maps after normalizing each map by its GFP. In the current study, the DISS was normalized to range from 0 to 1 (with the original definition, the DISS can range from 0 to 2), where 0 indicates topographic equality and 1 indicates topographic inversion. The DISS was computed at each time point within the time ranges of the N1 and P2 deflections. At each time point, the statistical significance of the DISS was tested by comparing the DISS between the actual scalp maps with an empirical null distribution, generated by randomly exchanging (“permuting”) the scalp maps within participants and calculating the DISS between these new, permuted maps (Murray et al. 2008). Each null distribution was based on $2^9 = 512$ permutations, which is the maximum possible number of unique permutations for nine participants.
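
The DISS definition and the permutation count can be made concrete with a small sketch. The function names are hypothetical; the division by 2 implements the rescaling to the 0 to 1 range described above, and each permutation is represented as one swap-or-keep decision per participant.

```python
import itertools
import math

def normalize(scalp_map):
    """Average-reference a map and scale it to unit GFP.
    Assumes the map is not flat (non-zero GFP)."""
    mean = sum(scalp_map) / len(scalp_map)
    centered = [v - mean for v in scalp_map]
    g = math.sqrt(sum(v * v for v in centered) / len(centered))
    return [v / g for v in centered]

def diss(map_a, map_b):
    """Global dissimilarity, rescaled to the range 0 (equal) to 1 (inverted)."""
    a, b = normalize(map_a), normalize(map_b)
    rms = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
    return rms / 2.0

# One swap/keep decision per participant gives 2**9 label assignments.
swaps = list(itertools.product([False, True], repeat=9))
print(len(swaps))  # 512
```

For a given entry of `swaps`, the ITD and ILD maps of every participant flagged `True` would be exchanged before recomputing the grand averages and their DISS.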

The cortical source distributions of the scalp maps were estimated using the iterative sSLOFO (standardized shrinking LORETA-FOCUSS) algorithm (Liu et al. 2005) as implemented in the Brain Electrical Source Analysis (BESA) software (v5.3; BESA, Gräfelfing, Germany). The sSLOFO distributions were calculated with a voxel size of 7 mm and using three iterations of the sSLOFO algorithm. Tikhonov regularization of 0.3 % was applied in each iteration. The threshold below which voxels are eliminated from the source space in each iteration was set to 10 %. Different sSLOFO distributions were compared by measuring the spatial overlap between the distributions after clipping each distribution above 1 % of its maximum. The overlap was measured by counting the number of the voxels contained in both distributions and expressing it as a proportion of the average total number of voxels in each distribution. Like the DISS, the sSLOFO overlap was calculated at each time point within the N1 and P2 time ranges. For statistical analysis, the actual overlap at a given time point was compared with an empirical null distribution, generated with the same permutation procedure as the null distributions for the DISS.
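
The overlap measure can be sketched as below, under the assumption that "clipping" means retaining only voxels whose source strength exceeds 1 % of the distribution's maximum. The function names and the toy four-voxel distributions are hypothetical; the actual distributions came from BESA.

```python
def suprathreshold(source_strengths, fraction=0.01):
    """Indices of voxels whose strength exceeds `fraction` of the maximum."""
    cutoff = fraction * max(source_strengths)
    return {i for i, s in enumerate(source_strengths) if s > cutoff}

def overlap(dist_a, dist_b, fraction=0.01):
    """Number of shared suprathreshold voxels, as a proportion of the
    average number of suprathreshold voxels in the two distributions."""
    a = suprathreshold(dist_a, fraction)
    b = suprathreshold(dist_b, fraction)
    mean_size = (len(a) + len(b)) / 2.0
    return len(a & b) / mean_size

# Toy example: two voxels active in each distribution, one shared.
print(overlap([1.0, 0.5, 0.0, 0.0], [0.0, 0.8, 0.9, 0.0]))  # 0.5
```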

The hemispheric lateralization of the sSLOFO distributions was measured by calculating the lateralization index, $\mathrm{LI} = (S_R - S_L)/(S_R + S_L)$, where $S_L$ and $S_R$ are the maximum source strengths in the left and right hemispheres, respectively. The lateralization index ranges from −1 to +1, where 0 indicates no lateralization and −1 or +1 indicates total lateralization to the left or right hemisphere, respectively. It was computed at each time point within the N1 and P2 time ranges. At each time point, the difference between the lateralization indices for different conditions was significance-tested by comparing the actual difference with an empirical null distribution, generated with the same permutation procedure as the null distributions for the DISS and sSLOFO overlap.
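
A minimal sketch of the lateralization index follows. The function name is hypothetical, and the example values are illustrative source strengths rather than measured data.

```python
def lateralization_index(s_left, s_right):
    """LI = (S_R - S_L) / (S_R + S_L); negative values indicate left,
    positive values right hemispheric lateralization."""
    return (s_right - s_left) / (s_right + s_left)

print(lateralization_index(1.0, 1.0))            # 0.0, no lateralization
print(lateralization_index(0.0, 2.0))            # 1.0, entirely right
# A right-hemisphere maximum 2.8 times the left-hemisphere maximum:
print(round(lateralization_index(1.0, 2.8), 3))  # 0.474
```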

Lateralization matching experiment

In the EEG experiment, the small ITD was fixed at −250 μs for all participants to create a left-lateralized percept. All other ITDs and ILDs used were set individually for each participant in order to match them in terms of lateralization: the small ILD was set to match the lateralization of the small ITD, and the large ITD and ILD were set to match the lateralization of the combined ITD and ILD. The matching was performed using an adaptive “doublet” procedure (Bode and Carhart 1973; Leek 2001) based on a two-alternative, forced-choice task. Each doublet run comprised two randomly interleaved tracks. Each trial within the two tracks consisted of first the target and then the matching stimulus. The participant's task was to indicate whether the matching stimulus was located to the left or right of the target stimulus. At the beginning of each track, the ITD or ILD of the matching stimulus was set such that it would be lateralized well to the left or right of the target stimulus (±750 μs for ITDs, ±30 dB for ILDs). In the “left” track, the ITD or ILD of the matching stimulus converged according to a two-up, one-down rule in order to track the point where the matching stimulus had a 70.7 % likelihood of being perceived to the left of the target stimulus (Levitt 1971). In the “right” track, the ITD or ILD of the matching stimulus converged according to a two-down, one-up rule to track the point where the matching stimulus had a 70.7 % likelihood of being perceived to the right of the target stimulus. Both tracks consisted of eight reversals in the ITD or ILD of the matching stimulus. ITDs were adjusted in steps of 62.5 μs, and ILDs in steps of 2 dB. The matching ITD or ILD was estimated by averaging the final six reversals in each track and then averaging across the left and right tracks. Three doublet runs were completed for each condition and the results averaged. The matching results are shown in Figure 2.
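
The logic of one doublet run can be sketched as follows. This is a simplified simulation, not the procedure code used in the study: the function names are hypothetical, the matching stimulus is expressed as an ITD in μs, and the simulated listener is deterministic (always answering "right" when the matching ITD exceeds a hypothetical match point of −280 μs), so each track settles around the 50 % point instead of the 70.7 % point that a real, probabilistic listener would produce.

```python
STEP_US = 62.5  # ITD step size in microseconds (from the text)

def run_track(perceived_right, start, step, track, n_reversals=8, n_average=6):
    """Run one adaptive track and return the mean of the final reversals.

    perceived_right(value) -> True if the matching stimulus is judged to
    lie to the right of the target. In the "right" track, two consecutive
    "right" responses move the stimulus leftward and a single "left"
    response moves it rightward; the "left" track mirrors this.
    """
    value, run, last_dir, reversals = start, 0, 0, []
    inward = -step if track == "right" else step
    while len(reversals) < n_reversals:
        resp_right = perceived_right(value)
        tracked = resp_right if track == "right" else not resp_right
        if tracked:
            run += 1
            if run < 2:
                continue          # need two in a row before stepping
            run, move = 0, inward
        else:
            run, move = 0, -inward
        direction = 1 if move > 0 else -1
        if last_dir != 0 and direction != last_dir:
            reversals.append(value)   # stimulus value at the turning point
        last_dir = direction
        value += move
    return sum(reversals[-n_average:]) / n_average

def match_point(perceived_right, start_offset=750.0, step=STEP_US):
    """Average the estimates from the right- and left-starting tracks."""
    right = run_track(perceived_right, start_offset, step, "right")
    left = run_track(perceived_right, -start_offset, step, "left")
    return (left + right) / 2.0

PSE_US = -280.0  # hypothetical match point of the simulated listener
estimate = match_point(lambda itd_us: itd_us > PSE_US)
print(estimate)  # -281.25
```

With a probabilistic listener, the two tracks would converge on either side of the match point, which is why their estimates are averaged.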

FIG. 2

Results from the lateralization matching experiment. The small ITD, which was fixed at −250 μs, is shown by the grey horizontal line. The small ILD (light grey bar) is the average ILD required to match the lateralization of the small ITD. Its value can be read off the right-hand ordinate. Note that the right-hand ordinate was scaled such that the small ILD appears equivalent to the small ITD. In the combined condition, the small ITD and small ILD were presented together. The black horizontal line shows the expectation that the lateralization of the combined stimulus is the sum of the lateralizations of its ITD and ILD components (small ITD and ILD). The large ITD and ILD (dark grey bars) are the average ITD (left-hand ordinate) and ILD (right-hand ordinate) required to match the actual lateralization of the combined stimulus. The error bars show the standard error across participants; the error bar on the combined condition is the same as that on the small ILD condition.

Participants

Five male and four female participants, aged between 18 and 35 years, were recruited through poster advertisements distributed about the Nottingham University campus. Participants gave written informed consent prior to the experiment and were right-handed according to the Edinburgh inventory (Oldfield 1971). None of the participants reported having any history of audiological or neurological disease. The experimental procedures conformed to the Code of Ethics of the World Medical Association (Declaration of Helsinki) and were approved by the Ethics Committee of the School of Psychology at the University of Nottingham.

RESULTS

The diotic adapter sound elicited a typical triphasic energy onset response (EOR in Fig. 1C), comprising P1, N1 and P2 deflections, as well as a sustained response (SR), upon which the response to the ITD and/or ILD change at the onset of the probe was superposed (change response, CR). The change responses were dominated by the N1 and P2 deflections, with no discernible P1 deflection (Fig. 3). The negative deflection following the P2 appears to be the offset response to the probe (OffR; compare Magezi and Krumbholz 2010). It showed many of the same effects as the N1 and P2 and will thus not be discussed separately.

FIG. 3

A–E Grand average EEG responses to all stimulus conditions plotted as a function of time relative to probe onset. As in Figure 1C, the thin grey lines show the signals from the 64 recording channels, the red lines show the vertex channel and the solid black lines show the GFP of the response. The dashed vertical lines mark the probe onset.

The morphology of the change responses differed somewhat between the ITD and ILD conditions (Fig. 3). In particular, the N1 deflection was larger and occurred earlier, and the P2 deflection was smaller, in the response to the small ILD than small ITD change. Similarly, the N1 deflection occurred earlier, and the P2 deflection was smaller, in the large ILD than large ITD change response. The statistical significance of these differences was verified by bootstrapping the global field power (GFP) of the difference between the respective responses (Fig. 4; see “METHODS”).

FIG. 4

GFP of grand average responses to the small (A) and large (B) ITD and ILD conditions. The grey shading between the ITD (solid lines) and ILD (dashed lines) responses marks the time points where the GFP of the difference between the responses was significantly different from zero (i.e. where its 99 % bootstrap confidence interval did not contain zero).

Source analysis of ITD- and ILD-only responses

In order to test whether the ITD and ILD change responses were generated in different cortical areas, we compared the scalp voltage distributions (referred to as scalp maps) of the average small and large ITD and average small and large ILD responses using a DISS analysis (Lehman and Skrandies 1980). We also estimated the cortical source distributions of these average ITD- and ILD-only responses using the iterative sSLOFO algorithm (Liu et al. 2005; see “METHODS” for details).

Over the time range of the N1, the scalp maps exhibited negative and positive voltage maxima to the right of the vertex and over the right subtemporal cortex, respectively (blue and red highlight in Fig. 5A) and a polarity reversal over the right temporal cortex. For the P2, the voltage maximum at the vertex (which is positive in the case of the P2) was more focal and less lateralized than for the N1 (Fig. 5B). The sSLOFO analysis suggested that the predominant contribution to the N1 was from the right auditory cortex, whereas the P2 received an additional contribution from a more central, non-auditory source, possibly the cingulate cortex (Fig. 5C, D).

FIG. 5

Comparative source analysis of average ITD- and ILD-only responses. A, B Scalp voltage maps of the N1 (A) and P2 (B) deflections in the ITD- (left) and ILD-only (right) responses. The maps were taken from the middle of the rising flank of the respective deflection (106 ms for the N1 and 180 ms for the P2). C, D Distributed source estimates based on the scalp maps shown in A and B, projected onto right sagittal (top) and coronal (bottom) slices of a Talairach average brain. The source distributions for the ITD and ILD responses are superposed, with the red highlight showing the distributions for the ITD response and the green highlight showing the distributions for the ILD response; the yellow highlight shows where the two distributions overlap. C The source distributions for the N1, and D for the P2. The thin, white lines show the slice positions. E Global dissimilarity (DISS) between the scalp maps (bold, blue lines) and spatial overlap (Ovlp) between the source distributions (bold, red lines) of the ITD- and ILD-only responses, calculated within the time ranges of the N1 and P2 (highlighted by the white background). The Ovlp was expressed as a proportion of the average total volume of each distribution. The thin, solid and dashed lines show the GFP of the ITD- and ILD-only responses, scaled to the maximum of the ILD-only response. Note that the DISS increases as the response amplitude decreases towards the edges of the N1 and P2 time ranges, because the scalp maps become dominated by noise. The blue- and red-shaded areas represent the null distributions of the DISS and Ovlp, based on 512 permutation samples. The darker shading shows the 5–95 and the lighter shading the 1–99 percentile ranges of the distributions. F Hemispheric lateralization indices (LIs) of the ITD- (bold, light-green lines) and ILD-only responses (bold, dark-green lines). 
The green-shaded areas represent the null distributions of the difference between the LIs for the ITD and ILD conditions, shifted to center on the average LI between the ITD and ILD conditions. Again, the darker shading shows the 5–95 and the lighter shading the 1–99 percentile ranges.

A permutation test (see “METHODS”) showed that, for the entire time range of the N1, and for the first part (65 %) of the time range of the P2, the scalp maps of the ITD- and ILD-only responses were not significantly different from each other (blue lines and shading in Fig. 5E). Towards the end of the P2, the difference reached significance at a level of p = 0.05, but not at the higher level of p = 0.01 used for all other statistical tests involving multiple comparisons over many time points.

The estimated source distributions of the ITD- and ILD-only responses overlapped by up to 87 % for the N1 and 81 % for the P2 (red lines and shading in Fig. 5E). A permutation test showed that, for the entire time range of the N1, and for the first part (41 %) of the P2 time range, the overlap was not significantly smaller than the overlap between two statistically identical distributions. Towards the end of the P2, the difference reached significance at p = 0.05, but not at p = 0.01.

Both the N1 and P2 showed a considerable degree of lateralization towards the right hemisphere (Fig. 5F). On average, the N1 was up to 2.8 and the P2 up to 2.1 times larger in the right than the left hemisphere. A permutation test showed that, for the entire time range of the N1, and for the first part (69 %) of the time range of the P2, the lateralization indices of the ITD- and ILD-only responses were not significantly different from each other; towards the end of the P2, the difference reached significance at p = 0.05, but not at p = 0.01.

Combined ITD and ILD change response

In order to test whether there was functional interaction between the generators of the ITD and ILD change responses, we compared the response to the combined ITD and ILD change with the responses to the respective ITD- and ILD-only conditions (i.e. small ITD and ILD). Figure 6A shows that the sum of the responses to the small ITD and ILD conditions was almost twice as large as the response to the combined condition. The significance of this difference was confirmed by bootstrapping the GFP of the difference between the two responses. The combined response could be best described as a weighted sum of the small ITD and ILD responses (Fig. 6B). The GFP of the difference between the combined and weighted-sum responses, referred to as “residual”, was minimized when the small ITD response was weighted by 0.63, and the small ILD response was weighted by 0.42. For these weights, the residual was comparable to the noise floor.
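
One way to obtain such weights is an ordinary least-squares fit, sketched below. This is an assumption about the fitting approach, not a description of the authors' procedure: the sketch minimizes the summed squared residual over all channels and time points, which for average-referenced data is equivalent to minimizing the mean squared GFP of the residual. The function name and the toy data are hypothetical.

```python
def fit_weights(combined, itd, ild):
    """Least-squares weights (w_itd, w_ild) minimizing the squared residual
    combined - (w_itd * itd + w_ild * ild).

    Each argument is a flat list of voltages, e.g. all channels and time
    points of a grand average response concatenated in the same order.
    """
    s_aa = sum(a * a for a in itd)
    s_bb = sum(b * b for b in ild)
    s_ab = sum(a * b for a, b in zip(itd, ild))
    s_ac = sum(a * c for a, c in zip(itd, combined))
    s_bc = sum(b * c for b, c in zip(ild, combined))
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s_aa * s_bb - s_ab * s_ab
    w_itd = (s_ac * s_bb - s_bc * s_ab) / det
    w_ild = (s_bc * s_aa - s_ac * s_ab) / det
    return w_itd, w_ild

# Toy data constructed from known weights, to check that they are recovered.
itd = [1.0, -2.0, 0.5, 3.0]
ild = [0.5, 1.0, -1.5, 2.0]
combined = [0.63 * a + 0.42 * b for a, b in zip(itd, ild)]
w_itd, w_ild = fit_weights(combined, itd, ild)
print(round(w_itd, 2), round(w_ild, 2))  # 0.63 0.42
```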

FIG. 6

A GFP of grand average response to the combined ITD and ILD stimulus (red line). The black line shows the response that would have been expected if the ITD and ILD components of the stimulus were processed independently (sum of small ITD and ILD responses). The grey shading marks the time points where the GFP of the difference between the expected and actual responses was significantly different from zero. B The combined response (re-plotted in red) was well described by a weighted sum of the small ITD and ILD responses (black) with weights 0.63 and 0.42, respectively. The GFP of the difference between the weighted and combined responses, referred to as residual, is shown in blue.

We also calculated the scalp maps and estimated the source distributions of the combined ITD and ILD response and compared them with the scalp maps and source distributions of the average response to the large ITD and ILD conditions (which matched the lateralization of the combined condition). Figure 7A shows that both the DISS and source overlap between the combined and average large ITD and ILD conditions were of a similar order as those between the ITD- and ILD-only responses in the previous section (no permutation test was carried out in this case); the source distributions of the combined and large ITD and ILD responses overlapped by up to 82 % for the N1 and 90 % for the P2. Figure 7B shows that there was no systematic difference in hemispheric lateralization between the responses to the combined and average large ITD and ILD conditions on the one hand, and the average small ITD and ILD conditions on the other hand, despite the almost twofold difference in stimulus lateralization (the combined and large ITD and ILD responses may have been expected to be more strongly lateralized than the small ITD and ILD response).

FIG. 7

A Global dissimilarity of scalp maps (DISS; bold, blue lines) and source overlap (bold, red lines) between the combined (Cmb) and average large ITD and ILD (Lrg) responses, plotted as in Figure 5. No permutation analysis was performed in this case. B Lateralization indices of the combined (Cmb), average large ITD and ILD (Lrg), and average small ITD and ILD (Sml) responses (bold, colored lines; see legend). The thin black lines show the GFP of the respective responses, scaled to the maximum GFP of the combined response. The white background highlights the N1 and P2 time ranges.

Large ITD and ILD responses

The lateralization of the combined ITD and ILD was about twice as large as that of the individual cues presented separately (i.e. small ITD and ILD). Nevertheless, the response to the combined cues was barely larger than the responses to the individual cues, indicating a high degree of compression of the combined response size with respect to the stimulus laterality. In order to investigate whether this compression was a specific property of the combined stimulus, or applies more generally to strongly lateralized sounds, we compared the responses to the small ITD and ILD conditions with the responses to the ITD- and ILD-only conditions that matched the lateralization of the combined stimulus (large ITD and ILD). We found that the large ITD and ILD responses also showed compression, in that they were considerably smaller than the responses that would have been expected if there were a linear relationship between stimulus lateralization and response size (Fig. 8). The large ITD response was compressed to 0.69 times the expected linear response, and the large ILD response was compressed to 0.47 times the expected response. The compression was significant in both cases as confirmed by bootstrapping the GFP of the difference between the expected and actual responses. Note that the compression factors for the large ITD and ILD responses (0.69 and 0.47) are remarkably similar to the weighting factors used to model the response to the combined ITD and ILD condition with the small ITD and ILD responses (0.63 and 0.42, respectively). This indicates that compression was not special to the combined response, but applies more generally to strongly lateralized sounds.

FIG. 8

A GFP of the grand-average responses to the small (blue) and large (red) ITDs. The black line shows the large ITD response that would have been expected if the response size scaled linearly with the perceived lateralization. It was derived by multiplying the small ITD response by the average ratio between the large and small ITDs (see Fig. 2). The grey shading marks the time points where the GFP of the difference between the expected and actual responses to the large ITD was significantly different from zero. B The same analysis for the ILD conditions.

DISCUSSION

The aim of this study was to investigate whether the two cues for horizontal sound localization, ITDs and ILDs, are processed by separate or integrated codes in the human auditory cortex. To this end, we measured the responses to a change in ITD or ILD between an adapter and a probe stimulus and compared them with the response to the combined change. Both the ITD- and ILD-only changes elicited large responses. For most of the response duration, their scalp topographies were similar and their estimated source distributions were largely overlapping, suggesting that they were generated by overlapping populations of neurons. Both the ITD- and ILD-only responses were strongly lateralized to the right hemisphere, as would be expected, given that the evoking stimuli were lateralized to the left hemifield (e.g. McEvoy et al. 1994; Woldorff et al. 1999; Krumbholz et al. 2005, 2007). Importantly, the degree of hemispheric lateralization was similar, supporting the notion that the differences in hemispheric lateralization between ITD and ILD responses observed by Ungan et al. (2001) and Tardif et al. (2006) were due to differences in stimulus lateralization. There were some differences between the scalp maps and estimated source distributions of the ITD- and ILD-only responses over the falling flank of the P2 deflection. However, these differences have to be viewed with caution, because they resulted from multiple tests over many time points and carried only a low level of significance. Also, the differences coincided with relatively low amplitudes of one of the compared responses (ILD-only), which means that the comparison may have been influenced by noise.

The finding that the response to the combined ITD and ILD change was significantly different from the linear superposition of the ITD- and ILD-only responses suggests that the neuron populations that process ITDs and ILDs are not only overlapping, but also functionally coupled; a linear superposition of the ITD and ILD components would have been expected if ITDs and ILDs were processed by independent neurons. The combined ITD and ILD response was subadditive (i.e. smaller than the linear superposition of the ITD- and ILD-only responses). Similar subadditivity has been observed in the responses to multisensory compared to unisensory stimuli (e.g. Calvert et al. 2001; Calvert and Thesen 2004) and in the responses to binaural compared to monaural sounds (Gaumond and Psaltikidou 1991; Krumbholz et al. 2005; McPherson and Starr 1993; Riedel and Kollmeier 2002). The scalp maps and source distribution of the combined ITD and ILD response were similar to those of the ITD- and ILD-only responses.
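One way to quantify a departure from linear superposition is to fit the combined response as a weighted sum of the ITD- and ILD-only responses. The sketch below is an illustration under assumptions: the ordinary least-squares fit, the function name, and the array shapes are hypothetical, and the data are synthetic, with weights chosen to mirror the reported values.

```python
import numpy as np

def fit_superposition_weights(itd_erp, ild_erp, combined_erp):
    """Fit combined ~ w_itd * itd + w_ild * ild by ordinary least squares.

    All inputs are arrays of shape (n_channels, n_times); the fit pools
    every channel and time point (an assumption, not the authors' stated method).
    """
    design = np.column_stack([itd_erp.ravel(), ild_erp.ravel()])
    weights, *_ = np.linalg.lstsq(design, combined_erp.ravel(), rcond=None)
    return weights

# Synthetic illustration only; these are not the study's data.
rng = np.random.default_rng(1)
itd_only = rng.standard_normal((32, 200))
ild_only = rng.standard_normal((32, 200))
combined = 0.63 * itd_only + 0.42 * ild_only  # subadditive by construction

w_itd, w_ild = fit_superposition_weights(itd_only, ild_only, combined)

# Simple heuristic: strict linear superposition corresponds to both weights
# being 1, so weights below 1 indicate a subadditive combined response.
is_subadditive = (w_itd < 1.0) and (w_ild < 1.0)
```

Under this formulation, independent processing of the two cues would correspond to both weights being close to one, whereas weights below one reflect a combined response that is smaller than the sum of its components.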

Additional measurements with stimuli that matched the lateralization of the combined stimulus suggested that the subadditivity of the combined ITD and ILD response was due to a compressive relationship between stimulus lateralization and evoked response size; while the lateralization of the combined stimulus was almost double that of its ITD and ILD components, the size of the combined response was barely larger than that of each of the individual responses. Recent research suggests that, in humans, sound laterality is represented by a population rate code comprising two opponent populations broadly tuned to the left and right auditory hemifields (Salminen et al. 2009; Salminen et al. 2010; Magezi and Krumbholz 2010; Briley et al. 2013). Within the context of this opponent process model of sound lateralization, the compressive relationship between stimulus lateralization and response size would be assumed to reflect saturation of the opponent population responses towards larger sound lateralities. Neurophysiological recordings in animals suggest that each population response first increases steeply for lateralities close to the midline, but then reaches a broad maximum for larger lateralities (McAlpine et al. 2001; Stecker et al. 2003, 2005). Saturation of opponent population responses would also explain the finding that the combined and large ITD and ILD responses did not show a greater degree of hemispheric lateralization than the small ITD and ILD responses, despite the difference in stimulus lateralization.
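The saturation argument can be made concrete with a toy opponent-channel model. This is only a sketch under strong assumptions: the sigmoidal tuning functions, the slope value, the normalized laterality scale from -1 (far left) to 1 (far right), and the use of the summed absolute rate change as a proxy for response size are all hypothetical choices, not taken from the cited studies.

```python
import numpy as np

def population_rates(laterality, slope=4.0):
    """Rates of left- and right-tuned populations as sigmoids of laterality.

    laterality: -1 (far left) to 1 (far right), 0 = midline.
    The rates change steeply near the midline and flatten for large lateralities.
    """
    right = 1.0 / (1.0 + np.exp(-slope * laterality))
    left = 1.0 / (1.0 + np.exp(slope * laterality))
    return left, right

def change_response(from_lat, to_lat, slope=4.0):
    """Proxy for the response to a laterality change: the summed absolute
    rate change across both populations (an assumption)."""
    left0, right0 = population_rates(from_lat, slope)
    left1, right1 = population_rates(to_lat, slope)
    return abs(left1 - left0) + abs(right1 - right0)

small = change_response(0.0, -0.5)  # change from midline to a moderate leftward laterality
large = change_response(0.0, -1.0)  # change from midline to twice that laterality
ratio = large / small               # less than 2 when the population rates saturate
```

With these settings, doubling the laterality of the probe increases the modelled response by much less than a factor of two, which is the qualitative pattern that saturating opponent populations would produce.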

The combined ITD and ILD response appeared to be more strongly influenced by the ITD than by the ILD component (weighting ratio = 0.63:0.42 = 1.5). The current stimuli were filtered to only contain frequencies below 1 kHz, where fine structure ITDs can be perceived. In natural sounds, the perceptual weighting of ILDs tends to increase, and the weighting of ITDs tends to decrease, towards higher frequencies (Macpherson and Middlebrooks 2002). However, this is because natural low-frequency ILDs tend to be small (Wightman and Kistler 1992) and natural high-frequency ITDs tend to be less effective at eliciting lateralization than corresponding low-frequency ITDs (Bernstein and Trahiotis 2003), and thus does not mean that the weighting of ITDs and ILDs observed in the current study would necessarily have been different had the stimuli contained higher frequencies. When simulated over headphones, low- and high-frequency ITDs and ILDs can be made equally effective, and it has been shown that this also equalizes their perceptual weighting (Bernstein and Trahiotis 2004, 2005).

The similarity of the scalp topographies and source distributions of the ITD- and ILD-only responses, as well as the subadditivity of the response to the combined ITD and ILD change, suggests that the auditory cortex in humans contains an integrated code of ITDs and ILDs. However, the morphological differences between the ITD- and ILD-only responses, with later and smaller N1, and larger P2, deflections in the ITD-only responses, indicate that the auditory cortex retains at least some degree of independent information about ITDs and ILDs. It is possible, for instance, that the cortical processing of ITDs and ILDs is based on different, but interconnected populations of neurons, located within the same area. Alternatively, ITDs and ILDs may be processed by the same neurons, but these neurons receive input from separate ITD- and ILD-specific sources. The idea of an integrated code of ITDs and ILDs that retains some degree of cue-related information is consistent with findings from auditory psychophysics. For instance, Phillips et al. (2006) have shown that prolonged exposure to an adapting sound with a large ILD can shift the perceived lateral position of a probe sound with an ITD and vice versa. It would be difficult to conceive how such cross-adaptation between ITDs and ILDs could occur unless there were some integrated representation of the two cues. The notion of integrated processing is further supported by the finding of Phillips et al. (2002) that the auditory saltation illusion, whereby presentation of two sets of clicks at two lateral positions can lead to the illusory perception of a continuous movement of the clicks between the two positions, is immune to switching the lateralization cue (e.g. from ITD to ILD) between the two sets of clicks.
The idea of integrated processing of ITDs and ILDs is also consistent with the fact that ITDs and ILDs can, to a certain extent, be offset, or traded, against one another, such that a sound lateralized with an ITD can be centered by applying an opposing ILD and vice versa. However, it has been shown that trading is often imperfect, in that listeners presented with stimuli having opposing interaural time and level differences may perceive a single sound image, but that image may be dominated by one or the other cue (Hafter and Jeffress 1968), or they may perceive two images, one corresponding mainly to the ITD, and the other to the ILD (Whitworth and Jeffress 1961). This indicates that the auditory system retains independent information about ITDs and ILDs even up to the level of perception. Independent information about ITDs and ILDs would enable the brain to ascertain how consistent the cues are and thus how likely they are to have arisen from the same source. Independent ITD and ILD information would also enable the brain to exploit the full benefit that each cue confers when listening to speech in noisy environments (Edmonds and Culling 2005). It is possible that the degree of integration of ITDs and ILDs depends on how consistent the cues are. The ITDs and ILDs used in the current study were highly consistent and that may be why they were processed in an integrated fashion. It is possible that using less consistent, or even opposing, cues would have led to less integration and thus a lesser degree of subadditivity in the EEG response to the combined stimulus.