Abstract
Anticipatory attention results in enhanced responses to task-relevant stimuli and reduced processing of unattended input, suggesting the deployment of distinct facilitatory and suppressive mechanisms. α Oscillations are a suitable candidate for supporting these mechanisms. We aimed to examine the role of α oscillations, with a special focus on peak frequencies, in facilitatory and suppressive mechanisms during auditory anticipation, within the auditory and visual regions. Magnetoencephalographic (MEG) data were collected from fourteen healthy young human adults (eight female) performing an auditory task in which spatial attention to sounds was manipulated by visual cues that were either informative or uninformative about the target side. By incorporating uninformative cues, we could delineate facilitatory and suppressive mechanisms. During anticipation of a visually-cued auditory target, we observed a decrease in α power around 9 Hz in the auditory cortices and an increase around 13 Hz in the visual regions. Only this power increase in high α significantly correlated with behavior. Importantly, within the right auditory cortex, we showed a larger increase in high α power when attending an ipsilateral sound and a stronger decrease in low α power when attending a contralateral sound. In summary, we found facilitatory and suppressive attentional mechanisms with distinct timing in task-relevant and task-irrelevant brain areas, differentially correlated with behavior and supported by distinct α sub-bands. We provide new insight into the role of the α peak-frequency by showing that anticipatory attention is supported by distinct facilitatory and suppressive mechanisms, mediated in different low and high sub-bands of the α rhythm, respectively.
Significance Statement
We investigated the role of α oscillations, with a special focus on peak frequencies, in facilitatory and suppressive mechanisms during anticipation, using magnetoencephalographic (MEG) data collected during an auditory spatial attention task. We show, during anticipation of a visually-cued auditory target, a decrease in α power around 9 Hz in the auditory cortices, simultaneous with an increase around 13 Hz in the visual regions, the latter significantly correlated with behavioral performance. Within the right auditory cortex, we show a larger increase in high α when attending an ipsilateral sound and a stronger decrease in low α when attending a contralateral sound. Therefore, anticipatory attention would be supported by distinct facilitatory and suppressive mechanisms, mediated in different low and high α sub-bands, respectively.
Introduction
We spend a large fraction of our time anticipating stimuli (Requin et al., 1991) and to support this behavior, anticipatory attention promotes the processing of upcoming relevant stimuli, resulting in reduced brain responses to unattended inputs and enhanced processing of relevant information (for review, see Hillyard et al., 1998). These modulations of target processing suggest the deployment of distinct facilitatory and suppressive attentional mechanisms during target expectancy, similarly to the inhibitory and facilitatory mechanisms supporting selective attention (de Fockert and Lavie, 2001; Gazzaley et al., 2005, 2008; Bidet-Caulet et al., 2007, 2010, 2015; Chait et al., 2010; Slagter et al., 2016). However, little is known about the potential facilitatory and suppressive attentional mechanisms activated during anticipation of an upcoming stimulus.
Oscillations in the α band, loosely defined between 8 and 14 Hz, have been proposed to play a crucial role in anticipatory attention (for review, see Foxe and Snyder, 2011; Frey et al., 2015). Discovered in 1929 by Hans Berger (Berger, 1929), α oscillations were first considered a marker of cortical idling (Pfurtscheller et al., 1996). However, this idea has been challenged, with α oscillations being assigned an active inhibitory role in cognitive processing (Klimesch et al., 2007; Jensen and Mazaheri, 2010). The large literature in the visual modality paints a rather dynamic picture in which, during target expectation, α power decreases in visual areas responsible for processing the attended space while α power increases in (1) visual areas responsible for processing the unattended space with or without distracting stimuli (Kelly et al., 2006; Rihs et al., 2007, 2009), and (2) areas responsible for processing unattended modalities (Foxe et al., 1998; Fu et al., 2001; Gomez-Ramirez et al., 2011; Jiang et al., 2015). Therefore, α oscillations would be a suitable candidate for supporting facilitatory and suppressive mechanisms of anticipatory attention.
Interestingly, distinct frequency peaks (sub-bands) in the α band manifest as a function of cortical location and task demand (Haegens et al., 2014). In a similar vein, Mazaheri et al. (2013) compared α activity while participants were cued to either a visual or an auditory target, and found a decrease in α power around 10 Hz in visual regions concomitant to an increase around 15 Hz, in the vicinity of the right auditory cortex. Taken together, these results highlight the importance of considering the frequency peak within the α band.
Contrary to the visual domain, only a handful of studies investigated the impact of anticipatory attention on α modulations in the auditory cortices. A recurrent magnetoencephalographic (MEG) finding is an increased α power, solely in the right auditory cortex, when attention was directed toward the ipsilateral right ear compared to when directed toward the contralateral ear (Müller and Weisz, 2012) or non-spatially oriented (Weisz et al., 2014). These results demonstrate how α oscillations could be involved in the suppressive mechanisms of auditory anticipatory attention (see also Frey et al., 2014; Weise et al., 2016), but do not shed much light on their implication in facilitatory mechanisms.
We aimed to examine the role of α oscillations in attentional facilitatory and suppressive mechanisms during auditory anticipation, within the auditory cortices and also between the visual and auditory regions. For this purpose, we recorded MEG activity during an auditory task in which spatial attention to auditory targets was manipulated by visual cues, either informative or not of the target side. By incorporating spatially uninformative cues, we aimed to delineate facilitating and suppressive mechanisms supporting auditory anticipatory attention (Bidet-Caulet et al., 2010).
We hypothesized that during a spatial attention task, the balance between facilitatory and suppressive mechanisms of auditory anticipatory attention would be indexed by α activity following two main patterns. (1) A decrease in α power (reflecting inhibition release, i.e., facilitation) in task-relevant auditory areas would be concomitant to an increase in α power (reflecting inhibition/suppression) in task-irrelevant visual cortices. (2) Within the right auditory cortex, we expected a decrease in α power when attention is directed toward the contralateral ear and an increase in α power when attention is directed toward the ipsilateral ear, relative to when attention is not spatially oriented (uninformative cues). Also, if distinct suppressive and facilitating attentional mechanisms are activated during anticipation, they should be differentially correlated to behavioral performances. Finally, to gain further insight into the role of the peak-frequency in the α band (Haegens et al., 2014), we aimed to systematically investigate the effect of the frequency peak with the prediction that facilitatory and suppressive attentional mechanisms would be mediated in different α sub-bands.
Materials and Methods
Participants
Fourteen healthy participants (eight females) took part in this study. The mean age was 25 years ± 0.85 SEM. All participants were right-handed and reported normal hearing and normal or corrected-to-normal vision. All participants were free from any neurologic or psychiatric disorders. The study was approved by the local ethical committee. Subjects gave written informed consent, according to the Declaration of Helsinki, and were paid for their participation.
Stimuli and tasks
Competitive attention task (CAT)
In 75% of the trials, a target sound (100-ms duration) followed a central visual cue (200-ms duration) with a fixed delay of 1000 ms (Fig. 1). The cue was a green arrow, presented on a gray-background screen, pointing either to the left, right, or both sides. Target sounds were monaural pure tones (carrier frequency between 512 and 575 Hz; 5-ms rise time, 5-ms fall time). In the other 25%, the same structure was retained, but a binaural distracting sound (300-ms duration) was played during the cue-target delay (50- to 650-ms range). For the purpose of this study, only distractor-free trials were analyzed. The cue and target categories were manipulated in the same proportion for trials with and without distracting sound. In 33.3% of the trials, the cue was pointing left and the target sound was played in the left ear, and in 33.3% of the trials, the cue was pointing right and the target sound was played in the right ear, leading to a total of 66.6% of informative trials. In the last 33.3% of the trials, the cue was uninformative, pointing in both directions, and the target sound was played in the left (16.7%) or right (16.7%) ear.
Participants were instructed to categorize two target sounds as either high- or low-pitched, by either pulling or pushing a joystick. The mapping between the targets (low or high) and the responses (pull or push) was counterbalanced across participants, but did not change across the blocks for each participant. To account for the participants’ pitch-discrimination capacities, the pitch difference between the two target sounds was defined in a discrimination task (see below). Participants were informed that informative cues were 100% predictive and that a distracting sound could sometimes be played. They were asked to allocate their attention to the cued side in the case of an informative cue, to ignore the distractors, and to respond as quickly and correctly as possible. Participants had a 3.4-s (3400-ms) response window. In the absence of the visual cue, a blue fixation cross was presented at the center of the screen. Subjects were instructed to keep their eyes fixated on the cross and to minimize eye movements and blinks while performing the task.
Discrimination task
Participants were randomly presented with one of two target sounds: a low-pitched sound (512 Hz) and a high-pitched sound (575 Hz; two semitones higher), equiprobably in each ear (four trials per ear and per pitch). As described above, participants were asked to categorize the target sounds as either high- or low-pitched sound within 3 s.
Procedure
Participants were seated in a sound-attenuated, magnetically shielded recording room, at a 50-cm distance from the screen. The response device was an index-operated joystick that participants moved either toward them (when instructed to pull) or away from them (when instructed to push). All stimuli were delivered using Presentation software (Neurobehavioral Systems, RRID:SCR_002521). All sounds were presented through air-conducting tubes using Etymotic ER-3A foam earplugs (Etymotic Research, Inc.).
First, the auditory threshold was determined for the two target sounds differing by two semitones (512 and 575 Hz), in each ear, for each participant using the Bekesy tracking method (Von Békésy and Wever, 1960). The target sounds were then presented at 15-dB sensation level while the distracting sounds were played at 35-dB sensation level. Second, participants performed the discrimination task. If participants failed to respond correctly to >85% of the trials, the pitch of the high target sound was increased by half a semitone, with a maximum difference of three semitones between the two targets (auditory thresholds were then measured with the new targets). Afterward, participants were trained with a short sequence of the CAT. Finally, MEG and EEG were recorded while subjects performed 15 blocks (72 trials each). Each trial lasted 4.6–4.8 s, leading to a block duration of ∼5 min and a MEG/EEG session of ∼1 h 35 min (breaks included). After the MEG/EEG session, participants’ subjective reports regarding their strategies were collected.
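The relation between the semitone differences and the target frequencies quoted above can be checked with a short calculation. This is only an illustrative sketch: it assumes equal-tempered semitones (a frequency ratio of 2^(1/12) per semitone), which the text does not state explicitly, and the function name is hypothetical.

```python
def shift_semitones(freq_hz: float, semitones: float) -> float:
    """Frequency reached by shifting freq_hz by a number of semitones.

    Assumption: equal temperament, i.e., one semitone is a ratio of 2**(1/12).
    """
    return freq_hz * 2.0 ** (semitones / 12.0)


low_hz = 512.0  # low-pitched target from the text

# Default high target (two semitones) and the half-semitone steps
# up to the maximum allowed difference of three semitones.
for step in (2.0, 2.5, 3.0):
    print(f"{step} semitones above {low_hz:.0f} Hz: {shift_semitones(low_hz, step):.1f} Hz")

high_hz = shift_semitones(low_hz, 2.0)
max_high_hz = shift_semitones(low_hz, 3.0)
```

Under this assumption, two semitones above 512 Hz rounds to the 575 Hz quoted for the high-pitched target.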
Behavioral data analysis
For behavioral data analysis, a response was considered correct if it matched the response mapped to the target sound and was executed before the onset of the following cue. The influence of the factor cue (three levels: left, right, and uninformative) on the percentage of correct responses was tested using a linear mixed-effects model [lme4 package (Bates et al., 2014) for R (R Team, 2014; RRID:SCR_015654)].
For post hoc analysis, we used the Lsmean package (Lsmean version 2.20-23; Searle et al., 1980); p values were considered significant at p < 0.05 and were adjusted for the number of comparisons performed (Tukey method). Incorrect trials were excluded from further analysis, leaving on average 216 ± 6.92 (SEM) trials per cue condition per participant. The influence of the cue on the median reaction times (RTs) of the correct trials was tested using the same procedure.
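The scoring rule and the per-condition summaries described above can be sketched as follows. The trial records, field order, and function names are hypothetical and serve only to illustrate the logic; the mixed-effects modeling itself (performed in R by the authors) is not reproduced here.

```python
from statistics import median

# Hypothetical trial records:
# (cue, expected_response, given_response, rt_ms, next_cue_onset_ms)
trials = [
    ("left", "pull", "pull", 520.0, 3400.0),
    ("left", "push", "push", 480.0, 3400.0),
    ("left", "pull", "push", 610.0, 3400.0),           # wrong response
    ("right", "push", "push", 505.0, 3400.0),
    ("right", "pull", "pull", 495.0, 3400.0),
    ("uninformative", "pull", "pull", 640.0, 3400.0),
    ("uninformative", "push", "push", 600.0, 3400.0),
    ("uninformative", "push", "push", 3600.0, 3400.0),  # too late
]


def is_correct(expected, given, rt_ms, deadline_ms):
    """Correct = mapped response given before the following cue appears."""
    return given == expected and rt_ms < deadline_ms


def summarize(trials):
    """Percentage of correct responses and median RT of correct trials, per cue."""
    by_cue = {}
    for cue, expected, given, rt, deadline in trials:
        entry = by_cue.setdefault(cue, {"n": 0, "rts": []})
        entry["n"] += 1
        if is_correct(expected, given, rt, deadline):
            entry["rts"].append(rt)  # incorrect trials are excluded from RTs
    return {
        cue: {
            "pct_correct": 100.0 * len(e["rts"]) / e["n"],
            "median_rt": median(e["rts"]) if e["rts"] else None,
        }
        for cue, e in by_cue.items()
    }


summary = summarize(trials)
for cue, stats in summary.items():
    print(cue, stats)
```

The per-participant values produced this way would then be the dependent variables entered into the statistical models.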
Magnetoencephalography
Recordings
Simultaneous EEG and MEG data were recorded, although the EEG data will not be presented here. The MEG data were acquired with a 275-sensor axial gradiometer system (CTF Systems Inc.) with continuous sampling at a rate of 600 Hz, a 0- to 150-Hz filter bandwidth, and first-order spatial gradient noise cancellation. Moreover, eye-related movements were measured using vertical and horizontal EOG electrodes.
Head position relative to the gradiometer array was acquired continuously using coils positioned at three fiducial points: the nasion and the left and right pre-auricular points. Head position was checked at the beginning of each block to control head movements.
In addition to the MEG/EEG recordings, T1-weighted three-dimensional anatomic images were acquired for each participant using a 3T Siemens Magnetom whole-body scanner. These images were used for reconstruction of individual head shapes to create forward models for the source reconstruction procedures. The processing of these images was conducted using CTF’s software (CTF Systems Inc.).
Outline of the electrophysiological data analyses
The analyses reported here focused on modulations of oscillatory activity in the α band during top-down anticipatory attention, i.e., during the cue-target delay in trials with no distractor and a correct response. MEG data were pre-processed in the sensor space using the software package for electrophysiological analysis (ELAN Pack; Aguera et al., 2011). Further analyses were performed using Fieldtrip (www.fieldtriptoolbox.org; Oostenveld et al., 2011, RRID:SCR_004849), an open source toolbox for MATLAB (RRID:SCR_001622), custom-written functions and R (www.r-project.org; RRID:SCR_001905).
First, significant modulations of oscillatory activity in the α band after cue onset (cue-related activity) were assessed by contrasting post-cue activity against pre-cue activity in the sensor level time-frequency domain (see below, Sensor-level analysis).
Second, based on the sensor level results, two post-cue and one pre-cue time windows in two distinct frequency bands were chosen for source analyses (see below, Source-level analysis). Based on the results of post-/pre-cue contrast in the source domain, auditory and visual regions of interest (ROIs) were selected for further virtual electrode analysis, i.e., time-resolved estimation of source activity (see below, Defining ROIs and virtual electrodes). From these activities, we then specified the time courses, power spectrum, and the α peak frequency for each virtual electrode (see below, Reconstruction of source activity) and assessed the attentional modulations of the cue-related α activity (see below, Attentional modulation of α activity).
Third, correlation between RTs and cue-target activity was investigated in the sensor (see below, Correlation between α activity and behavioral data: sensor level) and source (see below, Correlation between α activity and behavioral data: source level) domains.
Data pre-processing
Head movements
As participants had an EEG cap on, head movements were relatively more difficult to control, in comparison to standard MEG procedures, where the participant’s head is relatively stabilized to the MEG dewar via an inflatable cushion. Thus, in reference to the first block, head positions in the following blocks exceeded the pre-determined threshold of ±1 cm. This would have compelled us to exclude a huge portion of the trials if all 15 blocks were concatenated together. Therefore, for each subject, data were organized in three groups of five blocks so that, within each group, differences in head positions, recorded at the beginning of each block, did not exceed a threshold of ±1 cm.
It is noteworthy that for data pre-processing and sensor level analysis (described below) trials from the three groups were concatenated. However, for source level and virtual electrode analyses (described below), each group was processed separately, and outputs were eventually averaged.
Pre-processing
Only correct trials were considered for electrophysiological analyses. Data segments contaminated with muscular activity or sensor jumps were excluded semi-manually using thresholds of 2200 and 10,000 femtoTesla, respectively. Independent component analysis (ICA) was applied to the bandpass-filtered (0.1–40 Hz) data to identify eye-related (blinks and saccades) and heart-related (ECG) artefacts. Subsequently, components (four on average) were removed from the non-filtered data via the inverse ICA transformation. Data were further notch filtered at 50, 100, and 150 Hz and high-pass filtered at 0.1 Hz.
Cue- and target-related activity
Sensor-level analysis
To investigate the dynamics of α power modulations after the visual cue, the oscillatory power of trials from the three cue conditions all together was calculated using Morlet wavelet decomposition with a width of four cycles per wavelet (m = 7; Tallon-Baudry and Bertrand, 1999) at center frequencies between 5 and 18 Hz, in steps of 1 Hz and 50 ms. Activity of interest (defined between 0 and 2 s post-cue and 7–15 Hz) was contrasted against mean baseline activity (−0.6 to −0.2 s pre-cue) using a nonparametric cluster-based permutation analysis (Maris and Oostenveld, 2007). In brief, this test first calculates paired t tests for each sensor at each time-frequency point, which are then thresholded at p < 0.05. Adjacent supra-threshold points are grouped into clusters, and the sum of t values within each cluster (Tsum) is retained. The procedure is repeated 1000 times on shuffled data in which the condition assignment within each individual is swapped randomly. On each permutation, the maximum Tsum (Tmax) is retained, yielding a distribution of 1000 Tmax values. From this distribution, the cluster probability of each empirically observed Tsum can be derived. Clusters are labeled as significant with p ≤ 0.05. Note that, for this test, cluster permutations control for multiple comparisons in the time, frequency, and sensor space dimensions.
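The logic of the cluster-based permutation test can be illustrated with a simplified sketch. It is not the FieldTrip implementation used in the study: it assumes a single dimension (time only, no sensor or frequency adjacency), a fixed critical t value for 14 participants, and random sign flipping of the paired differences as the equivalent of swapping condition labels within each participant. All names and the simulated data are hypothetical.

```python
import math
import random

T_CRIT = 2.160  # two-sided critical t for p < 0.05 with df = 13 (14 participants)


def t_values(diffs):
    """One-sample t value per time point; diffs[s][t] = post-cue minus baseline."""
    n = len(diffs)
    out = []
    for t in range(len(diffs[0])):
        col = [d[t] for d in diffs]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        out.append(mean / math.sqrt(var / n))
    return out


def cluster_sums(tvals, t_crit=T_CRIT):
    """Sum t values over runs of adjacent, same-sign, supra-threshold points."""
    sums, current, sign = [], 0.0, 0
    for t in tvals:
        s = (t > t_crit) - (t < -t_crit)  # +1, -1, or 0 (below threshold)
        if s != 0 and s == sign:
            current += t
        else:
            if sign != 0:
                sums.append(current)
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        sums.append(current)
    return sums


def cluster_permutation_test(diffs, n_perm=1000, seed=0):
    """Return (Tsum, p) for each observed cluster, using the max-|Tsum| null."""
    rng = random.Random(seed)
    observed = cluster_sums(t_values(diffs))
    max_null = []
    for _ in range(n_perm):
        flipped = []
        for d in diffs:
            sign = rng.choice((-1.0, 1.0))  # swap conditions within a participant
            flipped.append([sign * x for x in d])
        sums = cluster_sums(t_values(flipped))
        max_null.append(max((abs(s) for s in sums), default=0.0))
    return [(s, sum(m >= abs(s) for m in max_null) / n_perm) for s in observed]


# Simulated data: 14 participants x 20 time points, with a power decrease
# at time points 8-12 and Gaussian noise elsewhere.
data_rng = random.Random(1)
diffs = [[data_rng.gauss(-1.0 if 8 <= t <= 12 else 0.0, 0.5) for t in range(20)]
         for _ in range(14)]
results = cluster_permutation_test(diffs)
for tsum, p in results:
    print(f"cluster Tsum = {tsum:.1f}, p = {p:.3f}")
```

Because only the maximum cluster statistic of each permutation enters the null distribution, a single test covers all points at once, which is how the procedure controls for multiple comparisons.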
Source-level analysis
To elucidate the possible brain regions underlying the sensor-level α modulations, we defined two post-cue (0.2–0.6 and 0.6–1.0 s) and one pre-cue (−0.6 to −0.2 s) time windows in two different frequency bands (9 ± 2 and 13 ± 2 Hz). These time-frequency windows were chosen based on the results of the statistical contrast at the sensor level.
To estimate the brain regions driving activity in these time-frequency windows, we used the frequency-domain adaptive spatial filtering technique of dynamic imaging of coherent sources (DICS; Gross et al., 2001). Data from all conditions within each group of blocks were concatenated, and cross-spectral density (CSD) matrices (−0.7 to 2 s, relative to cue onset) were calculated using the multitaper method with a target frequency of 11 (±4) Hz.
For each participant, an anatomically realistic single-shell head model based on the cortical surface was generated from individual head shapes (Nolte, 2003). A grid with 0.5-cm resolution was normalized on an MNI template, and then morphed into the brain volume of each participant. Leadfields for all grid points along with the CSD matrix were used to compute a common spatial filter that was used to estimate the spatial distribution of power for all time-frequency windows of interest per group of blocks. For each participant, these power distributions were averaged across the three groups of blocks. Afterward, each post-cue window was contrasted against a corresponding baseline pre-cue window using a nonparametric cluster-based permutation analysis (Maris and Oostenveld, 2007). For this test, cluster permutations control for multiple comparisons in the source space dimension.
Defining ROIs and virtual electrodes
The aforementioned source-level analysis provides a snapshot picture of underlying cortical activity. To go a step further, we defined virtual electrodes within ROIs, for the purpose of resolving the time course of activity at the source level. The source space was subdivided into 116 anatomically defined ROIs according to the macroscopic anatomic parcellation of the MNI template using the automated anatomic labeling (AAL) map (Tzourio-Mazoyer et al., 2002). We limited our analysis to four auditory regions, namely the left and right Heschl gyri (HG) and superior temporal gyri (STG), and two occipital regions (left and right middle/superior gyri). For each auditory region, virtual electrodes were defined as the average of five neighboring voxels exhibiting the strongest α power modulations, i.e., the highest t values in the source-level baseline contrast in the 0.6- to 1-s (relative to cue onset) and 7- to 11-Hz time-frequency window. The same procedure was used for the occipital regions; however, voxels were chosen based on the highest t values in the source-level baseline contrast in the 0.6- to 1-s (relative to cue onset) and 11- to 15-Hz time-frequency window.
Reconstruction of source activity
To get a time-resolved estimation of source activity, we computed the time-frequency signal at the virtual electrode level (defined above) using a linearly constrained minimum variance (LCMV) beamformer (Van Veen et al., 1997). Spatial filters were constructed from the covariance matrix of the averaged single trials at sensor level (−0.7 to 2 s, relative to cue onset, 1–20 Hz, λ 15%) and the respective leadfield. Afterward, spatial filters were multiplied by the sensor level data to obtain the time course of activity at each voxel of interest. Activity was averaged across the five voxels defined for each ROI (see section above) and for each hemisphere. Moreover, activity was averaged across the two auditory ROIs (HG and STG), thus limiting our analysis to four ROIs (one auditory and one occipital in each hemisphere).
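For reference, the spatial filter of an LCMV beamformer in its standard unit-gain form (Van Veen et al., 1997) can be written as below. The notation is introduced here for illustration only: L(r) is the leadfield at source location r, C the sensor covariance matrix, x(t) the sensor data, and N the number of sensors. The regularized covariance shown is an assumption about how a percentage regularization parameter λ is commonly applied (scaled by the mean sensor variance); the text does not specify the exact form used.

```latex
\mathbf{W}(\mathbf{r}) =
  \left[ \mathbf{L}(\mathbf{r})^{\top} \mathbf{C}_{\lambda}^{-1} \mathbf{L}(\mathbf{r}) \right]^{-1}
  \mathbf{L}(\mathbf{r})^{\top} \mathbf{C}_{\lambda}^{-1},
\qquad
\hat{\mathbf{s}}(\mathbf{r}, t) = \mathbf{W}(\mathbf{r}) \, \mathbf{x}(t),
\qquad
\mathbf{C}_{\lambda} = \mathbf{C} + \lambda \, \frac{\operatorname{tr}(\mathbf{C})}{N} \, \mathbf{I}
\quad \text{(assumed regularization, e.g., } \lambda = 0.15\text{)}
```

The second expression corresponds to the step described above in which the spatial filters are multiplied by the sensor-level data to obtain the source time course.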
For each ROI, we subtracted the evoked potential (i.e., the signal averaged across all trials) from each trial. Subsequently, time-frequency power was calculated in the same manner as in the sensor level analysis using Morlet Wavelet decomposition.
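The two steps of this paragraph, removal of the evoked response followed by Morlet wavelet power estimation, can be sketched in plain Python. The wavelet follows the form given by Tallon-Baudry and Bertrand (1999), with temporal width σt = m / (2πf0); the default m = 7 is the value quoted in the sensor-level analysis. The sampling rate, the ±3σt truncation, the zero padding at the edges, and all function names are assumptions made for this example.

```python
import cmath
import math


def remove_evoked(trials):
    """Subtract the across-trial average from every trial (induced activity)."""
    n = len(trials)
    evoked = [sum(tr[i] for tr in trials) / n for i in range(len(trials[0]))]
    return [[x - e for x, e in zip(tr, evoked)] for tr in trials]


def morlet(f0, sfreq, m=7.0):
    """Complex Morlet wavelet at f0 Hz, sampled at sfreq Hz."""
    sigma_t = m / (2.0 * math.pi * f0)            # temporal standard deviation
    half = int(round(3.0 * sigma_t * sfreq))      # truncate at +/- 3 sigma_t
    norm = (sigma_t * math.sqrt(math.pi)) ** -0.5
    return [norm
            * math.exp(-((k / sfreq) ** 2) / (2.0 * sigma_t ** 2))
            * cmath.exp(2j * math.pi * f0 * k / sfreq)
            for k in range(-half, half + 1)]


def wavelet_power(signal, f0, sfreq, m=7.0):
    """Squared magnitude of the signal convolved with the wavelet."""
    w = morlet(f0, sfreq, m)
    half = len(w) // 2
    power = []
    for i in range(len(signal)):
        acc = 0j
        for k, wk in enumerate(w):
            j = i - (k - half)                    # convolution index
            if 0 <= j < len(signal):              # zero padding at the edges
                acc += wk * signal[j]
        power.append(abs(acc / sfreq) ** 2)
    return power


# Hypothetical 2-s test signal: a 9-Hz sinusoid sampled at 200 Hz.
sfreq = 200.0
signal = [math.sin(2.0 * math.pi * 9.0 * i / sfreq) for i in range(400)]
mid = len(signal) // 2
p9 = wavelet_power(signal, 9.0, sfreq)[mid]
p13 = wavelet_power(signal, 13.0, sfreq)[mid]
print(f"power at 9 Hz: {p9:.4f}, power at 13 Hz: {p13:.6f}")
```

In this toy case the 9-Hz wavelet captures far more power than the 13-Hz wavelet, which illustrates why averaging over 7–11 Hz and 11–15 Hz can separate the two α sub-bands.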
To visualize the different profiles observed on both sensor and source levels, α power (computed using Morlet Wavelets) was averaged between 7 and 11 Hz, and between 11 and 15 Hz, for auditory and visual regions, separately, to extract the time course of α activity in these two α sub-bands.
In addition, α power (computed using Morlet wavelets) was averaged between 0.6 and 1 s for each ROI, to extract the power spectrum in each subject. Afterward, the individual α peak frequency (iAPF) was defined separately for auditory and visual regions, in each subject. For auditory virtual electrodes, the peak was defined as the frequency with the maximum α power decrease relative to the baseline (−0.6 to −0.2 s pre-cue onset) between 5 and 15 Hz. For visual virtual electrodes, the peak was defined as the frequency with the maximum α power increase relative to the baseline. The median APFs across subjects and hemispheres were 9 and 13 Hz in the auditory and visual virtual electrodes, respectively.
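The peak-frequency definition above amounts to locating the extremum of the baseline-corrected power spectrum within the search range. The sketch below is illustrative only: it assumes that "relative to the baseline" means the relative change (post − baseline) / baseline, and the spectra are simulated rather than taken from the data.

```python
import math


def relative_change(post, base):
    """Baseline-corrected power per frequency (assumed: relative change)."""
    return [(p - b) / b for p, b in zip(post, base)]


def peak_frequency(freqs, post_power, base_power, direction, fmin=5.0, fmax=15.0):
    """Frequency of the largest decrease or increase within [fmin, fmax]."""
    change = relative_change(post_power, base_power)
    candidates = [(c, f) for c, f in zip(change, freqs) if fmin <= f <= fmax]
    pick = min if direction == "decrease" else max
    return pick(candidates)[1]


# Simulated spectra (5-18 Hz in 1-Hz steps), flat baseline.
freqs = [float(f) for f in range(5, 19)]
baseline = [1.0] * len(freqs)
auditory_post = [1.0 - 0.3 * math.exp(-((f - 9.0) ** 2) / 8.0) for f in freqs]
visual_post = [1.0 + 0.4 * math.exp(-((f - 13.0) ** 2) / 8.0) for f in freqs]

auditory_peak = peak_frequency(freqs, auditory_post, baseline, "decrease")
visual_peak = peak_frequency(freqs, visual_post, baseline, "increase")
print(auditory_peak, visual_peak)
```

Applying such a function per subject and hemisphere and then taking the median yields the group-level peak frequencies used in the subsequent model.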
Attentional modulation of α activity
A linear mixed-effects model (lme) was fit to predict modulation of α activity in auditory virtual electrodes only, between 600 and 1000 ms (relative to cue onset), with the following factors as fixed effects: (1) cue laterality according to the auditory cortices (three levels: ipsilateral, contralateral, and uninformative); (2) hemisphere (two levels: left and right); and (3) frequency (two levels: 9 and 13 Hz). A random effect was included for each participant, allowing us to model variability between participants. The chosen frequencies were the median APFs calculated in the previous analysis (see above, Reconstruction of source activity). As in the behavioral analysis, we used the Lsmean package for post hoc analysis.
Correlation between α activity and behavioral data: sensor level
As a final step, and to assess the relationship between the cue-related changes in α power, in the sensor space, and RTs, correlation topographies were created (Mazaheri et al., 2013). First, we performed a trial-by-trial correlation, using non-parametric Spearman tests, in each participant, between RTs and post-cue α power at each time frequency point (between 6 and 16 Hz, by steps of 1 Hz, and between 0 and 1200 ms post-cue onset, by steps of 50 ms) for each sensor, to create topographies of the correlation (Mazaheri et al., 2013). The correlation coefficients were subsequently converted to z values using Fisher’s r- to z-transformation to obtain a normally distributed variable. The statistical significance of the correlations was assessed at the group level with a one-sample t test of the correlation z values at each sensor and each time-frequency point, and then subjected to a cluster-level randomization test to correct for multiple comparisons in the sensor space, time, and frequency dimensions.
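The chain of operations described above (Spearman correlation per participant, Fisher transformation, group-level one-sample t test) can be sketched for a single sensor and time-frequency point. The simulated data, in which higher α power goes with faster responses, and all function names are hypothetical; the cluster-level correction is not included.

```python
import math
import random


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def fisher_z(r):
    """Fisher r-to-z transformation."""
    return math.atanh(r)


def one_sample_t(values):
    """t statistic of the mean against zero."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean / (sd / math.sqrt(n))


# Simulated data: 14 participants, 50 trials each.
rng = random.Random(0)
z_scores = []
for _ in range(14):
    power = [rng.gauss(0.0, 1.0) for _ in range(50)]
    rts = [600.0 - 20.0 * p + rng.gauss(0.0, 50.0) for p in power]
    z_scores.append(fisher_z(spearman(power, rts)))

t_group = one_sample_t(z_scores)
print(f"group-level t = {t_group:.2f}")
```

A negative group-level t value, as in this simulation, corresponds to the pattern where higher α power is associated with shorter RTs.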
Correlation between α activity and behavioral data: source level
To assess the relationship between cue-related changes in α power and RTs in source space, single-trial α activity was reconstructed at each grid point using a partial canonical correlation (PCC) beamformer, a more computationally efficient alternative to the DICS beamformer. Afterward, we performed a trial-by-trial correlation, using non-parametric Spearman tests, in each participant, between RTs and post-cue α power (between 10 and 16 Hz, and between 900 and 1200 ms, according to the sensor level results) at each grid point (Mazaheri et al., 2013). The correlation coefficients were subsequently converted to z values using Fisher’s r- to z-transformation to obtain a normally distributed variable. The statistical significance of the correlations was assessed at the group level with a one-sample t test of the correlation z values at each grid point and then subjected to a cluster-level randomization test to correct for multiple comparisons in the source space dimension.
Power analysis
To assess the statistical robustness of our tests (see above, Behavioral data analysis and Attentional modulation of α activity), we ran sensitivity power analyses using the G*Power software (Faul et al., 2007, 2009), with a power of 0.8, an α error of 0.05, and a correlation of 0.5 among repeated measures. For all analyses based on linear mixed-effects models, the sensitivity power analysis was run for a repeated-measures ANOVA, as an approximation. Results are detailed in the relevant sections.
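The comparison between an observed and a required effect size can be made explicit with the standard conversion from η² to Cohen's f (Cohen, 1988), f = sqrt(η² / (1 − η²)). Whether the reported f values were obtained in exactly this way is an assumption; the required f itself comes from G*Power and is simply taken as an input here. The example uses the η² of 0.7 and the required f of 0.35 reported for the RT analysis in the Results.

```python
import math


def cohens_f_from_eta_squared(eta_sq: float) -> float:
    """Cohen's f from eta squared: f = sqrt(eta_sq / (1 - eta_sq))."""
    return math.sqrt(eta_sq / (1.0 - eta_sq))


def exceeds_required_effect(eta_sq: float, required_f: float) -> bool:
    """True if the observed effect is at least as large as the required one."""
    return cohens_f_from_eta_squared(eta_sq) >= required_f


observed_f = cohens_f_from_eta_squared(0.7)  # eta squared of the RT cue effect
print(f"observed f = {observed_f:.2f}, required f = 0.35, "
      f"sufficient: {exceeds_required_effect(0.7, 0.35)}")
```

With these inputs the conversion gives an f of about 1.53, close to the value reported for that test and well above the required effect size.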
Results
Behavioral analysis
The percentage of correct responses (on average: 96.05 ± 0.73 SEM) was not significantly modulated by the cue category. For the median RTs, as shown in Figure 2, we found a significant main cue effect (F(2,26) = 31.5, p < 0.01, η2 = 0.7). The reported effect size (f; Cohen, 1988) of this test is 1.52, which exceeds the required effect size of 0.35 as calculated by the G*Power software.
Post hoc tests indicated that participants were faster when the cue was informative (either right or left) in comparison to the uninformative cue (p < 0.01). No significant differences were found between the left and right cue conditions.
Cue- and target-related α activity: sensor level analysis
On contrasting post-cue activity to baseline activity, two profiles centered on two distinct frequencies (9 and 13 Hz; low and high α) were distinguished. In the low α frequencies, a widespread decrease lasted between cue onset and 600 ms (post-cue-onset; early period). Later on, this activity was spatially focused to left temporo-parietal sensors just before the target onset (late period). Simultaneously, in the high α frequencies, the early period displayed an occipitally focalized decrease followed by an increase that spreads to right temporal sensors in the late period (Fig. 3).
Cue- and target-related α activity: source level analysis
Sources of these activities were estimated and contrasted to the baseline window. In the early period (200–600 ms), a general decrease of the low-α can be observed in several occipital, temporal and central brain regions, bilaterally (Table 1). However, in the same time period, at higher α frequency, this decrease was restricted to regions dedicated to visual processing in the occipital and temporal lobes (Table 1). In the late period, the low-α decrease became more restricted to the auditory regions in the temporal cortices, e.g., bilateral HG and STG, and to motor areas (Table 1). However, at higher frequencies, an α increase was maximal in occipital, parietal and temporal regions dedicated to visual processing, and in parietal regions (Table 1).
ROI analysis: α time course and peak frequency
At the virtual electrode level, we were able to confirm that the time-frequency profiles of both auditory and visual ROIs are consistent with the profiles observed at the sensor level (Fig. 4A). We could also confirm, at the virtual electrode level, the frequency differences that were observed at the sensor level. Indeed, the median α frequency peak across subjects was 9 Hz in auditory cortices and 13 Hz in visual cortices (Fig. 4B). Moreover, as can be observed in Figure 4C, these α peak frequencies were well circumscribed within the 7- to 15-Hz α band.
ROI analysis: attentional modulations of α activity
To investigate the modulation of α activity in auditory virtual electrodes, an lme model was used with three factors: (1) cue laterality according to the auditory cortices (three levels: ipsilateral, contralateral, and uninformative); (2) hemisphere (two levels: left and right); and (3) frequency (two levels: 9 and 13 Hz).
The lme model yielded several significant main effects and interactions (listed in Table 2 with the interaction of interest in bold font). The highest-order significant interaction of interest is the three-way interaction between cue laterality, hemisphere, and frequency (F(2,26) = 3.07, p = 0.04, η2 = 0.17). The reported effect size (f) of this test is 0.65, which exceeds the required effect size of 0.28 as calculated by the G*Power software.
To elucidate this interaction, we performed post hoc lme models testing the influence of the cue laterality (three levels: ipsilateral, contralateral, and uninformative) and hemisphere (two levels: left and right), for each frequency (9 and 13 Hz), since we aimed to shed more light on the role of peak frequencies in α modulations (Fig. 5).
At 9 Hz (low α), only the two-way interaction between cue laterality and hemisphere (F(2,26) = 5.2, p = 0.005, η2 = 0.17) reached significance. The reported effect size (f) of this test is 0.45, while the required effect size as calculated by the G*Power software was 0.23. Two-by-two post hoc testing revealed that in the right hemisphere (auditory cortex), α power was significantly lower in the contralateral cue condition, in comparison to the ipsilateral and uninformative cue conditions (p = 0.004 and p = 0.01, respectively). No significant effects were found in the left hemisphere (Fig. 5). In summary, a facilitatory effect on the low α power was found in the right auditory cortex for the contralateral cue.
At 13 Hz (high α), only the two-way interaction between cue laterality and hemisphere (F(2,26) = 4.95, p = 0.007, η2 = 0.1) reached significance. The reported effect size (f) of this test is 0.33, while the required effect size as calculated by the G*Power software was 0.28. Two-by-two post hoc testing revealed that in the right hemisphere (auditory cortex), α power was significantly higher in the ipsilateral cue condition, in comparison to the uninformative cue condition (p = 0.007), but not to the contralateral cue condition (p = 0.16). No significant effects were found in the left hemisphere (Fig. 5).
In summary, a suppressive effect on the high α power was found in the right auditory cortex for the ipsilateral cue.
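The effect sizes (f) reported for the 9- and 13-Hz post hoc models are consistent with the standard conversion from η2 to Cohen's f, f = sqrt(η2 / (1 − η2)). That this conversion was the one applied is an assumption; the sketch below only checks the arithmetic, and the function name `cohens_f` is hypothetical.

```python
import math


def cohens_f(eta_squared):
    """Convert an eta-squared value to Cohen's f (standard formula)."""
    if not 0.0 <= eta_squared < 1.0:
        raise ValueError("eta squared must lie in [0, 1)")
    return math.sqrt(eta_squared / (1.0 - eta_squared))


# Eta-squared values quoted in the text for the two post hoc interactions.
for label, eta2 in (("9 Hz", 0.17), ("13 Hz", 0.10)):
    print(label, round(cohens_f(eta2), 2))
```

With η2 = 0.17 the conversion gives f ≈ 0.45, and with η2 = 0.1 it gives f ≈ 0.33, matching the values reported for the two post hoc tests.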
Correlation between α activity and behavioral data
At the sensor level, pre-target activity between 0.9 and 1.2 s (relative to cue onset) in the 10- to 15-Hz frequency band at a cluster centered around right occipital and parietal sensors was found to negatively correlate with RTs (p = 0.001). In other words, the higher an individual's α power in that cluster, the faster that participant responded. At the source level, α activity between 0.9 and 1.2 s (relative to cue onset) and 10–16 Hz, mainly in the left and right superior occipital gyri, the left middle occipital gyrus, the right calcarine, and the right postcentral gyrus, was found to negatively correlate with RTs (p = 0.01; Fig. 6).
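To make the direction of this brain-behavior relationship concrete, the sketch below computes a correlation coefficient between per-participant α power and RT. This section does not state which correlation statistic was used, so Pearson's r is an assumption here; the values are invented toy numbers, not data from the study, and `pearson_r` is a hypothetical helper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy values (illustrative only): one alpha power change and one median RT
# per hypothetical participant.
toy_alpha_power = [0.05, 0.10, 0.12, 0.18, 0.22, 0.30]
toy_median_rt_ms = [640, 610, 625, 580, 570, 540]

r = pearson_r(toy_alpha_power, toy_median_rt_ms)
print(f"r = {r:.2f}")
```

A negative coefficient, as in this toy example, corresponds to the pattern described above: participants with higher pre-target α power in the cluster respond faster.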
Discussion
In this study, we have demonstrated that (1) anticipating a visually-cued auditory target differentially modulates α power in the auditory and visual cortices; (2) these modulations occur within different α sub-bands; (3) modulations in the right auditory cortex (facilitation and suppression) also occur within different α sub-bands; and (4) RTs to the auditory target correlate with the α power increase in the visual cortices.
Behavioral measure of top-down anticipatory attention
Participants identified the target pitch faster in trials with an informative cue, in agreement with several previous studies (Golob et al., 2002; Bidet-Caulet et al., 2015). This effect is most likely related to differences in anticipatory attention, since the informative cue provided additional information solely about the location of the target and not about its category or its mapped response; motor preparation was thus equivalent across all conditions.
Distinct profiles of α activity in visual and auditory regions
In line with our hypothesis, anticipating an auditory target modulated α power differently in the auditory and visual cortices, following different patterns. In the auditory cortex, after the visual cue onset, low-frequency α (∼9 Hz) power continuously decreased until target onset. Simultaneously, in the visual cortices, a transient decrease in low and high-frequency (∼13 Hz) α power between 200 and 600 ms post-cue onset was followed by a power increase, uniquely in high α, before target onset.
According to recent hypotheses, α oscillations reflect regulation of cortical excitability (Klimesch et al., 2007; Jensen and Mazaheri, 2010; Foxe and Snyder, 2011). This regulation would be supported by α power increases in task-irrelevant regions and by α power decreases in task-relevant regions. In line with previous findings in the visual (Sauseng et al., 2005; Thut, 2006), somatosensory (Haegens et al., 2012), auditory (Müller and Weisz, 2012; Weisz et al., 2014), and audiovisual (Mazaheri et al., 2013; Frey et al., 2014; Van Diepen and Mazaheri, 2017) domains, we have found that anticipating an auditory target resulted in (1) a decrease in α power, possibly leading to increased excitability, in task-relevant auditory cortical regions, simultaneous with (2) an increase in α power, probably reducing excitability, in task-irrelevant visual regions.
Top-down modulation of α activity in the auditory cortex
A scant literature (Gomez-Ramirez et al., 2011; Weisz et al., 2011, 2014; Müller and Weisz, 2012; Frey et al., 2014; Weise et al., 2016), mostly using MEG, exists on α generators in the auditory cortices, probably because of the limitations of the EEG technique in capturing their activity (Frey et al., 2014). In the present study, using MEG, we could show not only that α activity in the auditory cortices is modulated according to the visual cue information, but also that these modulations occur within different α sub-bands.
In the auditory cortices, to optimize the processing of an upcoming monaural sound, two phenomena might be expected: (1) an inhibition (increase in α activity) in the auditory cortex ipsilateral to the attended side, and (2) a pre-activation (or released inhibition, i.e., decrease in α power) in the contralateral auditory cortex. The question is: which of these two modulations (down- or upregulation) would drive anticipatory attention? By incorporating an uninformative cue condition, we could delineate these facilitatory and suppressive mechanisms.
We observed α power modulations according to the visual cue information, in the right auditory cortex, only. At lower α frequencies (∼9 Hz), we found a decrease in α power (relative to the baseline), in the three cue conditions (contralateral, ipsilateral and uninformative). Importantly, this decrease was most prominent when a contralateral sound was expected rather than an ipsilateral or a spatially non-cued sound. On the other hand, at higher α frequencies (∼13 Hz), an increase in α power (relative to the baseline) was observed in all conditions. Interestingly, this increase was more prominent when an ipsilateral, rather than a spatially non-cued target was expected.
The present results corroborate previous findings (Müller and Weisz, 2012; Weisz et al., 2014) showing that the right auditory cortex plays a special role in auditory spatial attention. We extend these findings by demonstrating that the excitability of the right auditory cortex can be both (1) downregulated for processing an ipsilateral right-ear sound and (2) upregulated for processing a contralateral left-ear sound. Importantly, to our knowledge, the present study is the first one to demonstrate that these modulations occur at different α frequencies, suggesting that the dynamic equilibrium between suppressive and facilitatory mechanisms of auditory anticipatory attention would be supported by different high and low α sub-bands, respectively.
Finally, α activity in the left auditory cortex was not modulated by top-down attention. This asymmetry could be interpreted in the light of the right hemispheric specialization in pitch processing (Milner, 1962; Zatorre and Belin, 2001; Zatorre et al., 2002; Lattner et al., 2005; Hyde et al., 2008). Since participants performed a pitch categorization task, the right auditory cortex would be more relevant for target sound processing and thus more influenced by top-down attention. The asymmetry of α activity modulations could also be interpreted in the light of the right hemispheric dominance in spatial attention that has been illustrated for the auditory (Zatorre and Penhune, 2001; Spierer et al., 2009) and visual (Nobre et al., 1997; Corbetta and Shulman, 2002) modalities. This dominance would reflect a functional asymmetry in auditory processing, wherein the left auditory cortex preferentially processes sounds within the contralateral egocentric space, whereas the right auditory cortex processes the entire acoustic space (Spierer et al., 2011).
Correlation between α activity and behavioral performances
We found that the higher the α power in the occipital cortices, the faster participants correctly discriminated the upcoming target sound. In other words, the stronger the inhibition of task-irrelevant regions, the faster the subjects. This result is in line with previous findings that behavioral performance correlates with the increase in α power (Haegens et al., 2012) and reinforces the hypothesis that α oscillations exert an inhibitory role (Jensen and Mazaheri, 2010; Klimesch, 2012). Importantly, this correlation between an increase in α power in irrelevant brain regions and behavior was significant only in the higher α frequencies (10–15 Hz), providing further evidence for a specificity of the high α sub-band in suppressive attentional mechanisms.
In contrast to the present findings, a positive correlation between α power in the auditory cortices and RTs in a sound discrimination task was found in a previous study (Mazaheri et al., 2013). However, differences between the two studies might explain this discrepancy. First, in their study, spatial attention was not manipulated, i.e., the auditory target was always binaural. Second, participants discriminated three auditory target frequencies that were further apart in pitch and much easier to discriminate than those in our paradigm (250, 1000, and 4000 Hz vs 512 and 575 Hz). We posit that in the case of an easy task, the excitability of relevant areas can be up- and downregulated and correlate with task performance, whereas in the case of a difficult task, the excitability of relevant areas would be maximal and only the inhibition of signal dispersion to irrelevant areas could fluctuate and correlate with performance.
The role of different α frequency sub-bands
The present study highlights specificities of low and high α sub-bands: (1) the peak frequency of the α increase in visual regions was found to be higher (∼13 Hz) than that of the α decrease in auditory regions (∼9 Hz); (2) the α increase in visual regions was found to be significantly correlated with behavior only in the high α frequencies; (3) in the right auditory cortex, a larger decrease in α power during contralateral sound expectation was found in the low α, whereas a stronger increase in α power during ipsilateral sound anticipation was found in the high α. The existence of different sub-bands of the α rhythm is not a new concept (Klimesch et al., 1993, 1999; Sauseng et al., 2005; Groppe et al., 2013), but their functional role is still unclear. Recently, α generators have been observed in each of the cortical laminae (Haegens et al., 2015) in primary sensory cortices. Interestingly, the α activity seems to peak at different frequencies according to the layers, providing neuronal underpinnings for different α sub-bands. In the present study, the differences observed across frequencies can be interpreted differently depending on whether the α peak frequency is considered a “trait” or a “state” variable (Haegens et al., 2014), providing insight into their functional role, as discussed below.
The present results of different dominant frequencies in the visual and auditory regions are in line with evidence from previous studies demonstrating that α peak frequency varies as a function of cortical location (Kawasaki et al., 2010; Haegens et al., 2014). α Peak frequency could be considered as a trait or a “characteristic” variable that changes across individuals (Klimesch, 1999; Başar, 2012) and cortical regions, as found during resting state, in parietal and occipital regions (Haegens et al., 2014). In this light, the differences in α peak frequency reported here might be related to anatomic and physiologic disparities between the visual and auditory cortices. However, one should note that no difference in α peak frequencies was found between the macaque auditory, visual and somatosensory primary areas (Haegens et al., 2015).
Nonetheless, the present findings also show an increase in high α power, when attending an ipsilateral sound, in the right auditory cortex. This is in agreement with the results of Mazaheri et al. (2013), which point to an α activity increase in the vicinity of the auditory cortices centered around higher α frequencies. Therefore, α peak frequency could also be considered as a state variable that would index performance fluctuations, cognitive demands and probably the functional task-relevance of a certain cortical region (Klimesch, 1999; Başar, 2012; Haegens et al., 2014). The present results show that suppressive attentional mechanisms in the non-relevant visual regions are indexed by an increase in high α power, which is correlated with behavior. Moreover, within the right auditory cortex, suppression (downregulation) of brain activity when attending an ipsilateral sound is reflected in the high α sub-band, whereas facilitation of the processing of the expected contralateral sound is indexed in the low α sub-band. Taken together, the present results strongly suggest that different high and low α sub-bands support suppressive and facilitatory mechanisms of anticipatory attention, respectively.
Conclusion
The current study replicates and extends previous findings of the presence of α generators in the auditory cortices and of the right hemispheric dominance of auditory spatial attentional modulations.
Importantly, the present work provides evidence of distinct facilitatory and suppressive mechanisms supporting anticipatory attention. These two attentional mechanisms have distinct timing in task-relevant and task-irrelevant brain areas, are differentially correlated to behavior, and are supported by different sub-bands of the α rhythm.
Therefore, the present findings provide new insight into the role of the peak-frequency in the α band by showing that anticipatory attention is a dynamic process supported by a balance between facilitatory and suppressive mechanisms, which would be mediated in different low and high sub-bands of the α rhythm, respectively.
Acknowledgments
We thank Sebastien Daligault and Claude Delpuech for technical assistance with the acquisition of electrophysiological data.
Footnotes
The authors declare no competing financial interests.
This work was supported by the French National Research Agency (ANR) Grant ANR-14-CE30-0001-01 (to A.B.-C.). This work was performed within the framework of the LABEX CORTEX (ANR-11-LABX-0042) and the LABEX CELYA (ANR-10-LABX-0060) of Université de Lyon, within the program “Investissements d’Avenir” (ANR-11-IDEX-0007) operated by the French ANR.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.