Abstract
EEG phase is increasingly used in cognitive neuroscience, brain–computer interfaces, and closed-loop stimulation devices. However, it is unknown how accurate EEG phase prediction is across cognitive states. We determined the EEG phase prediction accuracy of parieto-occipital alpha waves across rest and task states in 484 participants over 11 public datasets. We were able to track EEG phase accurately across various cognitive conditions and datasets, especially during periods of high instantaneous alpha power and signal-to-noise ratio (SNR). Although resting states generally have higher accuracies than task states, absolute accuracy differences were small, with most of these differences attributable to EEG power and SNR. These results suggest that experiments and technologies using EEG phase should focus more on minimizing external noise and waiting for periods of high power rather than inducing a particular cognitive state.
Significance Statement
EEG phase is a neural signal related to many moment-to-moment behaviors and has consequently been used to inform brain–computer interfaces and closed-loop stimulation devices. However, prior research and demonstrations have forced the user to be in a single cognitive state, such as rest, making it unclear how EEG phase can apply to the varied contexts that real individuals are placed under. The current study showed that EEG phase can be consistently well predicted across different cognitive contexts after accounting for EEG power and signal-to-noise ratio. These findings represent an important next step for both understanding the cognitive and neurobiological correlates of EEG phase and optimizing EEG-based devices to administer more effective interventions.
Introduction
Cognition relies on cyclical processes, from recurrent cycles that occur every day (Blatter and Cajochen, 2007) to cycles that occur multiple times a second to process sensory information (Madl et al., 2011). Neural oscillations, as measured through electroencephalography (EEG), are related to these cyclical cognitive processes (Ward, 2003; Buzsáki and Draguhn, 2004). This relationship is supported by evidence that abnormal oscillations are a marker of many psychological disorders, such as depression, anxiety, and obsessive-compulsive disorder (Allen and Reznik, 2015; Thibodeau et al., 2006; Velikova et al., 2010).
A commonly used property to characterize EEG dynamics is spectral power. However, power is a time-averaged metric that commonly integrates seconds to minutes of EEG activity. Moreover, power is often compared over even longer timescales, from seconds to years (Gao et al., 2020; Panagiotopoulou et al., 2022; Xiao et al., 2018). In contrast, many cognitive functions occur on millisecond timescales, suggesting that power fluctuations may not capture all the EEG dynamics that relate to moment-to-moment cognitive performance.
Characterizing the instantaneous properties of neural oscillations may allow us to better track cognition and behavior in real time (Lundqvist and Wutz, 2022). One instantaneous property of an oscillation is its phase, that is, the stage in its cycle at a given moment. Phase has been suggested to shape local neuronal communication by aligning the optimal output and input windows of sending and receiving neurons (Buzsáki and Wang, 2012; Canolty and Knight, 2010; Fries, 2005). As communicating neuronal populations pair to become phase locked, a predictable and consistent rhythm of communication occurs in the form of oscillations. The importance of phase coherence and activity is not limited to activity observed in local neuronal populations. It has also been demonstrated in EEG, which captures the macroscopic voltage field generated by thousands to millions of neurons (Nunez and Srinivasan, 2006). Crucially, the microscopic and macroscopic electrical activity is linked, as rhythmic EEG activity predicts local neuronal spiking (Mazzoni et al., 2010; Snyder and Smith, 2015) and cortical excitability (Bergmann et al., 2012; Massimini et al., 2003; Thies et al., 2018). As a result, the instantaneous phase of global EEG activity may mark periods of heightened corticospinal excitability and synchrony.
Accordingly, many behaviors demonstrate cyclical relationships that align with EEG phase. Single-trial analyses of EEG signals demonstrate a relationship between instantaneous phase and sensory thresholding through tactile perception (Ai and Ro, 2014) and luminance detection (Busch et al., 2009). Other studies have also demonstrated relationships between EEG phase and higher-level cognitive functions, such as attention (Dugué et al., 2016; VanRullen, 2018), decision-making (Wyart et al., 2012), and working memory (Siegel et al., 2009). Researchers have since leveraged these EEG phase–behavior relationships in closed-loop stimulation technologies (Tervo et al., 2022; Zrenner et al., 2018) and have explored EEG phase as a potential target in brain–computer interfaces (BCIs; Brunner et al., 2006; Hsu, 2015; Vigué-Guix et al., 2022).
It is important to note that EEG phase–behavior relationships are still under question as recent replication attempts have shown some null results (Keitel et al., 2022; Ruzzoli et al., 2019). Further understanding of the potential links between EEG phases and behaviors will need to come from both theoretical and empirical studies. Theoretically, postulating how brain rhythms are generated and interact with each other within the broader neurobiological context will be critical. Empirically, EEG phase–behavior experiments need stronger and more robust causal demonstrations. Closed-loop BCIs are an attractive candidate for these demonstrations because of the strong limitations they impose on phase–behavior experiments. As behavior is manipulated in real time, a closed-loop BCI must have its preprocessing parameters and features of interest chosen beforehand. Furthermore, closed-loop BCI experiments allow for precise control of stimulus presentation based on brain states, allowing us to draw stronger causal inferences in EEG phase–behavior relationships (Ramot and Martin, 2022; Vigué-Guix et al., 2022).
Real-time phase–behavior studies and phase-targeted interventions depend on accurate and precise instantaneous phase estimates. Better phase estimation improves targeting accuracy and precision, which decreases statistical variance and directly increases the statistical power of these studies (Cohen, 1998). These improvements can lead to multiple theoretical and practical benefits. Accurate targeting could causally reveal how different neuronal populations and processes interact to result in the measured phase. Additionally, these results may be used to enhance phase-dependent closed-loop neuromodulation devices that depend on EEG phase (Brunner et al., 2006; Hsu, 2015; Tervo et al., 2022; Vigué-Guix et al., 2022; Zrenner et al., 2018). Several studies have shown that accurate phase targeting depends on dynamic EEG properties, most notably the instantaneous amplitude and the signal-to-noise ratio (SNR) of the target oscillation, defined as the ratio of powers between the signal of interest and background noise (McIntosh and Sajda, 2020; Shirinpour et al., 2020; Zrenner et al., 2020). These studies consistently found that higher amplitude and SNR gave higher phase prediction accuracy, as stronger signals are easier to detect and predict. However, all these investigations were performed during one cognitive state, whether it was the resting state, or a state induced by a particular cognitive task. For phase-dependent stimulation to be deployed in a variety of contexts, it should either be robust to changes in a cognitive state or designed to account for cognitive-state related factors that may affect predictive performance.
Cognitive states can affect EEG phase prediction accuracy either through direct changes in the frequency of interest (Zrenner et al., 2020), or through indirect changes in the surrounding background noise (He, 2014; Pathania et al., 2021). Furthermore, these two factors are linked, with direct changes in one frequency band, specifically alpha, causing changes in the background excitatory, inhibitory, and high-frequency activity (Iemi et al., 2022; Peterson and Voytek, 2017). There is also evidence that different thalamic and cortical alpha generators are differentially activated and coupled under different task conditions (Halgren et al., 2019; Lundqvist and Wutz, 2022; Saalmann et al., 2012). These highly dynamic changes in EEG activity necessitate more flexible analysis tools that can account for these changes, such as phase or burst analysis (Lundqvist and Wutz, 2022). However, these tools can be further improved by directly incorporating the dynamics of cognitive state.
The most basic distinction in cognitive state is usually made between rest and task states (Buckner et al., 2008; Raichle, 2015). Cognitive rest and task states correlate with neurobiological markers. For instance, one of the oldest and most consistent observations in EEG research is the increased bursts of parieto-occipital alpha power that occur when subjects relax and close their eyes (Rusiniak et al., 2018). Alpha waves have since been associated with the inhibition of task-irrelevant brain regions (Foxe and Snyder, 2011; Klimesch, 2012). Accordingly, parieto-occipital alpha waves are a particularly useful biomarker of rest because they are easily identifiable and sensitive to changes in cognitive state (Li, 2010; Pathania et al., 2021). However, how these oscillatory changes modify phase prediction accuracy has not been quantified.
In this work we compared EEG phase prediction accuracy in the occipital alpha band between rest and task states. To address the varied contexts in which phase prediction algorithms may be applied, we opted to analyze publicly available datasets. These datasets differed in their populations, EEG recording devices, and protocols. These inconsistencies represent some of the environmental variability that phase prediction algorithms would be exposed to in BCIs and research applications. The study used rest and task datasets obtained from two online repositories [OpenNeuro and the Open Science Framework (OSF)] and direct solicitation from researchers. The datasets used tasks aimed at a variety of cognitive domains (vigilance, executive attention, decision-making, and working memory).
We used the Educated Temporal Prediction (ETP) algorithm for phase prediction (Shirinpour et al., 2020), a parameter-free, fast, and accurate phase prediction algorithm. The ETP algorithm uses a short training period where it learns statistical properties of an individual’s EEG waveform, which are then used to make predictions. Using the predictions derived from the ETP algorithm, we compared the effect of rest and task states on EEG phase prediction accuracy. We defined accuracy as the closeness of prediction between the ground-truth waveform and the targeted phase (0°); an accuracy of one indicated that the estimated phase matched the target phase, whereas an accuracy of zero indicated that the estimated phase was at the opposite phase (180° apart).
We intended first to determine which cognitive state led to higher prediction accuracy and second to understand the underlying factors that drove changes in prediction accuracy. We hypothesized that the eyes-closed resting state would have the highest prediction accuracy, as high-SNR alpha waves are most prominent in that state. We further hypothesized that all the differences in prediction accuracy between states would be accounted for by SNR and band power changes, because these two factors would be increased in the eyes-closed resting state.
Materials and Methods
Dataset selection procedures
General criteria
We conducted a systematic search for datasets in the public databases of OpenNeuro and OSF in Spring of 2021 using the search term EEG. We also obtained two datasets not available on these websites through direct solicitation and referrals. At a minimum, we required the datasets to measure scalp EEG as opposed to intracranial recordings or magnetoencephalography to ensure that we were measuring the same physical signal. We also required EEG recordings to have at least 16 channels with full head coverage arranged using the 10–20 or 10–10 international system. Although the focus of this study was on the occipital alpha frequency band, we required complete head coverage to facilitate any future experiments on different regions and frequency bands.
To be considered for the task portion of our analysis, we required that the datasets involve at least one behavioral task that required a response based on stimuli. Trials also needed clear event markers to facilitate epoching. For the resting-state datasets, participants needed to be idle with either their eyes closed or their eyes open while relaxed and fixated on an object in front of them.
Cognitive domain criteria
We included tasks using various cognitive domains. We categorized each task based on the main cognitive function its conditions were meant to manipulate. Although we did not include these categories in our statistical model, we were interested in including a broad variety of task data. We classified each task as follows.
Vigilance
To measure vigilance, we looked for datasets with a psychomotor vigilance task (PVT), a reaction-time-based task that measures participants’ speed in detecting a visual stimulus. We included only PVT datasets that used a simple alerting stimulus (no letter stimuli that engaged language faculties or competing stimuli that required choices). A representative example of a vigilance task comes from Wong et al.’s (2018) experiment, in which participants stared at a fixation cross and reacted as quickly as possible to the appearance of a millisecond counter. A key feature of the PVT is the jittered interstimulus interval (between 2 and 10 s in this task), which prevents anticipation effects.
Executive attention
Executive attention tasks are typically rule based and require individuals to select a correct response among distractors. A representative example of an executive attention task comes from Foster et al.’s (2021) experiment. In the study, participants were initially alerted to a trial with a fixation dot. A cue was then presented, which indicated the location of subsequent bull’s-eyes. Depending on the condition, participants were to covertly monitor this cued location for a contrast change in the bull’s-eyes or overtly monitor the central fixation for a decrease in contrast in the fixation dot.
Decision-making
Decision-making tasks force users to make decisions under uncertainty. The uncertainty can arise from stimulus identity (perceptual uncertainty) or from uncertain outcomes (outcome uncertainty). The current project included dataset(s) testing the latter, where participants were expected to learn the statistical properties of their decision outcomes throughout a task. A representative example of a decision-making task comes from Cavanagh’s (2015) experiment, which consisted of three slot machines whose payouts were scheduled periodically, so that at any one time, a single slot machine would have the best payout. Participants were expected to learn the reward schedule and choose the slot machines that maximized their payout.
Working memory
Working memory tasks involve storing and manipulating information internally. The current project included datasets testing visuospatial working memory, where participants stored and made decisions based on visual information about a stimulus. A representative example of a working memory task comes from Oh et al.’s (2019) experiment in which participants were shown 20 bars, all oriented in the same direction or in varying orientations. After a 1500 ms retention interval, participants saw a test probe and determined whether it was previously present.
Datasets
We obtained a total of 11 datasets. Across all datasets, we had 578 participants. For the resting-state condition, we had a total of 436 participants. For the task condition, we had a total of 178 participants. The 36 participants from the vigilance task appeared in both task and rest datasets.
After excluding some datasets during our EEG preprocessing steps (see below, EEG preprocessing), we had 484 participants and 1,518,674 EEG predictions (156,512 in the eyes-closed condition, 170,528 in the eyes-open condition, and 1,191,634 in the task condition). A full list of the datasets is provided in Tables 1 and 2.
Table 1. Description of task datasets used along with number of participants used for the initial analysis
Table 2. Description of rest datasets used along with number of participants used for the initial analysis
Educated temporal prediction algorithm
An important factor in phase prediction is the speed of the algorithm because real-time algorithms have limited time to complete computations. Prior methods involve forward prediction in the time domain using autoregressive methods (Zrenner et al., 2018) or in the frequency domain using Fourier-based approaches (Mansouri et al., 2017). The current analysis used the Educated Temporal Prediction (ETP) algorithm (Shirinpour et al., 2020) to predict EEG phase. Unlike autoregressive models, ETP is parameter free and does not require as much computational power. Fourier-based approaches also do not require many parameters and are computationally faster than autoregressive models, but they assume periodicity and harmonicity of the underlying signal. The ETP algorithm (Shirinpour et al., 2020) obviates these limitations by using a short training period where it learns statistical properties of an individual’s EEG waveform. The algorithm requires minimal assumptions about the underlying signal and has been shown to perform more quickly and accurately than the other approaches. A very similar approach was used by Vigué-Guix et al. (2022) in a real-time closed-loop BCI context and performed similarly to other estimation approaches. The main difference in their approach was basing predictions on the individual alpha frequency instead of the average interpeak interval, the distance between consecutive peaks.
The first stage of the ETP algorithm is a training phase (Fig. 1). In the training phase, the raw time series data are bandpass filtered to the alpha range (8–13 Hz) by a Hamming-windowed finite impulse response filter (FIR; see below, EEG preprocessing). This filtered time series is used to compute the interpeak interval, which is then used to predict future peaks. Consequently, we split each dataset into two parts (see below, Epoch extraction), with one part used to learn the average interpeak interval (training data) and the other part used to test this interval for prediction (test data).
Diagram of the ETP algorithm. Left, The first phase is a training phase, where the average interpeak interval is learned from resting-state data. Right, In the testing phase, this average interpeak interval can be added to the last seen peak to predict future peaks. We can then compare ground-truth angles computed from the Hilbert transform with the expected angle. The phase angle of a peak corresponds to 0°, so we expect predicted phase angles to be close to zero as indicated by the polar histogram.
The second phase of ETP is the test phase. In the test phase, a 500 ms window is slid over the test data time point by time point. Each window of data is filtered to the alpha range via a brick-wall filter. Local maxima are designated as peaks in this window. If no peaks are found, the algorithm moves on to the next window. The latency of the next peak is then predicted as the time of the last seen peak plus the average interpeak interval learned from the training data.
Once the whole dataset has run through ETP, a comparison ground-truth version of the time series is computed to establish the accuracy of the predictions. Like the procedure in the ETP training phase, the test time series is filtered in the alpha range via a Hamming-windowed FIR filter. The instantaneous phase of the filtered time series is then extracted via a Hilbert transform. Comparing the times predicted as peaks (0°) by ETP to the ground-truth instantaneous phase at those times results in an accuracy distribution.
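For illustration, the core training and test logic can be sketched in MATLAB as follows. This sketch is an illustrative reconstruction rather than the released implementation: variable names (trainSignal, testSignal) are placeholders for continuous single-channel data of several seconds or more, a single Hamming-windowed FIR filter stands in for both the training filter and the per-window brick-wall filter, and the window advances one window length at a time rather than sample by sample.

fs   = 250;                                   % sampling rate after downsampling (Hz)
band = [8 13];                                % alpha band (or an IAF-centered band)
ord  = 250;                                   % illustrative FIR filter order
b    = fir1(ord, band/(fs/2), 'bandpass', hamming(ord+1));
filtTrain = filtfilt(b, 1, trainSignal);      % zero-phase filtering of training data
filtTest  = filtfilt(b, 1, testSignal);       % and of test data (also used for ground truth)

% Training phase: learn the average interpeak interval (in samples)
[~, trainPeaks] = findpeaks(filtTrain);
ipi = mean(diff(trainPeaks));

% Test phase: find the last peak in each 500 ms window and predict the next one
winLen  = round(0.5*fs);
predIdx = [];
for t0 = 1:winLen:(numel(filtTest) - winLen + 1)
    [~, locs] = findpeaks(filtTest(t0:t0+winLen-1));
    if isempty(locs), continue; end           % no peak found: skip this window
    predIdx(end+1) = t0 + locs(end) - 1 + round(ipi);
end
predIdx = predIdx(predIdx <= numel(filtTest));

% Ground truth: Hilbert phase of the filtered test data at the predicted times
truePhase = angle(hilbert(filtTest));         % radians; a peak corresponds to 0
accuracy  = 1 - abs(truePhase(predIdx))/pi;   % 1 = on target, 0 = 180° away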
Although ETP originally learned the average interpeak interval from separately recorded resting-state data, we opted to use intertrial intervals to train the ETP algorithm for our task datasets because all but one of the task datasets lacked corresponding resting-state data. Accordingly, all our task datasets were epoched to separate intertrial intervals from on-trial intervals.
We used the publicly available version of the ETP algorithm (Shirinpour et al., 2020) with slight modifications to use individual peak alpha frequencies (described below) and to run on epoched datasets. (The original algorithm expects continuous data.) Although we wrote our main EEG preprocessing suite in EEGLAB software, any additional filtering within the ETP algorithm relied on the FieldTrip toolbox (Oostenveld et al., 2011), version 20201214, to ensure consistency with the original publicly available version of the ETP algorithm. The ETP algorithm also applies spatial filtering via the Laplacian montage (Gordon and Rzempoluck, 2004), using this spatially filtered signal for prediction. For our Laplacian montage, we chose to center on electrode Pz, with surrounding electrodes Oz, Cz, P4, and P3, as parieto-occipital electrodes exhibit the highest alpha power (Fig. 2).
EEG electrode placement and surface Laplacian. a, Diagram of the 10–10 system used in the current analysis. The electrodes used in the Laplacian montage are colored with the central electrode (Pz) in red, and the surrounding electrodes (Oz, Cz, P4, and P3) in yellow. b, c, Applying the Laplacian montage allows us to see higher spatial frequencies. Scalp topographies in the alpha band (8–13 Hz) are displayed without (b) and with (c) the Laplacian montage applied.
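As a concrete example, a small Laplacian of this kind can be computed from an EEGLAB EEG structure as sketched below. The channel labels follow the montage described above; the simple center-minus-mean-of-neighbors (Hjorth-style) weighting is an assumption about the exact implementation.

labels    = {EEG.chanlocs.labels};                                  % channel labels
centerIdx = find(strcmpi(labels, 'Pz'));                            % central electrode
neighbIdx = find(ismember(upper(labels), {'OZ','CZ','P4','P3'}));   % surrounding electrodes
lapSignal = EEG.data(centerIdx, :) - mean(EEG.data(neighbIdx, :), 1);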
Dataset curation procedures
All EEG preprocessing was done using EEGLAB version 2021.0 (Delorme and Makeig, 2004) on MATLAB R2020b (MathWorks).
EEG preprocessing
EEG preprocessing methods can significantly alter the results (Luck, 2014). To ensure consistency in our preprocessing steps, we included only datasets for which we could obtain minimally processed recordings: either DC recordings or recordings with only a wideband filter (low-pass cutoff no lower than 60 Hz) and/or a notch filter (at 50 or 60 Hz) applied. Datasets with any other preprocessing steps already applied were excluded.
To enable fair comparisons, we downsampled all datasets to the lowest sampling rate in the set (250 Hz) and included only the channel locations common across all the datasets (the 32 standard channels of the 10–20 system). Before ETP analysis, we applied an additional Hamming-windowed FIR bandpass filter from 0.15 to 60 Hz to ensure all datasets contained the same frequency band and to remove DC drift. We did not apply an additional notch filter, as line noise frequencies (50 and 60 Hz) are well above our frequency band of interest.
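In EEGLAB, these harmonization steps amount to a channel selection, a resampling call, and a single bandpass call; the snippet below is a sketch of that sequence (commonChans, the cell array of shared channel labels, is a placeholder, and the exact arguments are assumptions rather than the authors' script).

EEG = pop_select(EEG, 'channel', commonChans);   % keep only the shared channel set
EEG = pop_resample(EEG, 250);                    % common sampling rate of 250 Hz
EEG = pop_eegfiltnew(EEG, 0.15, 60);             % Hamming-windowed FIR bandpass, 0.15-60 Hz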
Bad channel and artifact rejection
Bad channels usually occur because of poor connections between the electrode and scalp. Bad channel rejection is a common step in preprocessing pipelines (Bigdely-Shamlo et al., 2015; Kothe and Makeig, 2013).
Therefore, we included an automatic bad channel rejection algorithm in our pipeline, implemented by the clean_rawdata plug-in in EEGLAB, which uses methods from the BCILAB toolbox (Kothe and Makeig, 2013). We used the default settings to remove channels that (1) had a correlation <0.85 to a reconstruction of themselves based on surrounding channels, (2) had high-frequency noise 4 SDs above the total channel population, and (3) had flat data for >5 s. Although 4 SDs above the population mean is aggressive, it is consistent with the default settings of the clean_rawdata plug-in and has precedent in the BCI literature and other preprocessing pipelines that use 4 or 5 SDs (Bigdely-Shamlo et al., 2015; Chang et al., 2020). We also excluded channels that were not related to brain signals, such as electrooculogram and trigger channels. To reject transient artifacts, we removed epochs that contained an amplitude exceeding three times the SD of the dataset.
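The transient-artifact rule can be written in a few lines. The sketch below assumes epochs is a cell array of single-channel epoch vectors and that the 3 SD threshold is computed over all pooled samples, which is one plausible reading of the criterion.

allSamples = [epochs{:}];                                      % pool samples across epochs
ampThresh  = 3 * std(allSamples);                              % 3 SDs of the dataset
keep       = cellfun(@(ep) max(abs(ep)) <= ampThresh, epochs); % flag epochs within threshold
epochs     = epochs(keep);                                     % drop epochs exceeding it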
Measuring signal-to-noise ratio and individual alpha frequency
Environmental and intersubject variability can significantly alter brain behavior, which can manifest as differences in EEG signals (Farzan et al., 2017). To quantitatively model differences in signal quality across recordings, we decided to include SNR in our statistical models. We used the approach of Zrenner et al. (2020) in computing SNR because of its simplicity and focus on real-time phase estimation. We conducted spectral analysis of the whole recording using Welch’s method (Welch, 1967; 50% overlapping epochs of 2 s duration, linearly detrended, Hann windowed, Fourier transformed, and averaged). We then determined the frequency between 8 and 13 Hz with the highest power, which was designated as the individual alpha frequency (IAF). In the case of more than one spectral peak in the 8–13 Hz range, we designated the IAF as the average of the peak frequencies weighted by their alpha power (Klimesch, 1999). After converting the power spectra into log-log space, the 1/f background noise was estimated by fitting a straight line using points outside our targeted frequency range (0.5–7 and 35–65 Hz). SNR was defined as the difference between the amplitude at the IAF and the value of the fitted 1/f at the IAF on the log scale, in units of decibels.
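A sketch of this computation in MATLAB is shown below. A continuous single-channel signal is assumed, and two details are simplified relative to the text: detrending is left to the default Welch segmentation, and only the single largest alpha peak is used rather than a power-weighted mean over multiple peaks.

fs  = 250;
win = hann(2*fs);                                     % 2 s Hann windows
[pxx, f] = pwelch(signal, win, fs, [], fs);           % Welch PSD, 50% overlap

% Individual alpha frequency: frequency of maximal power between 8 and 13 Hz
aIdx      = f >= 8 & f <= 13;
fAlpha    = f(aIdx);  pAlpha = pxx(aIdx);
[~, iMax] = max(pAlpha);
iaf       = fAlpha(iMax);

% 1/f background: straight-line fit in log-log space outside the alpha range
fitIdx  = (f >= 0.5 & f <= 7) | (f >= 35 & f <= 65);
coeffs  = polyfit(log10(f(fitIdx)), 10*log10(pxx(fitIdx)), 1);
noiseDb = polyval(coeffs, log10(iaf));                % fitted 1/f level at the IAF (dB)
snrDb   = 10*log10(pAlpha(iMax)) - noiseDb;           % SNR = alpha peak minus 1/f fit (dB)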
An additional benefit of this calculation is the ability to model individual variability in peak alpha frequency, which differs significantly across individuals (Klimesch, 1999). The original ETP algorithm used an alpha bandpass filter from 8 to 13 Hz, which assumed a peak alpha frequency of 10.5 Hz for all participants. As using individualized alpha frequencies has shown improvements in model prediction performance and stimulation effects (Klimesch et al., 2003; Thomas and Vinod, 2016), we opted to base our filters on individualized peak alpha frequencies instead of the nonspecific 10.5 Hz frequency.
We were further able to improve signal quality by using SNR and IAF as exclusion criteria. We removed all recordings with a negative SNR, specifically those recordings with a higher 1/f noise level than alpha power at the peak frequency. We also removed all recordings where no spectral peak could be determined. We removed a participant from analysis when none of their recordings passed our criteria. Of our original 578 participants, 94 were excluded, leaving 484 participants for the subsequent analyses.
Epoch extraction
We were interested in three types of epochs—resting intervals, on-task intervals, and intertrial intervals. Although we were primarily concerned with phase prediction accuracy between rest and task, we included intertrial intervals for two reasons. First, our phase prediction algorithm requires rest data to learn the average interpeak interval, but most of the task datasets did not provide pure resting-state data. Using intertrial intervals as a substitute for resting state allowed us to greatly expand the scope of our study. Second, although not part of the main analyses, intertrial intervals constitute a significant portion of on-task time and may warrant further analysis.
For each task dataset, we defined two sets of epochs. On-task epochs were extracted starting at the beginning of each trial up to the end of a trial (e.g., after a response or after the window for responding had closed). Intertrial or pseudo-rest epochs were extracted immediately after the end of a trial up to the beginning of the next trial. We used pseudo-rest epochs as the training set to learn the average interpeak interval of an individual. On-task epochs were used as the testing set to estimate EEG phase prediction accuracy. Before artifact rejection, the number of epochs in each of these sets was equal. The length of the pseudo-rest epochs ranged from 250 ms to 2 s, whereas the length of the on-task epochs ranged from 1 to 7.1 s.
For each resting-state dataset, we split the data in half, with the first half designated as the training set and the second half designated as the testing set. Each resting-state set was then split into 2 s epochs, matching the shortest on-task epoch length from our task datasets.
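A minimal sketch of this split, assuming a single-channel resting recording restSignal sampled at 250 Hz (any trailing partial epoch is simply discarded):

fs       = 250;
epochLen = 2*fs;                                       % 2 s epochs
half     = floor(numel(restSignal)/2);
nTrain   = floor(half/epochLen);
trainEpochs = reshape(restSignal(1:nTrain*epochLen), epochLen, nTrain);   % columns = epochs
rest2    = restSignal(half+1:end);
nTest    = floor(numel(rest2)/epochLen);
testEpochs  = reshape(rest2(1:nTest*epochLen), epochLen, nTest);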
Statistical procedure
At each window step, the ETP algorithm returns a prediction for the timing of the next peak. We compared these predictions with the ground-truth waveform and computed their accuracies. We defined accuracy as the closeness of the ground-truth phase at the predicted time to the targeted phase (0°): accuracy = 1 − |φ| / 180°, where φ is the ground-truth instantaneous phase in degrees (wrapped to ±180°) at the predicted peak time.
Importantly, these predictions are part of a nested structure. Each peak is part of a trial, which is part of an experimental block, which is conducted on an individual, who is part of a dataset, which is separated into resting and task states. As a result of this nesting, the assumption of independence is violated, making a standard linear regression inappropriate (James et al., 2021). We used multilevel modeling with maximum-likelihood estimation to account for this hierarchical structure (Raudenbush and Bryk, 2002). A multilevel modeling approach allowed us to model hierarchical confounding variables as random effects and estimate the condition of interest as a fixed effect (Meteyard and Davies, 2020). Multilevel modeling also does not require sphericity, as it models heteroskedasticity as needed (Rosopa et al., 2013).
Our basic model included cognitive state (eyes-closed rest, eyes-open rest, or task) as a fixed effect, along with random effects for dataset and individual; sketches of all three model specifications are given at the end of this section.
We only considered random intercepts for the dataset and individual. We did not consider random slopes as each dataset and individual were part of only a rest or task dataset. Although there was a single dataset that contained both rest and task data (Wong et al., 2018), we decided to model the rest and task data as distinct datasets with different individuals. This simplification was made to promote consistency with the other datasets that only had recordings of a single type and to reduce the complexity of our linear models.
High-signal band power is critical for reliable phase estimation, as higher power gives higher confidence in EEG measurements. As a result, any differences in EEG phase prediction accuracy may be partially explained by higher signal power. Our secondary analysis therefore included occipital alpha power as a covariate to explain how cognitive state may affect EEG phase prediction accuracy. To measure power, we took the instantaneous power computed from the Hilbert transform at the moment of prediction and included it as a covariate. We then fit an intermediate model to identify the effects of rest or task on phase prediction accuracy with alpha power (and its interactions with cognitive state) as a covariate.
In addition to this instantaneous measure of power, we also included the SNR computed over the whole recording. We then fit a maximal model (Barr et al., 2013) to identify the effects of rest or task on phase prediction accuracy with instantaneous alpha power and overall SNR as covariates.
We conducted multilevel modeling in MATLAB R2020b, using the fitlme function. We categorized regression terms with a p value < 0.05 as significant.
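The three models can be expressed in Wilkinson notation and passed to fitlme as sketched below. The exact terms of the published models (in particular, whether any random slopes beyond the random intercepts were retained) are not fully recoverable from the text, so this specification should be read as an approximation; tbl is assumed to contain one row per prediction with the variables named here.

% fitlme uses maximum-likelihood estimation by default ('FitMethod','ML')
% Basic model: cognitive state only, with random intercepts for dataset and subject
mdlBasic = fitlme(tbl, 'Accuracy ~ Condition + (1|Dataset) + (1|Subject)');

% Intermediate model: adds instantaneous alpha power and its interaction with state
mdlPower = fitlme(tbl, 'Accuracy ~ Condition*Power + (1|Dataset) + (1|Subject)');

% Maximal model: adds recording-level SNR and all interactions
mdlFull  = fitlme(tbl, 'Accuracy ~ Condition*Power*SNR + (1|Dataset) + (1|Subject)');

% Likelihood ratio tests between successive (nested) models, as reported in Results
compare(mdlBasic, mdlPower)
compare(mdlPower, mdlFull)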
Data availability
The data that support the findings of this study are listed in Tables 1 and 2. The tables contain links to the public repositories that hold the data. Only one dataset listed in the tables does not have a publicly available link; it was obtained by direct solicitation from the corresponding author. The code that supports the findings of this study is available on GitHub at the following link: https://github.com/CogNeW/project_eeg_public_dataset.
Results
Descriptive statistics of EEG signals
Across all the predictions made by the algorithm on the datasets, the average phase prediction accuracy was 73.70% (SD = 23.40%). However, accuracy captures only the magnitude of the deviation from the target, not its direction. In terms of phase angles, our algorithm was on average +6.30° away from the target (SD = 47.34°). In the context of a 10 Hz alpha wave, our predictions were 1.75 ms late (SD = 13.15 ms). These results align with previous attempts at phase targeting (Madsen et al., 2021; Vigué-Guix et al., 2022; Zrenner et al., 2018), whose procedures had a mean error of −12° to 5°, with SDs from 25° to 55°. The average instantaneous power was 1.91 μV/Hz (SD = 2.06 μV/Hz). The average SNR was 6.59 dB (SD = 4.37 dB). Split across conditions, the eyes-closed (EC) rest state showed the highest average instantaneous band power (mean = 2.18 μV/Hz, SD = 2.20 μV/Hz), followed by the task state (mean = 1.93 μV/Hz, SD = 2.10 μV/Hz), and then the eyes-open (EO) rest state (mean = 1.57 μV/Hz, SD = 1.59 μV/Hz). Furthermore, the EC rest state had the highest average SNR (mean = 10.36 dB, SD = 4.57 dB), followed by the EO rest state (mean = 9.35 dB, SD = 4.39 dB), and then the task state (mean = 5.71 dB, SD = 3.89 dB). Complete metrics by dataset are provided in Figure 3.
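For reference, a phase error Δφ converts to a temporal error of Δt = (Δφ / 360°) × (1,000 / f) ms at oscillation frequency f; at the 10 Hz alpha frequency assumed above, the SD of 47.34° corresponds to (47.34 / 360) × 100 ms ≈ 13.15 ms, and the mean offset of 6.30° to ≈1.75 ms.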
Accuracy, SNR, IAF, and instantaneous power distributions across datasets. The height of each bar represents the average value, and the error bars represent 1 SD. Bars are filled based on which cognitive state they represent. Abbreviations are used for each dataset name. Table 1 contains the full dataset name and references.
Cognitive status affects EEG phase prediction accuracy
Our basic model indicated that cognitive status affected EEG phase prediction accuracy. The baseline category was EC rest, which had a phase prediction accuracy of 76.3%, 95% CI [74.9%, 77.7%]. The EO rest state reduced accuracy by 2.03% (t(1.51 × 10⁶) = −2.10; p = 0.035), 95% CI [0.139%, 3.92%]. The task state also reduced accuracy by 2.78% (t(1.51 × 10⁶) = −2.91; p = 0.0036), 95% CI [0.90%, 4.65%]. A likelihood ratio test indicated that this basic model explained the data better than a null model [χ²(2) = 34842, p < 1e-5]. These results support our hypothesis that the eyes-closed resting state would have the highest prediction accuracy.
Differences in prediction accuracy remain after accounting for instantaneous power
When including just power as a covariate, the baseline accuracy of EC rest dropped to 64.94%, 95% CI [63.4%, 66.4%]. Power had a significant effect on accuracy (t(1.51 × 10⁶) = 9.98; p < 1e-5). A unit increase in instantaneous power led to a 6.80% increase in accuracy, 95% CI [5.46%, 8.13%] (Fig. 3). After accounting for power, the EO rest state did not significantly affect accuracy (t(1.51 × 10⁶) = −1.66; p = 0.10). However, there was a significant interaction of the EO rest state and power (t(1.51 × 10⁶) = 2.17; p = 0.03), with an effect size of 2.06%, 95% CI [0.20%, 3.92%]. After accounting for power, the task state still had a significant effect on accuracy (t(1.51 × 10⁶) = −2.17; p = 0.03), with an effect size of −2.27%, 95% CI [−4.32%, −0.22%]. However, there was no significant interaction between the task state and power (t(1.51 × 10⁶) = 0.52; p = 0.60). These results run counter to our hypothesis that differences in phase prediction accuracy between cognitive states would disappear after accounting for instantaneous power. A likelihood ratio test indicated that this model explained the data better than the previous model [χ²(7) = 160680, p < 1e-5].
Including SNR and power accounts for all differences among cognitive states
When including power and SNR as covariates, the baseline accuracy of EC rest dropped to 59.56%, 95% CI [57.3%, 61.9%]. The EO rest state did not significantly affect accuracy (t(1.51 × 10⁶) = −0.90; p = 0.37), and the task state was also not significant (t(1.51 × 10⁶) = −0.61; p = 0.54). These results align with our hypothesis that differences in phase prediction accuracy between cognitive states disappear after accounting for instantaneous power and SNR.
Accordingly, power had a significant effect on accuracy (t(1.51 × 10⁶) = 8.47; p < 1e-5). A unit increase in instantaneous power led to a 12.23% increase in accuracy, 95% CI [9.40%, 15.06%] (Fig. 4). SNR also had a significant effect on accuracy (t(1.51 × 10⁶) = 4.40; p = 0.00001), with a unit increase in SNR leading to a 0.47% increase in accuracy, 95% CI [0.26%, 0.68%]. There was a significant interaction between power and SNR (t(1.51 × 10⁶) = −5.29; p < 1e-5), with an effect size of −0.49%, 95% CI [−0.67%, −0.31%]. These results align with our hypothesis that most of the accuracy differences between cognitive states arose from instantaneous power and SNR. A likelihood ratio test indicated that this model explained the data better than the previous model [χ²(20) = 9182, p < 1e-5]. Follow-up contrasts indicated no further differences in phase prediction accuracy among conditions and no differences in the effects of power and SNR on phase prediction accuracy among conditions. Full regression results are shown in Table 3.
Regression results for the full model
The change in phase prediction accuracy because of a unit increase in power. The light blue curve represents an idealized EEG time series with amplitude 1, whereas the darker blue curve represents a time series with a unit increase in amplitude. The red error bars represent the average spread in peak prediction. Note that the prediction distribution for the higher-power time series is tighter, corresponding to a 12.23% gain in accuracy over the baseline time series.
Effects on accuracy are robust to variations in 1/f and SNR computation pipelines
There has recently been an increase in different tools to compute the 1/f aperiodic background activity and SNR (Donoghue et al., 2020; Kosciessa et al., 2020; Whitten et al., 2011; Zrenner et al., 2020). The current analysis has used the approach of Zrenner et al. (2020), because of its simplicity and focus on real-time applications. However, this approach may not accurately fit the background noise, particularly in the presence of noise and nonlinearities, an issue which more modern approaches have tried to address.
We replicated our analysis pipeline but computed SNR over the whole recording using the extended Better Oscillation (eBOSC) pipeline (Kosciessa et al., 2020). eBOSC uses robust regression along with time-frequency analysis based on Morlet wavelets. Although eBOSC was originally intended to identify periods of neural oscillations and characterize SNR over individual episodes, we made slight modifications to compute a single SNR value over the whole recording and used these new values in our analyses without further changes. The individual alpha frequencies from eBOSC and from Zrenner et al.’s (2020) approach were highly correlated (r = 0.61, p < 1e-5), as were the SNR values (r = 0.73, p < 1e-5). The main results remained the same, with significant effects of power, SNR, and their interaction on accuracy.
Effects on accuracy are robust to changes in data distribution
Unequal amounts of training data can potentially affect the prediction accuracies of the conditions of interest. Although the linear mixed effects model is robust to unbalanced data (Schielzeth et al., 2020), the interpeak intervals learned by the ETP algorithm likely depend on the number of training samples. We therefore ran a reanalysis in which both the number of epochs included for training and the average length of each epoch were kept consistent across all conditions. Because the number of task epochs was higher than the number of rest epochs, we removed task epochs uniformly across datasets, maintaining their relative contributions to the final model. Whereas we originally split resting-state data into 2 s epochs for both training and testing, we also modified these lengths to match the averages of the task epochs. Our primary results remained the same, with significant effects of power, SNR, and their interaction on accuracy. Additionally, there was a significant power by task status by SNR interaction (t(9.29 × 10⁵) = 2.31; p = 0.02), with an effect size of 0.31%, 95% CI [0.05%, 0.57%].
Despite matching the number of training epochs, the number of epochs used for testing differed because of additional artifact rejection criteria that were applied during testing. We performed an additional analysis in which we matched the number of test epochs across the three conditions. The main results remained the same, with significant effects of power, SNR, and their interaction on accuracy. There were no additional significant results.
Discussion
The current study examined whether different cognitive states affect EEG phase prediction accuracy and whether these differences are solely attributable to EEG instantaneous band power and SNR. Our primary hypothesis that rest states would show higher accuracy than task states was supported. Our base model, which did not include instantaneous power and SNR, indicated that the eyes-closed resting state had the highest accuracy, followed by the eyes-open resting state and then the task state. It is worth noting that although the accuracy of the three states differed significantly, the accuracies were still close to each other (∼76% for EC rest and ∼74% for the task state). Our secondary hypothesis, that all of the differences in prediction accuracy could be accounted for by SNR and band power, was also supported. After accounting for power and SNR, we did not observe any differences in accuracy because of cognitive state. EEG power had a large and significant effect on EEG phase prediction accuracy. SNR also significantly affected phase prediction accuracy but with a small effect size. The effects of power and SNR showed a negative interaction, where an increase in one component led to a decrease in the effect of the other. However, our models indicated that including both covariates of power and SNR was necessary to explain the differences between cognitive states, as some accuracy differences remained when including only power.
We observed some unexpected patterns in band power between conditions in our analysis. Although the EC rest state had the highest average alpha power and highest accuracy, the EO rest state showed a lower average alpha power than the task states but a higher overall accuracy. This discrepancy is partially accounted for by the fact that the SNR of the EO rest state was almost double that of the task state. The higher SNR of the EO rest state also likely contributed to the significant interaction effect of the EO rest state with power in the intermediate model, which did not include SNR. The lower average alpha power of EO rest datasets was not isolated to a single dataset but was characteristic of the group; the three datasets with the lowest average alpha power were all EO rest datasets. It is counterintuitive that the task state had higher alpha power than the EO resting state, directly contradicting a previous study (Li, 2010) as well as prior findings that external attention should attenuate occipital alpha (van Dijk et al., 2008; Foxe and Snyder, 2011; Romei et al., 2010). Although it is true that external attention may attenuate alpha, there are several explanations why alpha power was higher in our task datasets. Participants may have experienced lapses of attention during parts of the task, which are associated with higher alpha power (Driel et al., 2012; Macdonald et al., 2011; Romei et al., 2010), or unmodeled environmental distractors and individual differences may have contributed to differing alpha powers (Aurlien et al., 2004; Hopman et al., 2020). Nevertheless, the resting-state datasets overall showed higher phase prediction accuracy than the task states. These differences were accounted for by band power and SNR; the differences in accuracy among conditions disappeared when including these covariates.
The rest states had almost double the SNR in the alpha band when compared with the task states. One potential explanation is that these increases in SNR were because of specific increases in alpha band power during the resting state. However, this reasoning would only apply to our eyes-closed resting-state data as the task state had higher average instantaneous power than the eyes-open resting state. Another explanation is that our SNR changes were because of changes in the overall power spectra, caused not just by the alpha band but by neighboring frequencies as well (Klimesch, 2018). For instance, Pathania et al. (2021) observed a steepening of the power spectra on task compared with rest. There is emerging evidence that the pattern in the power spectra depends on interindividual differences, task demands, and cognitive state (He, 2014; Henrie and Shapley, 2005; Klimesch, 2018; Miller et al., 2012). As a result, it is possible that the task datasets we analyzed revealed not only local increases in alpha, but also increases in other frequency bands that would have decreased the alpha SNR, yet still led to higher average instantaneous alpha power.
We also found a significant negative interaction between instantaneous alpha power and SNR, with power having a smaller effect on accuracy when SNR was higher. One explanation for this relationship could be a ceiling effect, where high SNR recordings already have a high baseline of accuracy, lessening the effect of power. The positive effects of power and SNR on accuracy, as well as the negative interaction between power and SNR, align with a prior investigation by Zrenner et al. (2020) into the practical limits of phase estimation methods. In their work, however, SNR was calculated on a per-epoch basis, whereas we calculated SNR on a per-recording basis. Consequently, our results show more variability for the same SNR value; even in low SNR recordings during periods of low alpha power, we saw some predictions yielding high accuracies. This result is likely if the prediction was in a high SNR epoch of a low SNR recording. Nevertheless, when averaged over the whole recording, our SNR effect sizes and directions align with prior results.
In one of our analyses where we equalized the number of training epochs used across conditions, we noticed a significant power by task by SNR interaction. We believe that this interaction is unique to the subset that we have chosen as we removed task epochs to match the number of rest epochs. We found this interaction only in this specific analysis and not in the other three analyses we have done. Two of the analyses were performed with approximately three times more data, indicating that this interaction likely does not generalize for larger samples.
Our results suggest that phase peaks can be consistently predicted across cognitive domains by targeting periods of high power and SNR. Experimenters do not need to induce a particular cognitive state to have high accuracy, at least in the case of occipital alpha, if they are willing to wait for periods of high power and SNR. The importance of signal power and quality is evidenced by the strong effect size of power and the general similarity of phase prediction accuracies between cognitive state conditions. The difference in the average phase prediction accuracy between our best and worst performing dataset was 5.75%, corresponding to a 10.35° deviation in phase. This difference is easily counteracted by waiting for periods of high power as a unit increase in power increases accuracy by 13.12%. Waiting for an appropriate period of high power should not take too long; the difference between the median and upper quartile of power in our datasets was over a unit increase in power, which should equal an average wait time of a few hundred milliseconds for an alpha wave.
These results suggest that phase–behavior studies should systematically consider the effects of power and SNR in their models. Although many current closed-loop BCI models do include a power criterion for stimulus presentation and stimulation (Vigué-Guix et al., 2022; Zrenner et al., 2018), their criteria are often different. For instance, Vigué-Guix et al. (2022) only considered windows that had an amplitude 30% above the median amplitude in the past 10 s, whereas Zrenner et al. (2018) manually adjusted amplitude thresholds to maintain a consistent firing rate. Finalized amplitude thresholds should be reported in all closed-loop studies to encourage replicability, a key issue in electrophysiological brain-behavior studies (Bigdely-Shamlo et al., 2015; Cohen, 2017). Our current results suggest that a single amplitude threshold could suffice across cognitive domains as accuracy values across cognitive domains are similar at baseline and also because of a lack of an interaction between power and cognitive domain. In addition to closed-loop studies, open-loop studies should consider including power as a covariate. Most current phase–behavior studies use a measure of intertrial coherence (VanRullen, 2016), which does not depend on power. As EEG phase and power are intimately related (Zrenner et al., 2020), the use of amplitude-dependent measures of phase coherence may be more appropriate (Yoshinaga et al., 2020).
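As an illustration of the kind of amplitude gate discussed above, the criterion of Vigué-Guix et al. (2022) could be implemented along the following lines. The threshold factor and history length are theirs; the implementation itself is a sketch, with alphaFiltered (the alpha-band signal) and the current sample index t as placeholders.

fs       = 250;
alphaEnv = abs(hilbert(alphaFiltered));            % instantaneous alpha amplitude
history  = alphaEnv(max(1, t-10*fs+1):t);          % amplitude over the past 10 s
gateOpen = alphaEnv(t) > 1.3*median(history);      % act only when 30% above the median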
The current study used the ETP algorithm, which is notable for its efficiency in real-time contexts (Shirinpour et al., 2020), but the results could be expected to extend to other linear prediction techniques as the ETP algorithm shows similar performance to Fourier-based and autoregressive approaches (Mansouri et al., 2017; Shirinpour et al., 2020; Zrenner et al., 2018). Even in the current article, the phase prediction accuracy (mean offset of +6.30°, with SD of 47.34°) is comparable with previous attempts at phase targeting (Madsen et al., 2021; Vigué-Guix et al., 2022; Zrenner et al., 2020). It is likely that our results also extend to single-layered machine learning (ML) models of EEG phase prediction (McIntosh and Sajda, 2020), which are analogous to a linear regression. Interestingly, McIntosh and Sajda (2020) showed that these single-layered ML models performed similarly to more complex multilayered and gated-recurrent unit ML models. Nevertheless, investigation into the generalizability of our results to alternative algorithms is warranted. Of particular interest are dynamical systems approaches, such as state-space modeling (Matsuda and Komaki, 2017; Wodeyar et al., 2021), which have been shown to better track phase in situations involving broadband rhythms and phase resets.
We made several changes to the ETP algorithm to support epoched data analysis, which may affect the internal and external validity of some comparisons. Although Shirinpour et al. (2020) did not consider using epoched datasets for training, the training set was meant to be a representative sample from which to learn stable properties of the EEG waveform. Therefore, we do not expect much performance difference between using epoched and continuous data; although epoching may add more boundaries in the EEG recording and accordingly more edge filtering artifacts, the total length of data captured was comparable to that of a continuous recording, and the ETP algorithm made sure to ignore samples near the edges. Another significant difference in the ETP implementation was that although the original algorithm learned the interpeak interval using pure resting-state recordings, we used intertrial intervals for the task states to learn interpeak intervals, as pure resting-state data were unavailable for many of our datasets. Significant neurocognitive adjustments may occur during intertrial intervals (Compton et al., 2011; King et al., 2010), potentially making the learned average interpeak interval inappropriate or at least not comparable with true rest. However, an exploratory analysis showed that training on these intertrial intervals provided the highest accuracy when compared with training on pure resting-state conditions, possibly because of shifts in alpha peak frequency that occur during different cognitive states (Mierau et al., 2017). Shirinpour et al.’s (2020) study only included resting-state recordings in training and prediction; it is possible that training on the cognitive state most similar to the one used for prediction would yield the highest accuracy, whether it is the resting state or the task state.
The fact that we can use intertrial intervals for training is significant. This interchangeability further emphasizes the stability of EEG phase across conditions. The results also suggest that phase-dependent BCI classifiers and other closed-loop machines can be effectively trained on intertrial intervals, in addition to distinct training sessions and rest periods. Including intertrial intervals could increase the amount of training data and statistical power available to BCI and research applications, as well as allow the phase-prediction model to adjust to any dynamic state changes throughout a recording.
Although we used datasets spanning multiple laboratories and countries, there was an imbalance among the represented conditions with task datasets making up 79.7% of all predictions. Also, certain datasets made up a large percentage of their group; one dataset made up 47.1% of the eyes-closed dataset. Future analyses can not only add more datasets but also make further refinements in dataset classification, specifying periods in a recording when different cognitive systems will be engaged, such as attention and memory. Researchers can also examine rest and task differences in a data-driven fashion, identifying power and topographic differences in different cognitive states across multiple frequency bands. A more hypothesis-driven approach is to use individualized spatial filters on hypothesized generators to get more accurate signals and track their movements in real time (Gordon et al., 2021). A particularly interesting covariate to include is the amount of nonstationarity in the EEG signal (Cao and Slobounov, 2011; Das and Nason, 2016), which makes signals harder to predict through nonsinusoidal distortions and phase resetting (Matsuda and Komaki, 2017; Wodeyar et al., 2021). Using these phase-prediction models, we could identify good and bad periods of phase prediction accuracy, giving greater insight into the cognitive and neurobiological correlates of EEG phase. Future studies, in both research and clinical settings, could use these results to optimize EEG phase prediction accuracy in closed-loop and BCI implementations to administer more effective interventions and stimuli.
Conclusion
Our results indicate that existing methods can track EEG phase accurately across various cognitive conditions and datasets and that accuracy can be enhanced by waiting for periods of high instantaneous band power and SNR. Our base model demonstrated that cognitive state (eyes closed, eyes open, task) affects EEG phase prediction accuracy, with the eyes-closed resting state being the most accurate, followed by the eyes-open resting state, then by the task state. Nevertheless, the absolute accuracy differences were relatively small and attributable to EEG power and SNR. Accordingly, experiments, closed-loop technologies, and BCIs implementing real-time EEG phase prediction should use protocols that minimize the influence of unwanted external noise and target periods of high power for maximum accuracy, as opposed to manipulating experimental and cognitive conditions. We also showed that phase prediction models can be trained on intertrial intervals for prediction during on-task periods. Future studies and BCI implementations may benefit from using a similar approach to obtain a higher number of trials and a more dynamic model. Additional research into other frequency bands, sites of interest, and covariates such as nonstationarities is warranted to understand the cognitive and neurobiological correlates of EEG phase and how they inform the optimization of EEG phase prediction accuracy in closed-loop applications and BCIs.
Acknowledgments
Acknowledgment: We thank the National Institutes of Health for Grant R01NS121219-01 to F.V. and J.D.M. and Starfish Neuroscience for philanthropic donations to F.V. and J.D.M.
Footnotes
The authors declare no competing financial interests.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.