## Abstract

Rhythms are a common feature of brain activity. Across different types of rhythms, the phase has been proposed to have functional consequences, thus requiring its accurate specification from noisy data. Phase is conventionally specified using techniques that presume a frequency band-limited rhythm. However, in practice, observed brain rhythms are typically nonsinusoidal and amplitude modulated. How these features impact methods to estimate phase remains unclear. To address this, we consider three phase estimation methods, each with different underlying assumptions about the rhythm. We apply these methods to rhythms simulated with different generative mechanisms and demonstrate inconsistency in phase estimates across the different methods. We propose two improvements to the practice of phase estimation: (1) estimating confidence in the phase estimate, and (2) examining the consistency of phase estimates between two (or more) methods.

## Significance Statement

Rhythms in the brain can coordinate the activity of individual neurons and communication within brain networks, making rhythms a target for therapeutic interventions. Brain rhythms manifest in diverse ways, appearing with sinusoidal and nonsinusoidal waveforms, multiple peak frequencies, and variable durations. Across this diversity of rhythms, an important feature to characterize is the rhythm’s phase. In this manuscript, we demonstrate the ambiguity inherent in estimating phase from neural data. We propose estimating uncertainty in the phase and comparing multiple phase estimators improves phase estimation in practice.

## Introduction

A rhythm in the local field potential (LFP), magneto/electroencephalogram (M/EEG), or intracranial EEG (iEEG) is a waveform repeating without overlap. A cycle of a rhythm is an expression of the waveform (Cole and Voytek, 2017). Once the waveform is defined the phase can be used as a local time index of a cycle relative to the waveform. The phase of rhythms may have a role to play in perception (Busch et al., 2009; VanRullen, 2016; Fiebelkorn et al., 2018; Helfrich et al., 2018; Herbst et al., 2022), communication across neural populations (Roberts et al., 2013; Fries, 2015; Wodeyar and Srinivasan, 2022), coordination within a neural population (Canolty and Knight, 2010) and coordination between local field potentials and neural spikes (Jarvis and Mitra, 2001; Ray et al., 2008; Lepage et al., 2011; Zanos et al., 2012; Pesaran et al., 2018). Phase has been of critical importance for therapeutic interventions as well (Widge and Miller, 2019), e.g., acoustic stimulation during sleep can induce slow waves (Ngo et al., 2013) and deep-brain stimulation can reduce Parkinsonian tremor (Cagnan et al., 2017).

In practice, phase is estimated with a Fourier analysis on tapered versions of the time series (Lepage et al., 2013), such as by using a Hilbert transform. In this approach, we assume that the time series consists of a stable rhythm accurately described as a harmonic (single frequency) or a narrowband waveform. A truly narrowband signal has spectral power concentrated in a tight frequency band. Any cycle will have an average rate of change of phase equal to the center frequency of the frequency band of maximum spectral power. This assumption determines the waveform and the phase. However, specifying phase for a cycle is difficult when the rhythm is inseparable from noise, the waveform has nonsinusoidal properties (Jones, 2016), or there is distortion of the waveform (such as from filtering a time series; Yael et al., 2018).

Recent evidence suggests that a narrowband model for neural rhythms may be inaccurate. For example, visual cortex γ band activity (30–90 Hz, cf. Buzsaki, 2006) is amplitude modulated and lacks narrowband periodicity (Burns et al., 2010; Xing et al., 2012; Spyropoulos et al., 2022). In this case, stochastically-driven damped harmonic oscillators may better model the γ activity. The discretized version of such a model is an autoregressive process of order two (abbreviated as AR(2); see Kramer, 2022, their Appendix A). Spyropoulos et al. (2022) show that an AR model fits the broadband power spectrum of the γ rhythm better than a narrowband model. AR models for both noninvasive (Franaszczuk and Blinowska, 1985) and invasive (Nozari et al., 2020) electroencephalogram recordings support a broadband representation for brain rhythms. In addition, several studies suggest that brain rhythms can consist of nonsinusoidal waveforms (Ouedraogo, 2016; Cole and Voytek, 2017; Bernardi et al., 2019), exhibit multiple nearby peak frequencies (Tort et al., 2010), and persist for short periods of time (Jones, 2016; van Ede et al., 2018). These observations suggest brain rhythms may be broadband, and in these cases narrowband models lead to phase estimates which do not capture what is intuitively sought by the practitioner.

Phase is relatively defined, and how we estimate phase depends on the interpretation sought by the practitioner. For any simulated time series without an explicit phase parameter, phase must be independently estimated, thus there is no “true phase.” In this manuscript, we consider how different methods define phase when each is applied to simulated signals generated from different stochastic processes, each time series lacking an explicit phase parameter. Through this analysis, we identify moments when phase estimates agree and disagree across methods and examine how each method performs in different simulation scenarios. We consider three distinct phase estimators: (1) the Hilbert transform of a narrowband filtered time series, (2) a state space model of rhythms, and (3) the Poincaré section. We apply each of these methods to three simulated time series: (1) a broadband rhythm in pink noise, (2) an AR(2) process, and (3) a Fitzhugh–Nagumo oscillator. Comparing the cross-method consistency in phase, we show how uncertainty metrics can be used to improve consistency; when any method shows confidence in the phase estimate, the different methods tend to agree. Finally, we illustrate example applications of these methods to estimate coherence and phase-amplitude coupling in *in vivo* data. We show how estimating uncertainty in the phase helps clarify the link between phase and other quantities of interest. Understanding the role of phase in neural dynamics requires us to be aware that phase is not unique and phase estimates may vary depending on the method employed.

## Materials and Methods

### Overview of phase estimation approaches

We apply three methods to estimate phase, each of which makes different assumptions for the rhythm model. The first method we consider consists of bandpass filtering the data and then estimating the phase via the Hilbert transform. While a common method in neuroscience applications (Canolty and Knight, 2010; Zhong et al., 2017; Navas-Olive et al., 2020; Petersen and Buzsáki, 2020), this approach is sensitive to noise and susceptible to waveform distortion. The second method we consider is a state space model with the state, representing the amplitude and phase, modeled by stochastically-driven oscillators. The state space model makes stronger assumptions of the data but is more resistant to model noise when the model class optimally explains the periodicity in the data. The final method we consider is the application of the Poincaré section, which is the most general method considered but is also the most sensitive to noise and user-defined choices. A complication is that different models which fit a brain rhythm equivalently well can still give phase estimates tracking distinct features of a rhythm. In what follows, we use this ambiguity when specifying phase to explore approaches to validate analyses involving phase.

### Finite impulse response filter and Hilbert transform (FIR-Hilbert) method

Fourier-transform based analysis, a wavelet-transform, or a Hilbert transform employ essentially the same procedure (Bruns, 2004) to estimate the phase but with different tapering functions. Critically, all these approaches demarcate a set of frequencies (bandwidth of the narrowband) as relevant to capture the neural signal of interest and a set of frequencies as being irrelevant (for example, filtering to select the α band, defined as 8–13 Hz). This demarcation is intended to separate the signal and noise without relying on a parameter to explicitly model the noise contribution (Donoghue et al., 2020). Given the essential equivalence of the windowed-Fourier, wavelet, and Hilbert transforms (Bruns, 2004), we focus here on the Hilbert transform, i.e., we apply a filter to the data time series to select a frequency band of interest, apply the Hilbert transform, and derive the instantaneous phase (Chavez et al., 2006).

For an interpretable instantaneous phase, where the phase captures the waveform periodicity as opposed to amplitude modulation, the Hilbert transform (or a windowed-Fourier transform or a wavelet-transform) should only be applied when the spectral support (i.e., band for which spectral mass is non-zero) of the amplitude time series does not overlap with the spectral support of the phase time series (Boashash, 1991; Chavez et al., 2006). This is a consequence of the Bedrosian theorem (Bedrosian, 1963; for discussion, see Boashash, 1991) which shows:

Another factor affecting the interpretability of the phase under the Hilbert transform is the parameter set used for the narrowband filter. A practitioner must decide the filter type, filter order, and window duration (for a discussion, see Widmann et al., 2015; for an analysis of different parameter choices, see Zrenner et al., 2020). The practitioner must also choose the filter’s bandwidth; any waveform whose spectral support is wider than the bandwidth of the filter will be altered in the filtered time series.

For all analyses here, unless otherwise stated, we apply a least-squares linear-phase FIR filter of order three times the longest cycle length expected (as in Cohen, 2008; Nadalin et al., 2019). After forward and backward filtering of the data (using the *filtfilt* function in MATLAB), we apply the Hilbert transform to the data to estimate the quadrature component of the signal. We compute the analytic signal using the *hilbert* function in MATLAB, which computes the quadrature component (the imaginary term) and combines it with the in-phase component (the real term). Finally, we estimate phase using the four-quadrant arctan function implemented by the *angle* function in MATLAB.

We estimate the 99% confidence interval for the FIR-Hilbert phase estimator, assuming stationary noise and complex-valued normality for the additive noise to the analytic signal, using Lepage et al., 2013 (Eq. 3.18):

We note that the confidence interval for the phase estimate depends on the filter bandwidth (*B*): the wider the bandwidth, the greater the variance (or lower the confidence) in the phase estimate (Eq. 2). Similarly, a filter with center frequency misaligned with the center frequency of the true signal will increase variance in the phase estimate by increasing the additive noise variance (

### State space model

A state space modeling approach to specify phase (Matsuda and Komaki, 2017; Wodeyar et al., 2021; Soulat et al., 2022) expects the random time series to exhibit periodicity similar to that of a vector-valued AR(2) process. We assume that the analytic signal of a rhythm is a latent, unobserved process, that we can estimate from the observed data. We build this assumption into a state space structure, with the state equation (see Eq. 3 below) representing the latent analytic signal and the observation equation (see Eq. 5 below) representing the observed data. On receipt of each new sample of the observed data, we correct the estimate of the state using a Kalman filter, an optimal strategy when the state update equation is linear and Gaussian (see Schiff, 2012, Chapter 2).

To illustrate this approach, we consider the case of tracking a single rhythm whose central baseband frequency is 6 Hz. The state equation of the univariate analytic signal
x for this case is:

When applying this method to identify brain rhythms and track the associated analytic signal, we need to estimate
Q,
a,

### Poincaré section method

For any K-dimensional dynamical system and associated particle trajectory, we can define a Poincaré section as a K-1 dimensional plane which intersects the considered trajectory. We specify phase relative to the moment of plane crossing. For example, in a one-dimensional system (i.e., the activity varies along a single axis), negative to positive zero-crossings define a Poincaré section, where the origin is the ordinate through which we mark the particle trajectory. As described by Chavez et al. (2006) and others (Rosenblum et al., 1997; Kralemann et al., 2007), phase specified using a Poincaré section depends on the selection of observables (for example, the signal alone vs a time-delayed embedding; Takens, 1981), and how the section is defined. In this manuscript, we consider K = 2 and select zero-crossings to define the phase. From one zero-crossing (i.e., from the voltage traveling from negative to positive values) to the next, 2π radians are uniformly covered by the evolution of the phase. We denote *n*^{th} and *n + 1*^{th} zero-crossings of the time-evolving particle. During [

For a pair of consecutive zero-crossing times, the Poincaré section assigns a phase of 0 at *unwrap* function in MATLAB) assigned by the Poincaré section, add π/2, and wrap (*wrapToPi* in MATLAB) the phase to [−π,π]. This transformation shifts the phase by π/2 and remaps phase to [−π,π]. Unlike the other two methods considered here, the Poincaré section does not explicitly model a source of noise and thus lacks a measure of uncertainty in the phase estimate. Despite this and as we will see in Results, for nonsinusoidal rhythms, the Poincaré section serves as a useful comparison.

### Signal-to-noise ratio

We define the signal-to-noise ratio (SNR) as the extent to which the power within a limited band of frequencies around the central frequency is greater than power at other frequencies. We control SNR in our simulations to exceed 2.5 for clear visual identifiability of a rhythm. To infer the SNR for an oscillation in the data with center frequency in
F, we apply:
*mtspectrumc* function from Chronux (Bokil et al., 2010). In this manuscript, the summation in the numerator includes frequencies from 1 Hz below to 1 Hz above the central frequency, and the summation in the denominator includes frequencies 2 Hz below and 2 Hz above the central frequency in the denominator.

### Estimating phase difference variability

To determine the consistency of phase across methods, we compute the circular SD (Zrenner et al., 2020). This method characterizes the width of the distribution of phase differences between a pair of phase estimators around a mean difference in the phase. To measure the circular SD of the phase differences between phases *M* and *N*, we apply (Zrenner et al., 2020),

We note that this method of assessing phase across methods presumes that any bias (i.e., a phase difference centered at a value other than zero) is consistent across time (Zrenner et al., 2020). We interpret the

### Analysis of *in vivo* data

To illustrate application of the phase estimation methods, we consider an open-source dataset consisting of recordings from two invasive EEG electrodes (iEEG, 400 s sampled at 1000 Hz) in a nonhuman primate anesthetized with propofol (available at http://neurotycho.org/expdatalist/listview?task=75). Propofol anesthesia yields a brain state with synchronized brain rhythms (Lewis et al., 2012), resulting in strong cross-cortical coherence and phase-amplitude coupling relationships, phenomena proposed to coordinate neural spiking within and between brain regions (Jarvis and Mitra, 2001; Lepage et al., 2011; Fries, 2015; Pesaran et al., 2018). Here, we apply different phase estimation methods to analyze the coherence in the slow δ band [0.25,1.5 Hz] between the two recordings, and phase-amplitude coupling between the slow δ and β [13,30] Hz (defined from the classical range provided by Engel and Fries, 2010) bands within each recording (Ma et al., 2019).

To estimate the phase of slow δ activity via the FIR-Hilbert method, we apply an FIR filter to isolate the frequency band of interest (slow δ) and calculate the Hilbert transform to estimate the slow δ phase. We apply the same FIR-Hilbert approach to estimate the amplitude time series for the β band (Cohen, 2008; Nadalin et al., 2019). For the state space method, we first estimate model parameters as well as the central frequencies for each band (slow δ and β) using a subset of the data (10 s). We do so for each recording separately, and then use these parameters for state estimation across all 400 s of data (see Wodeyar et al., 2021, Equations 6–10). For the Poincaré method, we first broadband filter the iEEG between 0.1 and 50 Hz using a fourth order Butterworth filter. Filtering before applying the Poincaré method reduces the influence of higher frequency noise and slow drifts on zero-crossings. The Poincaré method does not yield an amplitude time series, so we assume the amplitude is constant and equal to 1. We note that, in this case, coherence values using the Poincaré method are akin to estimating a phase locking value (Lachaux et al., 1999).

We use phase estimates from each method to estimate the coherence of the slow δ [0.25,1.5 Hz] rhythm between the two recordings. Coherence is an estimate of the consistency in the phase and amplitude at one frequency (or in a single frequency band) between two electrophysiological time series (for a discussion see Pesaran et al., 2018). We compute coherence (
C) for two complex-valued row vectors over time

Phase-amplitude coupling represents how the phase of a low frequency rhythm (here, the slow δ band) coordinates the amplitude of a high frequency rhythm (here, the β band; Hyafil et al., 2015). We note that we cannot apply the Poincaré method for phase-amplitude coupling estimation as this method does not estimate an amplitude time series. We therefore analyze the phase-amplitude coupling within each recording using only the FIR-Hilbert and state space methods. To visualize the phase-amplitude coupling, we divide the phase between 0 to

### Code accessibility

All code developed for and applied in this paper are available for reuse and further development at https://github.com/Eden-Kramer-Lab/UncertaintyInPhase. All code was run on a Dell XPS 15 7590 running Windows 10 and having 32 GB of RAM. A README file is included with the code that has a legend for all files.

## Results

A pure sinusoid is the only function that has an explicitly specified phase parameter, but it is too restrictive a model for most brain rhythms (Cole and Voytek, 2017). For the simulated time series considered below, an optimal specification of phase does not exist. The generative models used to simulate the time series have no explicit phase parameter, thus, phase estimates must be derived as functions of the generative stochastic processes. This precludes a “true phase,” i.e., a single-phase parameter, and error metric. Because in practice only observations of a pure sinusoid have a true phase, we instead consider different methods to estimate phase, each imposing different assumptions about the data. We then compare the relative performance of these different methods. In what follows, we show that when any method indicates reduced uncertainty in the phase estimate, then all three methods tend to agree allowing practitioners to limit the phase estimates required to be considered.

We analyze simulation data from four generative models. First, we examine a case where the generative model has a phase parameter but is unlikely to be a good model of empirical data. Then, we consider three more empirically valid models: a nonparametric model, a simple stochastic parametric model, and a dynamical systems model. The first empirically valid generative model can fit the broadest range of neural data, while the second and third generative models are relevant for smaller subsets of data. Each generative model serves as a potential exemplar of a range of empirical data. For all generative models, we make three assumptions. First, we assume that a rhythm always exists in the data with high signal-to-noise ratio (SNR; with SNR ≥ 2.5 to allow for an identifiable rhythm, see Materials and Methods). Second, we assume that the data consists of a single rhythm of interest. Third, we assume that any temporal variability in the data is derived from the generative model.

### Amplitude modulated sinusoid in pink noise

To consider the scenario of a well-defined, ground truth phase, we simulate a 6 Hz sinusoid with amplitude modulation at a rate of 0.56 Hz, such that every 1.8 s, the 6-Hz sinusoid emerges for 0.9 s (see spectrum to understand the signal-to-noise level in Fig. 1*a* and example trace in Fig. 1*b*). In this signal, a meaningful phase parameter exists only during times when the sinusoid is present. To this signal, we add pink noise (power proportional to 1/f^{1.5}; Fig. 1*a*) with random starting phases for the sinusoid at each frequency bin. We analyze 1000 instantiations of these simulated data, each of duration 10 s. This signal represents the burst-like dynamics of rhythms in brain electrophysiological time series.

We estimate the phase of the simulated 6 Hz rhythm using the Poincaré method, the FIR-Hilbert method, and the state space method (examples in Fig. 1*b*). Visual inspection suggests that the latter two methods produce similar phase estimates when the sinusoid is present. However, during periods when the sinusoid is not present (Fig. 1*b*, red rectangular box), the FIR-Hilbert method computes phase estimates whose trajectory completes full cycles (continuous traversal from *c*); the state space and FIR-Hilbert methods more accurately (median circular SD of 20° and 15°, respectively) track the true phase compared with the Poincaré method (median circular SD of 72°).

The state space and FIR-Hilbert methods both support the calculation of metrics of uncertainty in the phase estimate, credible intervals for the state space method and confidence intervals for the FIR-Hilbert method (see Materials and Methods). To compare these metrics of uncertainty, we calculate the distributions of the widths of the credible and confidence intervals for time points when the sinusoid is, and is not, present (Fig. 1*d*). We find that both measures of uncertainty unambiguously distinguish when a sinusoid is present (the uncertainty is low, FIR-Hilbert median confidence interval width of 10°; and state space method median credible interval width of 13°) versus when a sinusoid is absent (the uncertainty is high, FIR-Hilbert median confidence interval width of 54° and state space method median credible interval width of 51°).

Finally, as we lack knowledge of the true phase in real data and in more realistic simulations of neural activity, we compare the phase estimates between methods as a proxy to judge when multiple characteristics of the signal used to define phase are consistent. Comparisons between all three methods and their metrics of uncertainty reveal when meaningful estimates of phase can be calculated, i.e., when the sinusoid is present. For each pair of methods, we compute the difference between phases and construct the circular SD histogram of the phase estimate differences across time points when the sinusoid is present and when it is absent. We find that when the sinusoid is absent there is a greater cross-method difference in phase estimates, consistent with greater discrepancy in how the phase is defined. We conclude that for an amplitude modulated sinusoid the state space and FIR-Hilbert methods more accurately recover the true phase compared with the Poincaré method.

### Broadband rhythm in pink noise

To match empirical observations in neuroscience (Miller et al., 2009a,b), we consider a phenomenological model of rhythms with increased power over a broad range of frequencies, emerging above 1/f^{α} background activity (i.e., the background activity power decreases with frequency as 1/f^{α}, where ^{1.5}; Fig. 2*a*) with random phases at each frequency. The inverse discrete Fourier transform applied to this complex-valued vector yields a time series of 10 s in duration.

We estimate the phase of the simulated 6 Hz rhythm using the Poincaré method, the FIR-Hilbert method, and the state space method (examples in Fig. 2*b*). Visual inspection suggests that the latter two methods produce similar phase estimates when the rhythm is visible; at these times the credible intervals of the state space method tend to overlap the confidence bounds of the FIR-Hilbert method (i.e., the two shaded curves tend to overlap in Fig. 2*b*, see *t*
≈ 1.5 s). During periods of low amplitude (*c*); phase differences between the state space and FIR-Hilbert methods concentrate near 0, while phase differences with the Poincaré method appear at all angles. Repeating this analysis for 1000 realizations of the generating model and considering the variability of phase differences using circular SD (see Materials and Methods, Estimating phase difference variability), we find consistent results (Fig. 2*d*, left boxplots).

We infer moments of consistent phase using the confidence bounds determined from the FIR-Hilbert method and the credible intervals from the state space method. We designate time segments with both smaller confidence bounds and smaller credible intervals as “high confidence” time segments. These are times when the values are jointly in the bottom quartile (as empirically determined) for both uncertainty metrics. Across high confidence time segments, the phase difference between the FIR-Hilbert and state space methods shows a circular SD of 14.22 ± 0.07° (mean ± SEM) reduced from 30.02 ± 0.13° when considering all data (i.e., times with and without high confidence in the phase estimates). Similarly, the circular SD between the FIR-Hilbert/Poincaré (44.8 ± 0.3°) and state space/Poincaré (45.8 ± 0.3°) methods decreases in high confidence time segments, compared with all data (FIR-Hilbert/Poincaré: 69.5 ± 0.2°; state space/Poincaré: 68.6 ± 0.2°; Fig. 2*d*, right boxplots). However, we note that circular SD remains high when comparing the Poincaré method for both choices of time segments (Fig. 2*d*, red and pink boxplots).

We note that different filtering approaches may impact the FIR-Hilbert results. To investigate this, we also applied to these simulated data a variety of common filtering approaches (as in Zrenner et al., 2020): windowed sinc and least squares finite impulse response filters of second to fifth orders; Butterworth filters of orders 2, 4, and 6; Chebyshev Type I filters of orders 2, 4, and 6; and elliptical filters with 10- and 20-dB attenuations. We maintained a consistent passband across approaches to allow comparability and applied the Hilbert transform to estimate the phase. For each filtering approach, we then computed the phase difference with the primary FIR-Hilbert approach defined in Materials and Methods. Different filtering approaches with the same central frequency and bandwidth nonetheless produce different phase estimates, as expected (Fig. 3*a*). However, the circular SDs between estimates from different filtering approaches are much smaller (median is <20° in 15 out of 16 cases) than the deviation in phase estimates from the FIR-Hilbert method and either of the state space or Poincaré methods (Fig. 3*bi*). Because the filtering approaches focus on capturing a narrowband peak in the simulated data, we find that different filtering methods produce similar phase estimates, consistent with Sameni and Seraj (2017) and Zrenner et al. (2020). Finally, we examine the impact of the center filter frequency on the FIR-Hilbert method. To do so, we adjust the center filter frequency between 3 and 9 Hz, while keeping the bandwidth (4 Hz) and filter approach (finite impulse response) fixed. We find that, as the central frequency of the filter deviates from the correct central frequency, phase differences compared with the primary FIR-Hilbert approach rapidly increase (e.g., the phase difference between the 3 Hz filter and 6 Hz filter exceeds that between the FIR-Hilbert and Poincaré methods; Fig. 3*ci*). This difference is reduced (by ∼20°) when examining the subset of data with confident phase estimates (i.e., when the FIR-Hilbert confidence intervals are small; Fig. 3*cii*). We conclude that during periods of high confidence in the phase estimate (e.g., when the rhythm has higher amplitude, reducing the confidence interval width), the phase is consistently estimated (similar phase estimates across filters) even when the center filter frequency deviates from the true central frequency.

We conclude that, for a broadband rhythm in pink noise, the state space and FIR-Hilbert methods produce comparable results. High frequency fluctuations in the time series produce intervals of rapid zero-crossings in the Poincaré method. Intervals exist with increased agreement between all three methods; these intervals occur when the state space and FIR-Hilbert methods produce confident estimates (i.e., small confidence bounds/credible intervals).

### Noise-driven damped harmonic oscillator

Auto-regressive (AR) processes of order 2 are linear time-series models capable of exhibiting periodicity (Shumway and Stoffer, 2017). AR(2) processes sustaining a rhythm demonstrate spontaneous variability in the amplitude of the rhythm or amplitude modulation (Spyropoulos et al., 2022) and a 1/f^{α} proportional power spectrum. These features are consistent with features that characterize γ band rhythms, as inferred from *in vivo* observations (Burns et al., 2010; Xing et al., 2012; Spyropoulos et al., 2022). Unlike the previous simulation where the additive noise is characterized by a 1/f^{α} power spectrum, here the Gaussian noise is passed through an AR(2) filter to yield spectra that has both a broadband peak centered at 6 Hz and a 1/f^{α} power spectrum; see analytic power spectrum in Figure 4*a* and an example trace in Figure 4*b*.

Visual inspection of the estimated AR(2) phase reveals intervals of agreement and disagreement between the three methods (Fig. 4*b*) as a result of amplitude modulation. For example, when the rhythm has high amplitude (*t *<* *0.5 s), the uncertainty reduces and the three phase estimates are qualitatively similar in the features (e.g., peaks, troughs, and zero-crossings) tracked by the phase. At other times the methods disagree about what constitutes a full phase cycle. For example, at *t* near 1.5 s, the observed time series exhibits two low amplitude troughs. Between these troughs, the state space model suggests no phase cycle occurs (staying near *c*, top row), with a circular SD of 37.9 ± 0.1° (mean and SEM across comparisons of all three methods over 1000 realizations of the simulation; see Fig. 4*d*). Restricting analysis to high confidence time segments, we find increased consistency in phase estimates across methods (Fig. 4*c*, bottom row) and a nearly twofold reduction in the circular SDs between methods (22.9 ± 0.1°, 1000 realizations of the simulation; see Fig. 4*d*).

We conclude that, for an AR(2) simulated time series, all three methods produce comparable results. Compared with the simulation of a broadband rhythm in pink noise (Fig. 2), in the AR(2) simulated time series fewer high frequency fluctuations distort the phase estimates under the Poincaré method. Consistent with the previous simulation, when the state space method and FIR-Hilbert methods both produce confident estimates (i.e., small confidence bounds/credible intervals), the agreement across methods improves. Despite this general agreement between methods, an important difference exists; when the rhythm amplitude is small, the state space method produces nonevolving phase estimates (e.g., near *b*), while phase estimates continue to evolve for the other methods.

### Fitzhugh–Nagumo oscillator

As a third example of a generative model, we consider the Fitzhugh–Nagumo oscillator, an approximation of Hodgkin-Huxley neuronal membrane potential dynamics (Izhikevich and FitzHugh, 2006). The model is defined by the following equations where (in normalized units) *V* denotes the voltage, and *I* denotes the input stimulation to the neural population (Hong, 2011):

We select parameter values that permit chaotic oscillations (as elicited at a “strange attractor”; Izhikevich, 2007), in which the voltage dynamics follow a trajectory consistently around a point in the coordinate space to yield a nonuniform waveform shape (relative to a sinusoid; Cole and Voytek, 2017; Fig. 5*a*,*b*). This rhythm is phenomenologically similar to a slow oscillation measured in the ferret visual cortex LFP (Fröhlich and McCormick, 2010, their Fig. 2A) and in hippocampal LFP in rats (see Ouedraogo et al., 2016, their visual abstract). After simulating data using the *ode45* function in MATLAB, we rescale time such that the rhythm has an ∼1 Hz periodicity to match a slow oscillation timescale.

In this generative model, we have access to all the variables underlying the data, i.e., the full system is known, yet we cannot define a singular phase; the definition of phase depends on the voltage events of interest. Visual inspection of the dynamics reveals waves (e.g., between 6 and 7 s in Fig. 5*b*, green rectangle) for which determining whether a full cycle occurs remains unclear. Two potential ways to define a waveform for this system are: (1) a repetition of the high amplitude wave (*t* near 3.5 s in Fig. 3*b*, orange rectangle), or (2) the dynamical system completes a full cycle in the phase space at any amplitude (i.e., including the small amplitude rotations in Fig. 5*a*, and *t* near 6.5 s in Fig. 5*b*); we note that waves included in (1) are a subset of waves included in (2).

During times of smaller amplitude waves, all methods infer a complete phase evolution, and the state space and FIR-Hilbert methods suggest increased uncertainty (wider credible/confidence intervals). During higher amplitude waves, the phase cycle estimated from the Poincaré method differs from that from the other two methods (Fig. 5*b*, shaded orange rectangle, black trace). The Poincaré method produces a longer phase cycle for high-amplitude waves but the same phase cycle for lower amplitude waves. Additionally, we note that the state space method has wider credible intervals at the start of each phase cycle (phase at *c*,*d*). Differences between methods decrease when examining times with high-confidence phase estimates (FIR-Hilbert and state space: *c*,*d*).

We conclude that for nonsinusoidal signals with multiple waveforms, as exemplified here by the Fitzhugh–Nagumo model, phase estimates from the different methods tend to agree. As in the previous two simulation studies, considering only time-indices where we have high confidence in the phase estimates improves similarity across all methods, including to the Poincaré method. However, an important distinction between methods becomes more apparent in this simulation; the state space and FIR-Hilbert methods yield nonuniform phase evolution, i.e., a nonlinear ramp in the phase (see Fig. 5*b*, orange rectangle). By contrast, the Poincaré method necessarily produces linear increases in the phase. Depending on the desired interpretation of phase, as beginning to evolve only when the voltage begins to rise (state space and FIR-Hilbert methods) as opposed to evolving purely as a function of time (Poincaré method), different methods are appropriate.

### Confidence in phase estimates within method does not necessarily mirror phase differences across methods

In the preceding analyses, we considered both within method measures of confidence and between method differences. We now consider the importance of applying both approaches. To do so, we analyze the simulated data at different confidence/credible interval thresholds. For each threshold, we select the subset of data with confidence/credible intervals below the threshold; when the threshold is low, we are selecting only the data with the most confident phase estimates. For each threshold, we examine the cross-method phase differences for the subsets of data by computing the circular SDs. In some cases, we find a near linear monotonic relationship. For example, for the broadband rhythm in pink noise (Fig. 6*a*), we find smaller deviations between methods at lower thresholds, i.e., estimate agreement across methods increases with confidence in the phase estimate. However, in other cases, we find a more complex relationship between within method confidence and between method differences. For the AR(2) and Fitzhugh–Nagumo generative models, increasing within method confidence does not necessarily decrease the between method phase difference (Fig. 6*b*,*c*); in some cases, the disagreement in phase estimates between methods increases as the within method confidence increases. These results indicate that different methods can yield phase estimates that become more similar, or more dissimilar, with increasing within method confidence. This nonmonotonic relationship and difference across generative models highlights the importance of considering both within method confidence/credible intervals and between method comparisons. Doing so allows a better understanding of the reliability and consistency of phase estimates across methods, as shown here for varying signal-to-noise regimes (Fig. 6*a*,*b*) and nonlinear waveforms (Fig. 6*c*).

### Example *in vivo* applications

For example applications to *in vivo* data, we focus on two common analysis approaches that require phase: coherence (Pesaran et al., 2018) and phase-amplitude coupling (Tort et al., 2010; Hyafil et al., 2015; see Materials and Methods). We analyze 400 s of data from two frontal electrodes with visually identifiable rhythms from an anesthetized macaque monkey (Fig. 7*a*,*b*; for details, see http://wiki.neurotycho.org/Anesthesia_Task_Details). Extensive analyses of these data have been conducted by Ma et al. (2019) where those authors demonstrated long range (between frontal and parietal regions) coherence in the slow δ (0.25–1.5 Hz) and local phase-amplitude coupling between rhythms in the slow δ (0.25–1.5 Hz) and β bands (defined by Ma and colleagues as 15–25 Hz).

#### Analysis of coherence

We apply all three phase estimation methods to compute the coherence in the slow δ band. We compare coherence results from two sets of data: all the data, and a subset of the data with “high confidence” in the phase estimates. We define “high confidence” intervals as times included when thresholding for the lower quantile of confidence bounds for the FIR-Hilbert method or lower quantile of credible intervals for the state space method. Using either the FIR-Hilbert or the state space method leads to a sharp increase in the estimated coherence for the high-confidence phase data (coherence 0.7) compared with all data (coherence 0.3). We note that we cannot compare the coherence estimate using the Poincaré method to estimates from the other methods; the Poincaré method does not estimate an amplitude time series (see Materials and Methods). However, we can compare the coherence estimated using the Poincaré method (see Materials and Methods, Analysis of *in vivo* data) across the two subsets of data. In this case, we define high confidence intervals as times at which FIR-Hilbert and state space phase estimators are simultaneously associated with low statistical uncertainty. We again find that the coherence increases from 0.15 for all data to 0.58 for the high confidence intervals of data (Fig. 7*c*, right column). Examining the coherence estimates for subsets of data across a range of percentile thresholds for the credible/confidence intervals, we find that the coherence increases as the percentile threshold decreases; as confidence in the phase increases, so does the coherence. Because the coherence results can depend on the percentile threshold, we recommend reporting results across the range of percentile thresholds. We conclude that, for all three methods, the estimated coherence increases during time indices of high confidence phase estimates. This conclusion is not obvious; during time indices with high-confidence phase estimates, the phases and amplitudes for the slow δ rhythm could remain randomly distributed between recordings. In addition, we note that isolating the high confidence phase data produced bimodal distributions for the phase differences between electrodes with the FIR-Hilbert and state space methods, a result not clear when analyzing all data (Fig. 7*c*). A bimodal phase-difference distribution (as seen in Fig. 7*c*) with one mode (peak) near 0° could imply a common source acting on both areas (Nunez and Srinivasan, 2006), and another mode near 30° could reflect a delayed signal propagating between electrodes. We conclude that coherence estimates can be better interpreted by using segments of data with confident phase estimates.

#### Phase-amplitude coupling

In a second example analysis of the *in vivo* data, we use only the FIR-Hilbert and state space methods to estimate the phase-amplitude coupling between the slow δ phase and β amplitude (see Materials and Methods, Analysis of *in vivo* data). We note that the Poincaré method is unusable for phase-amplitude coupling estimation because of the lack of an estimated amplitude time series. We expect the low-frequency δ phase to influence the high-frequency β amplitude (Ma et al., 2019). We observe this effect when analyzing all of the data (Fig. 8), and find that considering only subsets of data with high confidence in the phase estimate increases the effect size (Fig. 8*c*); there is greater modulation (ratio between the maximum and minimum mean amplitudes across all phase bins) of the amplitude of the β band rhythm as a function of the slow δ phase when considering only intervals with high confidence phase estimates. However, for small percentile thresholds (i.e., when confidence in the phase estimate is high, and smaller subsets of data are used for analysis), we can visually identify fluctuations in the modulation as less data are included to create the phase-amplitude plot. Like the coherence results, we recommend reporting phase-amplitude coupling metrics across the range of percentile thresholds. These phase-amplitude coupling results are consistent across the FIR-Hilbert and state space methods, demonstrating the robustness of this effect to model misspecification.

## Discussion

In this manuscript, we demonstrated the ambiguity in specifying phase for neuroscientific applications. We used simulations from three empirically valid generative models of rhythms and applied three methods to specify phase. Together, these models and methods encompass a wide range of empirical data and methods applied (Bruns, 2004; Ouedraogo et al., 2016; Cole and Voytek, 2017; Donoghue et al., 2020; Wodeyar et al., 2021; Spyropoulos et al., 2022). We showed that for each model of rhythms, each method produced phase estimates that optimally tracked different characteristics of the rhythms. These phase estimates converge when using only segments of data when the methods suggested greater confidence in the phase estimate. Using this insight, we applied two common analyses, coherence and phase-amplitude coupling, to example *in vivo* recordings. Considering only the subset of data with high confidence phase estimates, we found increases in both the coherence and phase-amplitude coupling, and a bimodal phase-difference distribution for the coherence that was not apparent in the original analysis. We conclude that phase is ambiguous (i.e., application specific) for neural data but using uncertainty metrics reduces the ambiguity.

Every method to specify phase imposes a model of the data time series. In practice, phase estimation is performed without careful consideration of the model (implicitly) selected. For the estimation methods considered here, we propose a decision tree to guide practitioners in choosing a phase estimator. In essence, the method chosen depends on the goal of the analysis. In the ideal case of sufficiently sampled sinusoidal data (Lainscsek et al., 2017) and no noise process, the FIR-Hilbert, state space, and Poincaré methods are all viable to identify and track inferable rhythms. If the data time series periodicity is nearly sinusoidal and noise contamination is low, then both the FIR-Hilbert and state space methods generate similar phase estimates and are not susceptible to high-frequency fluctuations like those seen using the Poincaré method. Alternatively, if there is amplitude modulation, or if the data are nonsinusoidal and noise contamination is high and the goal is to identify phase for periods with high SNR waves, then the state space method is the best option. Indeed, we note that most neural data contain noise and amplitude modulation. Finally, if the data are nonsinusoidal and the noise contamination is low, then the Poincaré method performs better at maintaining a linear ramp for the phase compared with the FIR-Hilbert and state space methods.

When phase estimation confidence is high, we typically assume that the method used to estimate phase describes the data accurately. However, high confidence does not necessarily indicate that the method accurately models the data. For instance, when a large amplitude, sudden change (i.e., a spike) occurs in the voltage time series, the Hilbert transform confidence will be high, but the FIR-Hilbert does not provide an accurate model of the data; using this method, the “phase” of the spike will depend on the filter passband. In this case, cross-method comparisons may indicate high variability across methods despite the high confidence. Low confidence in the phase estimate may occur because of an inappropriate model choice or data that do not capture identifiable rhythms (e.g., because of noise or geometry of sources). In such cases, careful inspection of the data (e.g., additional descriptive analyses to confirm the presence of a rhythm) and consideration of other phase estimation methods may help confirm whether the phase can be defined as desired by the practitioner.

The features tracked by the phase are least dependent on the experimenter’s choice of estimation method when the similarity between methods is high. Across different models of rhythms, we found uncertainty metrics help in identifying segments of time when phase is similar across methods. This similarity across methods does not map linearly to the metrics of uncertainty but depends on the model studied (see Results, Confidence in phase estimates within method does not necessarily mirror phase differences across methods). However, the metrics of uncertainty are specific to the model assumed by each method. The uncertainty metric under the FIR-Hilbert method is a transformation of the amplitude time series for the data passed through a narrowband filter. One can identify instances when this measure of uncertainty does not represent features of the rhythm of interest. For example, when the time series is contaminated with

While we focused on three methods to specify phase, other methods exist. We highlight a few alternatives. Sophisticated approaches to phase estimation exist in the mathematical physics literature, e.g., analysis of the mean first return time (Schwabedal and Pikovsky, 2013; Cao et al., 2020) or slowest decaying modes of the Kolmogorov backward operator (Thomas and Lindner, 2014). As these approaches are debated (Pikovsky, 2015; Thomas and Lindner, 2015), and complementary approaches developed (Engel and Kuehn, 2021), we expect their adoption and application in neuroscience will provide additional strategies to estimate phase from neural data. A recently introduced approach, the generalized phase method (Davis et al., 2022), estimates the smoothed Hilbert-transform derived phase of the largest fluctuation in the data at a given time instant. This method prioritizes the rhythm with highest amplitude present in the time series and assumes only a single rhythm is present at any instant in time. Given the similarity of this method to the basic Hilbert transform and the intent to examine distinct types of phase estimation methods, we chose not to apply this method here. Another method, the masked empirical mode decomposition or masked EMD (Quinn et al., 2021), seeks to identify distinct waveforms in the time series. Such a method could closely track an inferable rhythm while reducing the contamination of noise and create a phase evolution that linearly tracks a nonsinusoidal rhythm. The masked EMF may struggle with identifying a singular waveform if the underlying limit-cycle attractor of the dynamical system measured is chaotic (as seen in the Fitzhugh–Nagumo simulation), leading to wave morphology that is inconsistent across cycles.

A major difficulty in identifying the best phase estimation method is that the quantity defined as phase depends on the goals of the practitioner. Thus, the idea of a “best” phase estimator is not meaningful. Instead, we recommend alternative approaches of comparing estimators, for example by comparing phase from different methods to identify consistency or testing which method of phase estimation shows the highest effect size for the feature of interest. For example, Davis and colleagues (Davis et al., 2022) compare the phase from the FIR-Hilbert method and that under the generalized phase method to the multiunit neuronal spiking rate, demonstrating increased spike-field coherence under the generalized phase method. However, choosing the phase across methods that maximizes the relationship to neuronal activity or behavior may suffer from “voodoo correlations” (Vul et al., 2009): increases in effect size because of random chance in the method selected. Similarly, the removal of data with low confidence should be done cautiously and with a thorough understanding of the possible implications on the overall conclusions, including voodoo correlations. Further, when presenting the results, it is important to clearly communicate the limitations of the analysis and the potential impact of the removed data on the conclusions drawn. We recommend that studies present the results of phase analysis across a continuum of thresholds on the uncertainty metrics to demonstrate the implications of analyzing different subsets of data (as shown in Figs. 7*d*, 8*c*). Doing so may reveal facets of the data, for example, the presence of bi-modality in the phase differences, that may otherwise be obscured by inaccurate phase estimates (Fig. 7*c*). The true efficacy of a method to define phase that tracks the desired feature of population neural activity (such as the peaks or troughs) could be verified by causal approaches, such as stimulation.

Closed-loop approaches to noninvasive (Ngo et al., 2013; Klinzing et al., 2021) and invasive (Fröhlich and McCormick, 2010; Lustenberger et al., 2016; Zrenner et al., 2018) stimulation based on a signal’s phase require methods for real-time phase estimation. For example, auditory stimulation triggered by the phase of a slow-wave oscillation improves memory (Ngo et al., 2013) and thalamic stimulation triggered by a patient’s tremor rhythm suppresses tremor in patients with Parkinson’s disease (Cagnan et al., 2017). The Poincaré and FIR-Hilbert methods do not directly support causal phase estimation, precluding their real-time application. However, several methods have been developed to perform real-time phase estimation using the Hilbert transform by first forecasting data under autoregressive models (Blackwood et al., 2018; Schatza et al., 2022), and a similar approach may support real-time implementation of the Poincaré method. The state space method is easily translated to a real-time framework (Wodeyar et al., 2021). Finally, similar to what we showed in the *in vivo* analyses, measures of uncertainty can help guide closed-loop stimulation to times of greater certainty in the phase estimate (McNamara et al., 2022; Schatza et al., 2022).

We note that in this manuscript we focused on the case of a single rhythm. However, the state space method supports simultaneous estimation of oscillators for multiple rhythms (Wodeyar et al., 2021; He et al., 2022). By applying the state space method to estimate multiple oscillators, we expect the state space method could provide a principled approach to phase estimation across multiple underlying oscillation patterns. The likelihood derived from the state space model can serve as an explicit indicator of which oscillation is dominant at any given time. This approach can help mitigate the risks associated with censoring data based on confidence levels and offer, for example, a more accurate representation of the coupling between spikes and neural signals in the presence of multiple rhythms. In this way, state space models may contribute to a more comprehensive analysis of the data, considering the dynamic presence and contribution of different rhythms, and ensuring that the conclusions drawn are not biased by an overemphasis on high-confidence estimates.

Phase is an important characteristic of neural activity that may be used to index cortical excitability (Freeman, 1961), availability of attentional resources (Busch et al., 2009; Fiebelkorn et al., 2019), and the neural population capacity for communication (Fries, 2015). As such, knowledge of how different phase estimators define phase is critical to address mechanistic questions in neuroscience. In this manuscript, we proposed guidelines to identify the appropriate phase estimator, namely applying estimation methods whose reconstructed phase tracks the relevant characteristics of the rhythm of interest and using metrics of statistical uncertainty. Improved approaches to phase estimation will help us better understand the network implications of rhythms and ascertain the potential for noninvasive interventions that alter cognitive ability (Ngo et al., 2013) or reduce clinical symptoms (Cagnan et al., 2017).

## Footnotes

The authors declare no competing financial interests.

A.W., M.A.K., F.A.M., and U.T.E. were partially supported by National Institutes of Health Grants R01NS110669 and R01EB026938.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.