Computational Neuroscience

The demodulated band transform
Introduction
Spectral analysis is a tool of fundamental importance for electrophysiological research, used to characterize both the frequency-dependent properties of physiological signals (Bruns, 2004, Hlawatsch and Boudreaux-Bartels, 1992, Mitra and Bokil, 2008, Schiff et al., 1994) and the pairwise interactions between multiple signals through cross spectra (Baccalá and Sameshima, 2001, Carter, 1987, Granger, 1969, Granger and Hatanaka, 1964, Lachaux et al., 1999, Le Van Quyen et al., 2001, Young and Eggermont, 2009). The most commonly used techniques for computing spectra rely in one way or another on the Fourier transform alongside windowing in the time or frequency domains, known as “windowed Fourier decompositions” (WFD). Alternative WFDs generally fall into two broad classes: those that begin with a bank of bandpass filters, effectively windowing the signal in the frequency domain, and those that estimate a local spectrum after windowing in the time domain. These two classes are, in fact, formally equivalent, a point often emphasized in signal processing literature (Allen and Rabiner, 1977, Bruns, 2004, Hlawatsch and Boudreaux-Bartels, 1992, Le Van Quyen et al., 2001), but practical details of implementation cause them to segregate in a number of important ways, with tradeoffs related to computational efficiency, the introduction of windowing-related artifacts, and invertibility of the underlying transformation (Bruns, 2004). The researcher should understand and consider such tradeoffs when choosing an appropriate WFD for a given problem.
The present work gives a brief tutorial review of these considerations in order to motivate and lead into the discussion of a simple and efficient approach to windowed Fourier analysis, herein called the “demodulated band transform (DBT)”, which, to the authors’ knowledge, has not previously been applied to electrophysiological recordings. To give a foretaste, the logic of the DBT can most easily be grasped by comparing it to the standard short-time Fourier transform (STFT), which is a common and efficient means to a time-frequency decomposition. The STFT is normally computed by segmenting a signal into multiple overlapping time windows and then applying the fast Fourier transform (FFT) to each segment to obtain a local spectral decomposition (Fig. 1B). DBT takes a very similar approach, except that it comes at the problem from the frequency domain: first the FFT is applied to the whole signal, then the result is segmented into overlapping finite frequency intervals, from which the final decomposition is obtained by applying the inverse FFT separately to each interval (Fig. 1A). As with STFT, the result is a time-frequency decomposition whose coefficients describe the signal energy and shape within some local window. The essential difference between the two approaches has to do with the shape and duration of the respective windows. While time-segmented STFT explicitly applies a window in time, the time window used by DBT is an implicit consequence of the frequency window. In what follows, we consider what this means for the resulting spectral estimate and why one might prefer one approach to the other.
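The structural contrast between the two approaches can be sketched in a few lines of numpy. This is an illustration only, with our own function names and an untapered, rectangular segmentation; a practical implementation tapers the segment edges:

```python
import numpy as np

def stft_coeffs(x, win_len, hop):
    """Time-segmented STFT: cut the signal into (overlapping) time
    segments, then apply the FFT to each segment."""
    segs = [x[i:i + win_len] for i in range(0, len(x) - win_len + 1, hop)]
    return np.array([np.fft.fft(s) for s in segs])

def dbt_coeffs(x, band_len, hop):
    """Frequency-segmented counterpart: FFT the whole signal, cut the
    result into (overlapping) frequency segments, then apply the inverse
    FFT to each segment. Each row is one demodulated, downsampled band
    spanning the full recording duration."""
    X = np.fft.fft(x)
    segs = [X[k:k + band_len] for k in range(0, len(X) - band_len + 1, hop)]
    return np.array([np.fft.ifft(s) for s in segs])
```

With non-overlapping frequency segments (`hop = band_len`), the band coefficients partition the DFT exactly, so total signal energy is preserved up to a fixed scale factor.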
Although the essential details of this technique were described many years ago by Bingham et al. (1967), it appears to have been overlooked in electrophysiological literature. The method has two main selling points, which motivate the present work: first, its computational efficiency is very close to that of conventional STFT, which is at least as good and usually much better than alternative methods such as bandpass filtering with Hilbert transform (BPHT), continuous wavelet transforms (CWT) and methods that employ multitapering, and second, it has very low susceptibility to spectral leakage, which can be a significant drawback of standard time-segmented STFT. We describe how these advantages may be useful, concluding that (1) DBT provides an efficient representation of band-limited data giving (2) a readily interpreted windowed Fourier time-frequency decomposition which (3) can be used to efficiently compute spectral and cross-spectral statistics on long-duration multi-channel recordings. DBT also brings secondary advantages, such as more flexible handling of bandwidths, which may vary with frequency, making it, for example, a simple matter to compute an efficiently-sampled continuous wavelet decomposition. It also effectively compresses band-limited data by downsampling to a rate that matches the inherent resolution of a given bandwidth, while preserving basic spectral properties of the signal in a directly usable form.
The remainder of this work is organized along two parallel tracks; the first continues a schematic overview of the DBT for readers interested in the general rationale and flavor of the technique, while the second lays out the method in greater detail. Readers knowledgeable about signal processing theory may wish to skip through some of the introductory sections, while those interested in the more schematic overview may want to skip some of the technical discussion found in Section 2.
Perhaps the most obvious reason to window a signal before applying a Fourier transform is to limit the spectral decomposition to a given time interval, thereby gaining information about spectral content around some specified moment in time. Repeated over multiple time intervals, this application of windowing results in a “time-frequency decomposition (TFD)”, used for example to show how a signal's spectrum changes over time. Familiar techniques for obtaining time-varying spectra, such as continuous wavelet transforms (CWT), short-time Fourier transforms (STFT), and bandpass filtering followed by Hilbert transform (BPHT) all belong to the larger family of WFDs. However, the relevance of WFDs does not stop at time-frequency analysis, as windowing has multiple uses beyond localizing spectral energy in time; these include suppressing artifacts and improving the statistical properties of stationary (non-time-varying) spectral and cross-spectral estimates, as well as filtering. WFDs are therefore a basic element of stationary spectral estimators, such as Thomson's multitaper (TMT) and Welch's window-overlap segment averaging method (WOSA), which operate, in essence, by averaging together multiple separate windowed power spectral estimates. They are also important for measures of frequency-dependent correlation between two signals, such as coherence and phase locking, which require observing how the signals covary over multiple samples.
Broadly speaking, because data in the real world can only be observed over some finite period of time at finite resolution in time, all real-world applications of Fourier analysis subject a signal at least to an observation window and some manner of band-limiting filter (Slepian, 1976), meaning that WFD is more-or-less synonymous with any practical application of Fourier analysis, encompassing essentially all Fourier-based techniques. In spite of their diversity, all of these techniques share fundamental considerations related to windowing. Differences between them come down to the properties of windows and the manner in which the transform is computed. The choice of window function determines important statistical properties of the spectral estimate including bias – the degree to which the spectrum deviates systematically from the “true” spectrum – and error variance, the degree to which the estimate is affected by random noise. For example, the smoothness of the spectral estimate, or “broadband bias,” is governed by the scale of the data window in time: the shorter the window duration the smoother the estimate and the lower its ability to resolve details in the signal spectrum (see Fig. 2). On the other hand, a smoother estimate will often result in lower statistical noise, creating a tradeoff between smoothing bias and error variance.
Matched for smoothness, different techniques for computing spectra often (and ideally) yield similar results (Bruns, 2004, Le Van Quyen et al., 2001), but at times they may also diverge in ways that are not subtle, if not with respect to the estimate itself, then to the speed and efficiency with which it can be obtained. Three primary considerations therefore influence the choice of WFD: the nature of spectral artifacts introduced by windowing, statistical properties of the resulting spectral estimate, and the computational cost of obtaining it. Following a tutorial review of a number of essential concepts, we will show how the DBT performs favorably by each of these measures.
As just noted, the choice of window for a WFD determines the bandwidth of the decomposition and the smoothness (broadband bias) of any resulting spectral estimate. Because the goal of spectral analysis is usually to resolve particular spectral features of interest, the choice of bandwidth is the first and most important consideration in choosing between WFDs. Common to all WFDs is a tradeoff between smoothing in time and frequency, according to which resolution in time must be inversely related to resolution in frequency. A narrow time window gives high specificity in time at the price of low frequency resolution, reflected in the smoothness of the spectral estimate. Similarly, a very narrowband filter gives high frequency resolution at the price of low time resolution (Fig. 2).
Because the spectral analysis of noisy electrophysiological signals frequently involves some form of smoothing or averaging over time intervals to obtain a statistically consistent estimate (Welch, 1967), a corollary of this tradeoff is that greater frequency resolution entails fewer effective data points for such an average, meaning greater variance of the spectral estimate. Greater frequency resolution therefore comes at the cost of both time resolution and statistical power, another important consideration in choosing optimal bandwidths. The statistical tradeoff also holds true for WFDs, such as Thomson's multitaper, that do not explicitly resolve the spectrum in time. For these, wider bandwidths still involve a larger effective sample size, hence lower statistical noise (Walden, 2000).
In some cases, the goal is to separate the signal into several components occupying different bands, whose bandwidths may tend to vary as a function of frequency. For example, it is often natural for bandwidth and center frequency to scale with each other, as both scale directly with dilations or contractions in time. CWT and related wavelet methods conform naturally to this relationship by decomposing a signal into logarithmically spaced bands (Le Van Quyen et al., 2001). One advantage of windowing in the frequency domain is that bandwidths can be adjusted more flexibly as a function of frequency, whereas for time-segmented STFT bandwidth is normally limited to a fixed value.
The tradeoff between time and frequency resolution is a consequence of the fact that windowing in time smears energy in the spectral domain and vice versa. This happens because the spectrum of the windowed signal is given by the convolution of the separate spectra of the signal and window function; in other words, the signal spectrum becomes “blurred” by the spectrum of the window. In addition to broadband bias, this effect leads to “spectral leakage,” the spread of signal energy into frequency bands far removed from its original band. Fig. 3 illustrates the reason for this effect. The severity of spectral leakage depends on the window: windows that contain very sharp boundaries, implying a broad spectrum, produce the worst distortion, and so it is common practice to reduce the extent of spectral leakage by applying windows with tapered edges whose spectra are relatively more concentrated within a narrower bandwidth. Tapering reduces the energy in broadly extended side lobes in exchange for lower frequency resolution of the main lobe, in other words, increased broadband bias, which reflects the loss of information at window edges (Harris, 1978). Because it determines the frequency resolution of the analysis, broadband bias is tolerable so long as it allows the frequency components of interest to be resolved. Spectral leakage, in contrast, creates spectrally complex and extended artifacts (see Fig. 3), which may at times mask or mimic phenomena of interest. For this reason, it is common to sacrifice local resolution in order to suppress spectral leakage by using a window with smoothly tapered edges.
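The effect is easy to demonstrate numerically: a sinusoid whose frequency falls between DFT bins leaks energy across the whole spectrum under a rectangular window, and far less under a tapered (Hann) window. The tone frequency, window length, and “far from the band” threshold below are arbitrary choices for illustration:

```python
import numpy as np

fs, T = 100.0, 10.0                       # assumed sampling rate and duration
t = np.arange(0, T, 1 / fs)
x = np.sin(2 * np.pi * 10.25 * t)         # tone halfway between DFT bins

def power_spectrum(x, window):
    """One-sided power spectrum of the windowed signal."""
    w = window(len(x))
    return np.abs(np.fft.rfft(x * w)) ** 2

rect = power_spectrum(x, np.ones)         # rectangular (no taper)
hann = power_spectrum(x, np.hanning)      # smoothly tapered edges

freqs = np.fft.rfftfreq(len(x), 1 / fs)
far = np.abs(freqs - 10.25) > 2.0         # bins far outside the tone's band
leak_rect = rect[far].sum() / rect.sum()  # fraction of energy leaked far away
leak_hann = hann[far].sum() / hann.sum()
```

The tapered window concentrates the tone's energy near its true frequency, at the cost of a wider main lobe (more broadband bias).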
In considering artifacts introduced by spectral estimators, it is worth distinguishing two sources of windowing bias. Strictly speaking, Fourier analysis assumes a signal of infinite duration, yet signals in the real world can only be observed for a finite amount of time. This global observation window imposes, in effect, a sharply bounded square window on the data, creating the first source of bias. On top of the global window, WFDs apply an analysis window, creating a second opportunity for spectral distortion. One strategy to mitigate the first source is simply to obtain a longer recording: within a fixed bandwidth, the longer the observation window, the smaller the relative contribution of artifactual energy at the window edges to total signal energy. If the bandwidth of interest is wide relative to the duration of the signal (BW ≫ 1/T) then distortion caused by the global observation window will often be negligible. For electrophysiological recordings, this condition very often holds, as physiologically relevant bandwidths, which may be on the order of 1 Hz or greater, are broad compared to the typical duration of a recording, which may be upwards of several hundreds or thousands of seconds, easily satisfying BW ≫ 1/T.
The second source of bias depends only on the analysis window and is not affected by the duration of the recording. Bias of this type is built into the estimator itself and will manifest regardless of how the global signal is treated. In particular, any method, such as time-segmented STFT, that applies an analysis window with duration less than the observation window must introduce spectral leakage. Although segmenting a signal into shorter time windows gives a computationally efficient and intuitive approach to estimating time-varying spectra, in many settings, the spectral leakage inherent in the estimator creates a significant drawback. This fact has curtailed the usefulness of time-segmented STFT as a generally reliable technique and accounts in part for the popularity of computationally more expensive alternatives such as BPHT and CWT. These methods may minimize spectral leakage by applying analysis windows with smoothly decaying tails, which can extend over the entire duration of the recording.
One of the most effective windowing techniques for suppressing spectral leakage, Thomson's multitaper (TMT), formulates the issue as an optimization problem solved by the singular value decomposition of a matrix (Thomson, 1982). The optimized quantity is the proportion of signal energy leaked outside a specified bandwidth, and the solution to the problem depends on the product of this bandwidth and the duration of the observation window, the “time-bandwidth product,” which is a parameter chosen by the user. The eigenvectors returned by the decomposition give a series of time windows that optimize the tradeoff between spectral leakage, smoothing bias and loss of information; these windows are known as the discrete prolate spheroidal sequence (DPSS). The method's familiarity to electrophysiologists owes perhaps in large part to its implementation in the Chronux toolbox (Mitra and Bokil, 2008). To highlight some of the advantages of DBT we will compare it to TMT, pointing to similarities in how the two methods approach the problem of spectral leakage.
TMT and related multitapering techniques offer a principled framework for addressing spectral leakage (Walden, 2000), however, their effectiveness still depends on the relationship between the bandwidth resolution and window duration, so that the user must put some thought into choosing these parameters. For example, if one specifies a resolution of 1 Hz over a 1 s observation window, TMT will use a single sharply bounded taper, which, despite being optimal for the specified bandwidth and duration, does little to suppress spectral leakage. On the other hand, TMT estimates with a resolution of 1 Hz over a recording of several minutes duration may achieve excellent suppression of spectral leakage but at the expense of a computationally prohibitive number of tapers to retain a sufficient amount of information in the signal. It is therefore not uncommon to find TMT applied to short time intervals in what is effectively a hybrid of time-segmented STFT and TMT. In these cases, the analysis window duration and TMT bandwidth are two separate parameters that must be selected by the user, both of which affect susceptibility to spectral artifacts. It is therefore important for the user to recognize that multitapering, by itself, does not give a free pass to ignore spectral leakage, rather the choice of window duration and bandwidth setting requires some understanding of the relationship between the two.
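A minimal multitaper power-spectral estimate can be sketched with SciPy's DPSS window generator, using the common rule of thumb K = 2NW − 1 for the taper count (the function name and parameter choices here are ours; this assumes SciPy is available):

```python
import numpy as np
from scipy.signal.windows import dpss

def tmt_spectrum(x, half_bw_hz, fs):
    """Sketch of a Thomson multitaper power spectrum. NW is the
    time-(half-)bandwidth product; K = 2*NW - 1 tapers are averaged,
    following the usual rule of thumb."""
    NW = half_bw_hz * len(x) / fs
    K = max(int(2 * NW - 1), 1)
    tapers = dpss(len(x), NW, K)                          # shape (K, N)
    eigspecs = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2
    return eigspecs.mean(axis=0)
```

For example, a 1 Hz half-bandwidth over a 10 s window sampled at 100 Hz gives NW = 10 and K = 19 tapers, consistent with the taper counts discussed below.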
As with any technique, the usefulness of multitapering depends on the goal of the decomposition: while TMT provides only a stationary estimate of the power spectrum within a given window, other WFDs lend themselves directly to non-stationary estimates. A common strategy to obtain time-varying power-spectral estimates with TMT uses the hybrid TMT-STFT approach, applying TMT to shorter-duration windows to construct a spectrogram (Thomson, 1990). Finally, because TMT returns a power-spectral estimate, it is not an invertible transform, whereas BPHT, STFT and CWT return coefficients from which it will often be possible to recover the original signal. This difference is important in applications to adaptive filtering.
Computational efficiency often varies greatly between different WFDs, depending on context. While one approach may not be practical in one setting, another might require a comparatively trivial number of calculations to yield essentially the same information. The computational cost of filter-bank methods such as BPHT and CWT, as most commonly applied, scales with the number of bands, so that better frequency resolution comes at greater cost. For the power spectral estimates provided by TMT, the tradeoff is opposite: computational cost increases with decreasing frequency resolution because an estimate over a wider bandwidth averages over a greater number of tapers. For example, a standard TMT power spectral estimate over a 10 s window with a frequency resolution of 0.1 Hz requires a single taper, but 19 tapers for 1 Hz resolution, 199 for 10 Hz, and so on. On the other hand, for a signal sampled at 200 Hz, a stationary power-spectral estimate obtained by averaging BPHT over time requires 1000 filter bands for 0.1 Hz resolution, 100 bands for 1 Hz resolution, 10 bands for 10 Hz resolution, and so on. It clearly makes little sense in this example to use TMT for a spectral estimate with 10 Hz resolution when nearly equivalent information can be obtained much more efficiently with BPHT, and it makes as little sense to use BPHT in preference to TMT for an estimate with 1 Hz resolution. Computational advantages in favor of one approach over the other therefore depend on signal duration and target bandwidth of the spectral estimate.
Windowed Fourier decompositions typically encode the signal less efficiently than the starting time-domain representation, with more samples occupying more memory than the original signal. Such oversampling is the root cause of the inefficiencies of the foregoing examples. Moderate oversampling is often useful to make the spectral representation more complete and to give it a visually appealing smoothness, but as the example illustrates, TMT and filter-bank methods may at times drastically oversample to a degree that achieves little beyond poor computational performance. Inefficiency of this sort becomes especially burdensome when calculating pairwise statistics within large arrays of recordings, such as coherence or phase locking, as then the number of computations also scales with the square of the number of channels. It is amplified still more when measuring cross-frequency interactions between channels, such as phase–amplitude coupling, for which each pairing of channel and frequency adds to a combinatorial growth in pairwise interactions (Dvorak and Fenton, 2014). Levels of oversampling that are merely inconvenient when handling one channel may become prohibitive in these applications.
In contrast to TMT and BPHT, methods that segment the signal into shorter time windows, such as time-segmented STFT and windowed overlapping segment averaging (WOSA) (Welch, 1967), are computationally efficient and pose no similar tradeoff between bandwidth and efficiency. But as already noted, spectral-leakage artifacts often make these methods inadmissible. We will show that this tradeoff is avoided with DBT, which achieves minimal susceptibility to spectral leakage and computational efficiency by segmenting the signal in frequency rather than time. To explain how this works, we next review properties of complex-valued signals.
Fourier-based spectral estimation creates a representation of the signal as an array of complex coefficients, each of which is composed of a real part and an imaginary part, c = a + ib, where i is the imaginary unit, for which i² = −1. The end result of Fourier analysis is often a real-valued power spectrum, and for many researchers the intermediate complex spectral representation may remain a mathematical detail, not given much thought. Yet this representation, its reasons and properties, are central to understanding any Fourier-based technique, making a brief review here worthwhile. To begin, it is useful to think of complex values as vectors in a two-dimensional plane with one axis giving the real part and the other the imaginary part (Fig. 4A). The squared length of the vector, |c|² = a² + b², is found by multiplying c with its complex conjugate, c* = a − ib, the value with reversed sign in the imaginary part:

|c|² = cc* = (a + ib)(a − ib) = a² + b²

Through some basic trigonometry, a complex number may be described by its magnitude and the angle it makes with the real axis, or “phase angle,” ϕ:

c = |c|(cos ϕ + i sin ϕ),  ϕ = arctan(b/a)

Taking the product of two complex numbers gives a third complex value whose magnitude is the product of the original magnitudes and, perhaps less obviously, whose phase is the sum of the starting phase angles. This can be seen by applying basic trigonometric identities as follows:

c₁c₂ = |c₁||c₂|[(cos ϕ₁ cos ϕ₂ − sin ϕ₁ sin ϕ₂) + i(sin ϕ₁ cos ϕ₂ + cos ϕ₁ sin ϕ₂)] = |c₁||c₂|[cos(ϕ₁ + ϕ₂) + i sin(ϕ₁ + ϕ₂)]

Simplifying the trigonometric expressions above with Euler's formula, e^{iϕ} = cos ϕ + i sin ϕ, makes the relationship still more plain:

c₁c₂ = |c₁|e^{iϕ₁}|c₂|e^{iϕ₂} = |c₁||c₂|e^{i(ϕ₁ + ϕ₂)}

This additive property of phase accounts for why complex numbers appear so prominently in the analysis of oscillatory signals: the evolution of a periodic signal can be described naturally by the hand of a clock, which returns to its starting position after winding through 360 degrees of angle. Such traversal of angular space can be described very efficiently through the products of complex values.
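These identities are trivially checked numerically (the particular values are arbitrary):

```python
import numpy as np

a, b = 3.0, 4.0
c = a + 1j * b
power = (c * np.conj(c)).real      # cc* = a^2 + b^2 = 25
phi = np.angle(c)                  # phase angle, atan2(b, a)

# product of two complex numbers: magnitudes multiply, phase angles add
c1 = 2.0 * np.exp(1j * 0.3)        # magnitude 2, phase 0.3 rad
c2 = 3.0 * np.exp(1j * 0.5)        # magnitude 3, phase 0.5 rad
p = c1 * c2                        # magnitude 6, phase 0.8 rad
```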
In particular, the evolution of phase over time can be understood as a series of multiplicative increments, which leads naturally to an exponential function of time, e^{iϕ(t)}. In the simplest case, a sinusoid with constant frequency and amplitude, corresponding to a clock hand moving at constant angular speed, ω, we have ϕ(t) = ωt. More generally, a band-limited complex signal can be considered as the product of a real-valued envelope, A(t), and an evolving unit-length phase vector whose angle advances over time at some rate, the “instantaneous frequency,” which varies around some central tendency, or “carrier frequency,” ωc:

z(t) = A(t)e^{i(ωct + Δϕ(t))}    (4)

Here, the time derivative (d/dt)Δϕ(t) gives the deviation of the instantaneous frequency from the carrier frequency, ωc.
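A synthetic example (all parameter values below are arbitrary choices for illustration) shows how the envelope and instantaneous frequency fall out of this representation:

```python
import numpy as np

fs = 500.0
t = np.arange(0, 2, 1 / fs)
A = 1 + 0.5 * np.cos(2 * np.pi * 1 * t)       # slowly varying envelope A(t)
fc = 40.0                                     # carrier frequency (Hz)
dphi = 0.2 * np.sin(2 * np.pi * 2 * t)        # slow phase deviation
z = A * np.exp(1j * (2 * np.pi * fc * t + dphi))

# envelope and instantaneous frequency recovered from the complex signal
env = np.abs(z)
inst_f = np.diff(np.unwrap(np.angle(z))) * fs / (2 * np.pi)
```

The recovered instantaneous frequency wanders around the 40 Hz carrier by an amount given by the derivative of the phase deviation.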
Real-valued signals can be decomposed rather trivially as the sum of two complex-valued signals whose imaginary parts have opposite sign, therefore canceling under summation. Fourier-based techniques apply this trick to decompose the signal into a sum of narrow-band components, whose real and imaginary parts are related by a 90-degree shift of phase (Fig. 4). The bookkeeping needed to convert between a real signal and its complex Fourier representation is conveniently handled by allowing frequency to range over negative values: for real-valued signals, Fourier coefficients at negative frequencies are identical to those at positive frequencies, except for a sign reversal of the imaginary part, allowing the imaginary part to vanish in the summation that reconstitutes the original signal. The sign reversal comes about because the imaginary component at positive frequencies is −90 degrees out of phase from the real component while it is +90 degrees out of phase at negative frequencies, leading to a net 180 degree phase difference between the two.
By removing the contribution of negative frequencies, one obtains a complex-valued representation of the signal, the so-called analytic signal (Fig. 4B). One may therefore think of an analytic signal as the outcome of a filter applied to the original data, which removes the negative half of the spectrum. Because any filter that excludes negative frequencies produces an analytic signal, there is no single unique decomposition, and different WFDs lead to different analytic decompositions.
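The simplest such filter zeroes the negative-frequency half of the DFT; this reproduces the standard discrete analytic signal (the same quantity computed by, e.g., `scipy.signal.hilbert`), sketched here in plain numpy:

```python
import numpy as np

def analytic(x):
    """Analytic signal of a real sequence: zero the negative-frequency
    half of the spectrum and double the positive frequencies (DC and
    Nyquist are left unchanged)."""
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1
    if N % 2 == 0:
        h[N // 2] = 1
        h[1:N // 2] = 2
    else:
        h[1:(N + 1) // 2] = 2
    return np.fft.ifft(X * h)
```

The real part of the result equals the original signal, while the imaginary part is its 90-degree phase-shifted counterpart.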
The analytic representation combines information about signal energy and shape: the squared envelope encodes signal energy around some point in time and frequency, while analytic phase encodes details of the signal shape. This combined encoding of shape and energy into a single complex value gives windowed Fourier decompositions tremendous versatility, as well as relevance that extends well beyond applications to strictly periodic phenomena. For example, while the simple correlation in time between two signals shows how they are linearly related at the same instant, the correlation between two sets of WFD coefficients describes relationships that may be offset in time by an amount on the order of the inverse bandwidth of the decomposition, covering a wider range of possible interactions, which include time delays. By observing such relationships across multiple frequency bands, one may construct an optimal filter that creates an approximation of one signal when applied to the other (Wiener, 1949). This principle is the basis of a variety of methods that seek to infer the causal relationship between signals based on their relationship in time (Baccalá and Sameshima, 2001, Granger, 1969). As this technique applies equally well to periodic and aperiodic signals, it gives one example of how Fourier analysis remains relevant outside the context of purely oscillatory signals.
The complex signal representation considered in the previous section reveals a simple way to translate a signal in the frequency domain, resulting in a second signal with a different carrier frequency. To see this, it can be directly observed that the analytic representation in (4) is composed of the product of two complex signals:

z(t) = [A(t)e^{iΔϕ(t)}]e^{iωct}
The first, A(t)e^{iΔϕ(t)}, has a spectrum centered at 0 Hz, and the second, e^{iωct}, is a unit-amplitude complex sinusoid, which oscillates at the carrier frequency, ωc. Multiplication by the sinusoid shifts the frequency range of the first component away from 0 so that the spectrum of the product is centered on the carrier frequency (Fig. 4B–D). Because we might choose to shift the first component to some other carrier frequency through the same operation, it is immediately clear that the center frequency can be shifted in any direction simply by multiplying with a complex sinusoid. This operation, shifting the frequency range of a signal, is known as “complex (de)modulation (CD),” and its implementation through multiplication with sinusoids is called heterodyning.
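Heterodyning is a one-line operation. In the sketch below (parameter values arbitrary), an amplitude-modulated analytic tone at 30 Hz is shifted down to baseband and up to 40 Hz; in both cases its envelope is untouched:

```python
import numpy as np

fs = 200.0
t = np.arange(0, 1, 1 / fs)
fc = 30.0
A = 1 + 0.3 * np.cos(2 * np.pi * 2 * t)        # envelope A(t)
z = A * np.exp(2j * np.pi * fc * t)            # analytic signal at carrier fc

# heterodyning: multiply by a complex sinusoid to move the carrier
baseband = z * np.exp(-2j * np.pi * fc * t)    # spectrum centered at 0 Hz
shifted = z * np.exp(2j * np.pi * 10 * t)      # carrier moved up to 40 Hz
```

After demodulation to baseband, what remains is just the envelope; after the upward shift, the spectral peak sits at the new carrier while |shifted| still equals A(t).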
In an important sense, the frequency-translated signal still contains the same information as the original, a point perhaps best illustrated by AM radio transmission. The sending side of an AM transmission shifts the output of a microphone from the audible frequency range to the higher range of radio-frequency carrier waves with a modulating sinusoid; the receiving side translates the signal back down to the audible range, recovering its original form. The modulated signal retains essential properties of the original in two important respects: first, because the modulating term e^{iωct} cancels under multiplication by the complex conjugate, a frequency-shifted representation of the signal preserves its original envelope, A(t), and second, for the same reason, two signals that have been modulated by the same amount maintain the same relative phase at each point in time. CD therefore preserves the shapes of the power spectrum and cross spectra of signals.
DBT belongs to a broader class of methods, which employ complex demodulation as part of spectral analysis. The most common versions of CD combine heterodyning with a lowpass filter to recover the real and imaginary components of an analytic signal (Granger and Hatanaka, 1964). For practical purposes the outcome of this operation is equivalent to techniques that apply the Hilbert transform to bandpass filtered data (BPHT). CD has a long history of application to EEG (Childers, 1973, Dick and Vaughn, 1970, Hao et al., 1992, Hoechstetter et al., 2004, Ktonas and Papp, 1980, Levine et al., 1972, Lucas and Harper, 1976, Papp and Ktonas, 1977, Walter, 1968), but the usual implementation through heterodyning is not, in itself, much more efficient than BPHT. The version of CD used by DBT achieves computational efficiency on par with time-segmented STFT (Bingham et al., 1967), yet appears to have been overlooked in this literature.
DBT accomplishes CD through the method of Bingham et al. (1967) by segmenting the signal directly in the frequency domain. It simultaneously demodulates and downsamples each band by reshaping the discrete Fourier transform of the entire signal into shorter segments, multiplying each segment with a suitable window function, then applying the inverse FFT separately to each segment to yield a series of demodulated, filtered and downsampled bands (Fig. 1A). The resulting signals are demodulated by virtue of the fact that the inverse FFT is blind to the original spectral range of the segment, and therefore treats it as though it were the DFT of a signal with spectral range centered at 0 Hz, sampled at a rate equal to the bandwidth of the segment. The result is equivalent to CD through heterodyning, combined with downsampling to a rate appropriate for the new signal bandwidth. Because the effect of demodulation cancels under conjugate multiplication, the power envelope of the signal within the band and the relative phase among components of a multivariate signal are not affected, other than being downsampled. This approach therefore lends itself directly to measurement of time-varying spectra and cross spectra with sampling that adjusts naturally to the tradeoff between time and frequency resolution. In this way, it parallels ordinary time-segmented STFT, with similar computational advantages; importantly, though, it avoids analysis-window-related spectral leakage. Moreover, as will be shown, the method is comparable and closely related to multitapering techniques explicitly designed to meet the problem of spectral leakage (Thomson, 1982, Walden, 2000).
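A bare-bones version of this frequency-domain segmentation for a single band might look as follows. This is a sketch with our own naming and scaling conventions, using rectangular band edges; the full method applies tapered, overlapping windows to the DFT segments:

```python
import numpy as np

def dbt_band(x, fs, f_lo, f_hi):
    """One band of a frequency-segmented decomposition, after Bingham et
    al. (1967): take the DFT of the whole signal, cut out the interval
    [f_lo, f_hi), and inverse-FFT that segment alone. The inverse FFT is
    blind to the segment's original spectral range, so the band comes
    back demodulated (centered at 0 Hz) and downsampled to a rate equal
    to its bandwidth."""
    N = len(x)
    X = np.fft.fft(x)
    k_lo = int(round(f_lo * N / fs))
    k_hi = int(round(f_hi * N / fs))
    seg = X[k_lo:k_hi]
    B = k_hi - k_lo
    # the 2/N factor recovers one-sided (analytic) amplitude; np.fft.ifft
    # itself supplies the 1/B normalization
    band = np.fft.ifft(seg) * 2 * B / N
    new_fs = fs * B / N            # new sampling rate = bandwidth in Hz
    return band, new_fs
```

For a 10 s signal sampled at 100 Hz, a 15–25 Hz band comes back as 100 complex samples at an effective rate of 10 Hz, with |band| giving the signal envelope within the band.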
While the efficient method of Bingham et al. (1967) has not to the best of our knowledge been applied to electrophysiological signals, complex demodulation through heterodyning has a long, if not highly prolific, history of application to EEG. Its initial appeal came from advantages in analog signal processing at a time when bandpass filter design posed a significant technical hurdle (Walter, 1968). CD through heterodyning provides a straightforward means to an analytic representation of the signal within some isolated band of interest, requiring only a single lowpass or bandpass filter alongside a tunable sinusoidal signal generator. The result is formally equivalent to more technically demanding methods that apply the Hilbert transform to bandpass filtered data (Ktonas and Papp, 1980, Papp and Ktonas, 1977). Following a similar rationale, a number of authors have also applied CD in the digital setting to extract analytic representations within a small number of bands of interest (Bruns and Eckhorn, 2004, Childers and Pao, 1972, Clochon et al., 1996, Hao et al., 1992, Hoechstetter et al., 2004). A few authors have also described frequency-domain implementations of CD; however, these differ from the method of Bingham et al. in being applied to isolated bands of interest without segmenting the DFT (Bruns and Eckhorn, 2004, Clochon et al., 1996, Hao et al., 1992). At present, CD is used relatively infrequently in the analysis of EEG, with the notable exception of BESA, a commercial toolbox for EEG and MEG analysis and source localization (Hoechstetter et al., 2004).
The remainder of this work describes the DBT algorithm in greater technical detail, with an emphasis on theoretical introduction and discussion, and with particular attention to the relationship between the DBT and Thomson's multitaper method (Thomson, 1982, Walden, 2000). Examples with artificial test signals and real human ECoG data illustrate applications to adaptive filtering and time-varying spectral and cross-spectral estimation. We show that the DBT can often achieve significant gains in efficiency over widely used alternative techniques, particularly in the context of large-array multichannel recordings. In general, the number of computations required for pairwise spectral statistics, such as coherence over the full range of sampled frequencies, is of the same order as that for a simple correlation. Such a large boost in efficiency may open the door to applications that are impractical with alternative methods.
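To illustrate the efficiency claim: once complex band coefficients are in hand (from the DBT or any other WFD), magnitude-squared coherence across all bands reduces to elementwise conjugate products and time averages, on the order of a simple correlation. Function and variable names below are illustrative.

```python
import numpy as np

def coherence_from_bands(bx, by):
    """Magnitude-squared coherence per band from complex band
    coefficients (bands x time), via elementwise conjugate products
    and time averages -- roughly the cost of a correlation."""
    sxy = (bx * np.conj(by)).mean(axis=1)    # cross spectrum
    sxx = (np.abs(bx) ** 2).mean(axis=1)     # auto spectra
    syy = (np.abs(by) ** 2).mean(axis=1)
    return np.abs(sxy) ** 2 / (sxx * syy)

rng = np.random.default_rng(0)
shape = (5, 1000)                            # 5 bands, 1000 samples each
bx = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
by = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
c_same = coherence_from_bands(bx, bx)        # identical signals
c_ind = coherence_from_bands(bx, by)         # independent signals
```

Identical inputs give coherence of 1 in every band, while independent noise gives values near 0, shrinking with the number of averaged samples.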
Although a number of applications are surveyed through examples, the emphasis will remain on an introductory and theoretical overview of the technique rather than a rigorous empirical evaluation and catalog of its applications, mainly because possible applications are too broad to be treated comprehensively. A software implementation of the method is provided for Matlab, so that the reader may run comparisons on applications of interest (available at https://github.com/ckovach/DBT).
Broadly defined, STFT places no constraint on the type or duration of the window used (Gabor, 1946, Hlawatsch and Boudreaux-Bartels, 1992), and so encompasses most WFDs. In the context of digitally sampled signals, however, it is all but universally the case that STFT implies the application of a finite time window of duration much shorter than the total length of the signal, as is also true for WOSA (Welch, 1967). The efficiency of STFT as normally implemented stems from finite time windowing (Bingham et al., 1967); the term will therefore be used here in the narrower sense of time-segmented STFT, while “windowed Fourier decomposition” covers the broad sense.
Materials and methods
At the foundation of modern digital signal processing lies the Whittaker–Nyquist sampling theorem (Mallat, 2009a), which establishes that the information in a signal of finite bandwidth can be fully represented by a series of discrete samples, provided that the sampling rate is not less than the total bandwidth of the signal (counting negative and positive frequencies).
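This statement can be illustrated with Whittaker–Shannon interpolation, which reconstructs a bandlimited signal from its uniform samples. The sketch below is a toy example: the truncated sum is exact at the sample times and only approximate elsewhere.

```python
import numpy as np

def sinc_reconstruct(samples, fs, t):
    """Whittaker-Shannon interpolation: rebuild a bandlimited signal
    at arbitrary times t from uniform samples taken at rate fs."""
    n = np.arange(len(samples))
    # superpose one shifted sinc kernel per sample
    return np.sum(samples[None, :] * np.sinc(fs * t[:, None] - n[None, :]),
                  axis=1)

fs = 40.0                            # well above twice the 7 Hz content
ts = np.arange(80) / fs              # 2 s of samples
x = np.sin(2 * np.pi * 7 * ts)
t_fine = np.linspace(0.5, 1.5, 200)  # interior times, away from the edges
x_hat = sinc_reconstruct(x, fs, t_fine)
```

The reconstruction reproduces the samples exactly and tracks the underlying sinusoid closely in the interior; because the sum is truncated, accuracy degrades near the edges of the record.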
Adaptive noise removal
Narrowband noise is a pervasive contaminant of electrophysiological data, arising from a variety of environmental sources. Although 60 or 50 Hz line noise and harmonics are most familiar, it is not unusual to find contamination at other frequencies, notably in ranges that overlap with single- and multiunit responses. Such contamination is particularly common in data obtained in a clinical environment full of electronic sensing and monitoring equipment. Notch filtering is a standard strategy for
Discussion
We have put forward the argument that DBT combines several desirable qualities into a single package, including robustness to spectral leakage, computational efficiency and flexibility, making it appealing as an all-purpose tool for the gamut of spectral analysis applications. But, of course, there must be a catch. Because time and frequency domains are mirror images of each other, segmenting a signal in the frequency domain introduces the time-domain analog of spectral leakage, “time leakage,”
Acknowledgements
The authors acknowledge Richard A. Reale, Hiroto Kawasaki, Rick L. Jenison, Ralph Adolphs, Kirill V. Nourski, Matthew Sutterer and Matthew A. Howard III.
This work was supported by the NIH (Grant Nos. R01 DC004290-14 and 5R01 AA018736-03) and the American Foundation for Suicide Prevention.
References (56)
- et al. Untangling cross-frequency coupling in neuroscience. Curr Opin Neurobiol (2015).
- et al. Gamma, alpha, delta, and theta oscillations govern cognitive processes. Int J Psychophysiol (2001).
- Fourier-, Hilbert- and wavelet-based signal analysis: are they really different approaches? J Neurosci Methods (2004).
- et al. Task-related coupling from high- to low-frequency signals among visual cortical areas in human subdural recordings. Int J Psychophysiol (2004).
- et al. The functional role of cross-frequency coupling. Trends Cogn Sci (2010).
- Complex demodulation of visual evoked responses. Electroencephalogr Clin Neurophysiol (1973).
- et al. A new method for quantifying EEG event-related desynchronization: amplitude envelope analysis. Electroencephalogr Clin Neurophysiol (1996).
- Assessing transient cross-frequency coupling in EEG data. J Neurosci Methods (2008).
- et al. Mathematical description and computer detection of alpha waves. Math Biosci (1970).
- et al. Toward a proper estimation of phase–amplitude coupling in neural oscillations. J Neurosci Methods (2014).
- The temporal structures and functional significance of scale-free brain activity. Neuron.
- Sharp edge artifacts and spurious coupling in EEG frequency comodulation measures. J Neurosci Methods.
- Instantaneous envelope and phase extraction from real signals: theory, implementation, and an application to EEG analysis. Signal Process.
- Estimating the time-course of coherence between single-trial brain signals: an introduction to wavelet coherence. Neurophysiol Clin.
- Comparison of Hilbert transform and wavelet methods for the analysis of neuronal synchrony. J Neurosci Methods.
- Evoked potential analysis by complex demodulation. Electroencephalogr Clin Neurophysiol.
- EEG and MEG: relevance to neuroscience. Neuron.
- Periodicities in the rate of on-demand electrical stimulation of the mesencephalic reticular formation to maintain wakefulness. Exp Neurol.
- Spatially distributed patterns of oscillatory coupling between high-frequency amplitudes and low-frequency phases in human iEEG. Neuroimage.
- Analysis of dynamic brain imaging data. Biophys J.
- Fast wavelet transformation of EEG. Electroencephalogr Clin Neurophysiol.
- Coupling of mesoscopic brain oscillations: recent advances in analytical and theoretical perspectives. Prog Neurobiol.
- A unified approach to short-time Fourier analysis and synthesis. Proc IEEE.
- Partial directed coherence: a new concept in neural structure determination. Biol Cybern.
- Modern techniques of power spectrum estimation. IEEE Trans Audio Electroacoust.
- Coding of repetitive transients by auditory cortex on Heschl's gyrus. J Neurophysiol.
- Neuronal oscillations in cortical networks. Science.
- Coherence and time delay estimation. Proc IEEE.