Large- and multi-scale networks in the rodent brain during novelty exploration

Neural activity is coordinated across multiple spatial and temporal scales, and these patterns of coordination are implicated in both healthy and impaired cognitive operations. However, empirical cross-scale investigations are relatively infrequent, due to limited data availability and to the difficulty of analyzing rich multivariate datasets. Here we applied frequency-resolved multivariate source-separation analyses to characterize a large-scale dataset comprising spiking and local field potential activity recorded simultaneously in three brain regions (prefrontal cortex, parietal cortex, hippocampus) in freely-moving mice. We identified a constellation of multidimensional, inter-regional networks across a range of frequencies (2-200 Hz). These networks were reproducible within animals across different recording sessions, but varied across different animals, suggesting individual variability in network architecture. The theta band (~4-10 Hz) networks had several prominent features, including roughly equal contribution from all regions and strong inter-network synchronization. Overall, these findings demonstrate a multidimensional landscape of large-scale functional activations of cortical networks operating across multiple spatial, spectral, and temporal scales during open-field exploration. Significance statement Neural activity is synchronized over space, time, and frequency. To characterize the dynamics of large-scale networks spanning multiple brain regions, we recorded data from the prefrontal cortex, parietal cortex, and hippocampus in awake behaving mice, and pooled data from spiking activity and local field potentials into one data matrix. Frequency-specific multivariate decomposition methods revealed a cornucopia of neural networks defined by coherent spatiotemporal patterns over time. These findings reveal a rich, dynamic, and multivariate landscape of large-scale neural activity patterns during foraging behavior.


Introduction
Neural activity is coordinated across multiple spatial and temporal scales, ranging from spike-timing correlations across pairs of neurons (Gray et al., 1989) to resting-state fMRI networks (Gusnard et al., 2001) , and from ultra-fast 600 Hz omega oscillations in primary sensory cortex (Timofeev and Bazhenov, 2005) to infra-slow fluctuations linked to 0.05 Hz oscillations in the gastric system (Richter et al., 2017) . Coordinated activity is thought to allow for neural circuits to maximize communication efficiency, multiplex information, flexibly route information flow, and functionally bind cell assemblies (Jensen and Mazaheri, 2010;Singer, 2009;Wang, 2010) .
However, most neuroscience investigations are limited to a single spatial scale, and cross-scale investigations are often limited to univariate or bivariate analyses, e.g., coherence between action potentials from one neuron with the LFP recorded on the same electrode (Whittingstall K Logothetis, 2009) . This mass-univariate approach has been crucial to the development of neuroscience, for example, understanding of computational principles such as neural tuning (Carandini, 2005;Hebart and Baker, 2018;Hubel and Wiesel, 1959) .
However, it may obscure spatiotemporal patterns embedded across populations of neurons within and across brain regions (Cunningham and Yu, 2014;Kriegeskorte and Kievit, 2013;Ritchie et al., 2019;Williamson et al., 2019) .
In contrast, multivariate data analysis methods have proven useful at identifying spatially distributed patterns that reflect lower-dimensional dynamics or that encode sensory representations or memories (Pang et al., 2016) . Furthermore, correlational patterns may provide a "contextual activation" that shapes subsequent local computations (Alishbayli et al., 2019;Cohen and Kohn, 2011;Kohn et al., 2016;Priesemann et al., 2014) .
In the present study, a recently developed set of multivariate methods (generalized eigendecomposition; (Cohen, 2017) enabled us to discover multi-scale, inter-regional functional networks during active behavior by combining data from multiunits and local field potentials. We found a salient, empirical grouping of the networks into a small number of frequency bands (average of 7). In each frequency bank, multiple subnetworks were both simultaneously and independently active. Some networks (e.g., in theta) were spatially distributed across the brain, while other networks (typically in higher frequencies) were more localized to one or two regions. Spiking activity contributed less systematically to brain-wide networks compared to LFP. The analyses revealed both idiosyncratic and reproducible network characteristics within-and across-animals, which suggests that the spatial organization of large-scale networks is subject to individual variability. Overall, our findings reveal a complex landscape of dynamic neural activity that spans multiple spatial, spectral, and temporal scales.

Data acquisition
Six male mice with Bl57/6jbackground (B6;129P2-Pvalbtm1(cr)Arbr/J or Ssttm2.1(cre)Zjh/J) between 4 and 5 months of age, weighing between 27g and 34g, were used in this study. All experiments were approved by the Dutch central commission for animal research (Centrale Commissie Dierproeven) and implemented according to approved work protocols from the local University Medical Centre animal welfare body (approval number 2016-0079).
Inter-electrode distance was 250 μm and typical impedances were between 0.1 and 0.9 MOhm. Electrode design and surgeries are detailed elsewhere ( van Hulten J. A. Cohen M. X, 2020) . A metal reference screw was placed on the skull over the cerebellum. Offline, an average reference was computed for each brain region and subtracted from each electrode in the corresponding region.
Animals were recorded in the sessions depicted in Figure 1b. The recording sessions alternated between their familiar home cage and an unfamiliar location that contained novel objects. In particular, each mouse went through the same succession of six experiment sessions: (1) Home cage recording of 5 minutes. (2) Training phase of 10 minutes, in which the animal was placed in an unfamiliar environment that contained two novel objects. (3) Home cage recording of 5 minutes. One hour then passed (in the home cage) with no recordings. (4) Home cage recording of 5 minutes. (5) Testing phase, in which the animal was returned to the unfamiliar environment that contained one object seen during the training phase and one novel object. (6) Home cage recording of 5 minutes. Mice were connected via electrode fibers to the data acquisition board via a cable that hung from top of the Faraday cage, but were otherwise unrestrained. There was no particular task or objective that was trained, nor were any rewards provided.
Mice tended to explore the objects for brief periods of time (hundreds of ms to seconds), whereas our data analysis approach utilized longer windows for temporal filtering and averaging to ensure high signal-to-noise quality. We therefore focused on possible state changes across the different task sessions as opposed to time-locking to the on/offsets of transient object exploration periods.
LFP data were down-sampled to 1000 Hz. Excessively noisy channels, determined based on visual inspection, were removed (0-4 per recording session; average of 1.2). Independent components analysis was run using the eeglab toolbox (Delorme and Makeig, 2004) and the jade algorithm (joint approximation diagonalization of eigen-matrices), which defines components by maximizing kurtosis (the 4th order statistical moment used to index non-Gaussianity) (Cardoso, 1999) . Components clearly identifiable as non-neural origins were projected out of the data. Data from the first and last 10 seconds of each recording session were excluded from analyses.
Data and MATLAB analysis code will be made publically available upon acceptance via our institute's data repository.

Spike-sorting and multiunit extraction
The raw (30 kHz) voltage recordings were regional-average-referenced to eliminate possible volume-conduction artifacts, and were then filtered between 300 and 6000 Hz using a zero phase-shift FIR1 filter kernel. Spike-sorting was done for each electrode separately given the inter-electrode spacing of 250 μm, which makes it unlikely to observe the same neuron on multiple electrodes. Indeed, we did not find excessive correlations across units from different electrodes (see Figure 1-1 for an example between-unit correlation matrix).
Because our goal here was to obtain information about neural spiking activity as it related to the population and to LFP dynamics -rather than evaluating tuning properties of individual neurons -we chose an automatic spike-sorting approach that separated multiunits from noise or artifacts (Trautmann et al., 2019) . We therefore term these signals "multiunit" to indicate that the resulting time series may reflect a mixture of action potentials from multiple neurons.
Multiunits were extracted via a general purpose spike-sorting suite ( autoSort , available via   our  open  code  repository bitbucket.org/benglitz/controller-dnp/src/master/Access/SpikeSorting), implemented in MATLAB. Briefly, autoSort performs the following sequence of steps to achieve automatic and unbiased sorting of neural signals: • Candidate spike waveforms ('spikes') were detected based on a negative threshold of 4 standard deviations of the background noise (estimated as 1.48 times the median absolute deviation, to avoid artifacts that inflate the standard deviation).
• Candidate spikes were then aligned to their minimum after the trigger and cut out within a window of [-0.7,1.2] ms relative to the alignment time.
• Principal components analysis was performed on a random subset of spikes (N S =5000 per recording) to estimate a projector to a 6 dimensional subspace that retained most of the variance in the data.
• Hierarchical clustering (based on Ward distance) with a set maximal number of clusters (N C =3) was performed on this representation, and all spikes beyond the N S -selection were assigned to these clusters on the basis of their Euclidean distance to the cluster centers.
• Clusters were then post-hoc automatically selected and fused on the basis of the shape and similarity between their average waveforms, i.e. (i) clusters were excluded if they had no significantly positive "hump" after the negative alignment peak, if they had a significantly positive peak before the negative alignment peak, or if the waveform was longer or larger than expected for an extracellular spike; and (ii) clusters were fused if the correlation and Euclidean distance between their average waveforms were above or below preset thresholds, respectively.
These steps and criteria led to an extraction of 0-2 multiunits per electrode. The average rate of spikes per second from all animals and recordings was 13.2 (standard deviation 5.9, minimum 0.07, maximum 51.6). A binary spike time series was constructed for each multiunit, and smoothed with a 30 ms full-width at half-maximum Gaussian to create a continuous signal. This continuous signal was entered into the data matrix as one channel ( Figure 1C).

Frequency-specific components using generalized eigendecomposition
We followed existing procedures for extracting multivariate components that have been detailed and validated in several previous publications (Cohen, 2017;de Cheveigné and Parra, 2014;Nikulin et al., 2011;Tomé, 2006) . A brief overview is provided here.
The goal is to identify a spatial filter that provides a scalar weight for each data channel (LFP and multiunits) such that the weighted sum of narrowband-filtered channel time series is maximally different from the broadband channel time series. The method is based on data covariance matrices because they contain all pairwise linear relationships, making the method multivariate. As described below, two covariance matrices are compared, one matrix ( R ) based on the broadband (non-temporally filtered) data, and one matrix based on the narrowband filtered data ( S ).
Channel-by-channel covariance matrices were created by multiplying the mean-centered data matrices by their transpose. To increase covariance stability, we cut the continuous data into a series of non-overlapping 2 second segments, and computed the covariance matrix of each segment. The even-numbered epochs were used to create the S (signal) covariance matrix and the odd-numbered epochs were used to create the R (reference) covariance matrix. This was done to have non-identical data across the two matrices. After computing covariance matrices for each segment (there were around 70 segments in the home cage sessions and 140 segments in the training/testing sessions), the average covariance matrices S and R were computed across segments. Euclidean distance from each individual covariance matrix to the average was computed (this is equivalent to the Frobenius norm of the matrix difference), and any segments with a distance greater than three standard deviations from the average were excluded, and the final covariance matrix was re-computed without the outliers. On average, 0.85% of covariance matrices were excluded per analysis (range: 0 to 3%).
To create the spatial filter per frequency, we start from maximizing the Rayleigh quotient: ( Where S and R are channel covariance matrices obtained from the narrowband filtered data and the broadband data, respectively ( Figure 1C). One can think of equation 3 as a multivariate signal-to-noise ratio, and the goal is to find a channel vector w that maximizes this ratio. The solution comes from a generalized eigenvalue decomposition on the two matrices. (4) The diagonal matrix contains the eigenvalues, each of which is the ratio of equation 3 for the corresponding column of , which is a matrix in which the columns are the eigenvectors. Thus, we obtain m spatial filters for an m -channel dataset. The solutions are linearly independent from each other, though they are not constrained to be orthogonal as with PCA (this is because eigenvector orthogonality is guaranteed only for symmetric matrices, and R -1 S is non-symmetric). Equation 4 is repeated for a range of temporal frequencies (see below), each using a different S matrix (the covariance matrix created from narrowband filtered data) with the same R matrix.
A small amount of shrinkage regularization (1%) was applied to the R matrix in order to improve the quality of the decomposition (Lotte and Guan, 2011) . In our experience, 1% shrinkage has no appreciable effect on decompositions of clean, full-rank, and easily separable data, and considerably improves the decompositions of noisy or reduced-rank data. In equation 5 below, is the amount of shrinkage (0.01, corresponding to 1%), is the average of all eigenvalues of R , and I is the identity matrix. The entire procedure described above was repeated independently for each animal, experiment session, and filtering frequency. This allowed us to examine the reproducibility of the components both within and across animals.
Data were temporally narrowband filtered by convolution with a Morlet wavelet, defined here as a Gaussian in the frequency domain (Cohen, 2019) . Extracted frequencies ranged from 2 Hz to 200 Hz in 100 logarithmically spaced steps. The full-width at half-maximum of the Gaussian varied from 2 to 5 Hz with increasing frequency. The multiunit channels were not narrowband filtered (they were already smoothed with a 30-ms Gaussian). Any large-scale spike-field coherence patterns would manifest as cross-channel terms in the frequency-specific covariance matrices.
We computed a "region bias score" to determine whether the components were driven by one region or whether all regions contributed to the component. This was quantified as the square root of the average squared eigenvector elements per region. That produces a 3 element vector, which we normalized to sum to 1. The region bias score was defined as the Euclidean distance between this empirical vector and an "ideal shared region" vector of The idea is that if all brain regions have average eigenvector components that are equal in magnitude, then that vector will be close to [1 1 1]/3, and thus the empirical distance to the ideal vector will approach zero. As one or two regions start to dominate the component, the normalized average eigenvector elements vector (e.g., producing an empirical vector of [0.6 0.3 0.1]) will move further away from the ideal vector. The maximum possible distance is 1.
Subspace dimensionality was computed via permutation testing. The ability to derive inferential statistical values is one of the important advantages of generalized eigendecomposition over descriptive decompositions such as PCA or ICA. The idea here was to generate a distribution of maximal eigenvalues that could be expected under the null hypothesis that S and R contain the same information (note from equation 3 that the expected eigenvalue under the null hypothesis is 1, but maximum eigenvalues could be larger due to sampling variability). To generate this empirical null-hypothesis distribution, we randomly assigned each 2-second segment to average into the S or R covariance matrices.
From each iteration, the largest generalized eigenvalue was stored. After 200 iterations for each frequency, the maximum of the largest eigenvalues was taken as the most extreme eigenvalue that can be expected under the null hypothesis that there are no differences between the S and R matrices. The number of actual eigenvalues (from the analysis without shuffling) above this extreme H0 value was taken as the dimensionality of the subspace.
Note that this permutation method accounts for multiple comparisons over M components because it selects the most extreme value of M components on each iteration. Cleaning the covariance matrices via Euclidean distances was performed during permutation testing as described above.
Entropy was computed for each data channel using k=40 bins for discretization.
Finally, within-frequency, inter-component phase synchronization was computed via the weighted phase-lag index (Vinck et al., 2011) , which is a modification of phase synchronization designed to remove any possible artifacts of volume conduction. This was important for our analyses because all networks were derived by different weightings of the same channels, and because the separate components at the same frequency were not constrained to orthogonality.

Data matrices and narrowband source separation
We created channels X time data matrices with 50-80 channels per animal (28-32 LFP channels plus all detected multiunits) (Figure 1), and applied a dimensionality-reduction and guided source-separation method that isolates features of the data that maximally separate narrowband from broadband activity based on generalized eigendecomposition (GED) of covariance matrices (Cohen, 2017) . GED was applied after narrowband filtering the data from 2-200 Hz in 100 logarithmically spaced steps, producing a succession of narrowband components. Each component is a weighted average of channels that maximizes energy at that frequency. There are multiple components per frequency that were sorted according to their eigenvalue, which encodes the separability between the narrowband and broadband energy.   (2) some frequencies (e.g., theta) recruit multi-regional networks whereas other frequencies preferentially engage one or two regions; (3) large-scale networks were dominated by LFP whereas multiunits made relatively little (though significant) contributions; (4) the local regional referencing ensured that the components reflected the coordination of multiple local dipoles (seen as the balance between blue and red colors in the map) instead of long-range volume-conducted fields. The components time series had non-Gaussian distributions, indicative of true signals rather than noise, which is expected to be Gaussian-distributed (Figure 2-1).

Empirically derived frequency bands
Electrophysiology data are often grouped into frequency bands according to integer boundaries (e.g., 4-10 Hz), which may miss, artificially separate, or artificially combine the rhythms naturally occurring in the brain. We therefore applied a recently established method (gedBounds) to derive empirical frequency bands based on the definition of a "frequency band" as a range of frequencies that have highly correlated spatiotemporal dynamics  . GedBounds works by clustering the matrix of squared correlations across the eigenvectors from all frequencies (Figure 3a). It is a purely data-driven alternative to labeling frequencies based on a priori expectations. These results show that grouping electrophysiology time series into spectral bands has an empirical basis and is not arbitrary or an artifact imposed by narrowband filtering. The empirically derived frequency ranges varied over animals and task sessions, and were not systematically affected by the task session. However, we treated frequency as a continuous variable in subsequent analyses rather than grouping into discrete bins.

Component reproducibility
The anatomical targets of the electrode implants were identical in all animals. However, individual variability in functional organization can mean that the GED patterns are idiosyncratic and thus different across animals. Likewise, if the spatiotemporal patterns that GED isolates reflect stable features of the brain, then the patterns should be highly similar in different experiment sessions within the same animal. On the other hand, it is possible that the spatiotemporal patterns are dynamic and are more affected by cognitive factors than by individual differences.
To address questions about component map reliability, we measured map reproducibility, quantified as spatial correlations, both across experiment sessions within each animal, and in the same session across animals. When pooling across all experiment sessions, we observed robust within-animal component topographies (R 2 spatial correlations in the range of 0.4 to 0.8 over the frequency spectrum; see Figure 4a). In contrast, spatial correlations across animals were low, with averaged R 2 values below 0.2. Because the decompositions were performed on the data from each session independently, this pattern of results indicates that (1) the components were stable within each animal over different sessions (over the course of the ~2-hour recording), and that (2) component maps are idiosyncratic, with different spatial patterns in different animals.
The spatial correlations described above were done using only the component with the largest eigenvalue for each session and each frequency. It is possible that the same neurophysiological network was identified as "component 1" in one experiment session and "component 2" in a different session. We therefore modified the correlation analysis to compute the four unique correlations across the top two components from each session/frequency, and stored only the largest correlation coefficient. Although this selection procedure is biased because we selected the strongest correlation out of a set, the same bias was applied within-and across-animals. The correlations were overall stronger, but the conclusion is the same as when correlating only the top components: spatiotemporal patterns were stable within animals, and variable across animals.

Figure 4 . Component topographies are reproducible within animals in different sessions, yet differ across animals. A) R 2 spatial correlations per frequency. The analysis was run on the components with the largest eigenvalue per frequency ("top comp."), and by selecting the largest correlation amongst the top two components ("max"). B) Each individual correlation, separated according to the experiment sessions from which the spatial map pairs were drawn ("T-T" indicates train-test pairs, "H-H" indicates home-home pairs). Black bars indicate the mean R 2 . The color of each dot is the average of the eigenvalues of the component pair (which indicates the separability of the narrowband from broadband signals), and the r-value on top of
each column is the correlation between the spatial map R 2 and the average eigenvalue .
We next assessed whether the maps were modulated by the different experiment sessions by separating R 2 values according to experiment session. The scatter plots in Figure 4B show all frequencies (each dot is an animal-frequency pair), but we averaged frequencies together for the statistics because Figure 4A indicates comparable relationships across the frequency domain. We then tested the correlation coefficients in a one-way ANOVA with the factors train-test, home-home, and train/test-home. In other words, we tested whether the maps were more similar to each other when the animals were in a similar experiment context. However, this effect was not statistically significant (F(2,10)=2.17, p=.16).
Inspection of the distribution of R 2 values in Figure 4b show considerable spread of the correlations, which was only partially resolved by selecting the maximum correlation of the top two components. We suspected that at least some of this variation could be due to the separability of the components from broadband. "Separability" in a GED analysis is quantified as the eigenvalue, which is the multivariate ratio between the narrowband from the broadband covariance matrices along the direction of the eigenvector. We therefore correlated the R 2 values with the average of the eigenvalues of each component-pair. Most correlations between map-similarity and eigenvalue were in the range of 0.1-0.2. Thus, it appears that -to some extent -the narrowband components that are better separated from the background spectrum are more likely to be stable over time.

Region-specificity of components
Given that our data matrices included signals from three brain regions, we next determined whether the components truly reflected inter-regional temporally coherent networks, or whether they were driven by a single region. This was assessed through a regional bias score, in which a score of zero indicates exactly equal contributions from all three regions, whereas a bias score of one indicates that the component is driven entirely by one region with no contributions from the other two regions ( Figure 5A).
The bias scores were mostly between 0.4 and 0.6 within each animal ( Figure 5A), indicating that all three regions contributed to the components to varying degrees. The frequency range that stood out was theta, which exhibited a notable dip in the bias score. Thus, all three brain regions contributed to large-scale networks in the theta range.
This bias score is an aggregate measure; we next investigated the contributions of each region to each frequency, separately for each animal. Figure 5B shows both diversity and commonalities in the regional contributions across the different animals. In these plots, overlapping lines at y=1/3 indicates that all three regions contributed equally to the components, whereas regional dominance is reflected by a separation of lines on the y-axis. Figure 5C illustrates the commonalities across all six animals that are identified through averaging. For example, across animals, PFC generally dominated the low-frequency (<8 Hz) networks whereas the hippocampus generally dominated high-frequency networks between 80-150 Hz.  Contributions of LFP vs. multiunits . We next investigated the relative contribution of spikes and LFPs to the components. This was quantified as modality dominance  , which is the normalized difference between the root-mean-square of the LFP eigenvector elements and the root-mean-square of the multiunit eigenvector elements. A modality dominance value of zero indicates equal contribution of LFP and multiunits, whereas a value of one indicates no contribution of multiunits (a value of minus one would indicate no contribution of LFP channels).
The modality dominance values were close to one for all animals, recording sessions, and frequencies ( Figure 5D). This was not attributable to a difference in signal scaling between LFP and multiunits, because all time series signals were normalized to a mean of zero and a variance of one. However, normalizing to the first and second statistical moments does not preclude the possibility of differences in higher-order statistical characteristics. For example, the LFP channels had overall higher entropy (around 4 bits, averaged over all channels, animals, and experiment sessions) compared to the multiunits (1.7 bits on average) ( Figure   5E).
On the other hand, it was not the case that multiunits made no contributions to the GED-identified networks. We re-ran the source separation for each frequency, excluding all multiunits from the dataset, and computed a t-test at each frequency between the top eigenvalues from the multiunit-including and multiunit-excluding datasets. The difference was statistically significant (correcting for multiple comparisons using the false discovery rate method (Benjamini and Hochberg, 1995) for most frequencies except around 30-90 Hz ( Figure 5F).
Thus, the (Gaussian-smoothed) multiunits made a minor though statistically significant contribution to the matrix decomposition. This overall pattern is not surprising, considering that the LFP samples a larger volume and thus more neurons. On the other hand, there were more multiunit channels in the data matrix than LFP channels, and many of our multiunits may have reflected a combination of several neurons; thus, we interpret this finding to indicate that LFP signals are a richer source of information regarding cross-regional network formation than are action potentials.

Within-frequency component dimensionality
The eigenvectors from the GED analysis carve out a low-dimensional subspace of narrowband activity, and we defined the dimensionality of that subspace as the number of eigenvalues that were larger than a significance threshold based on a null-hypothesis distribution of eigenvalues derived from permutation testing  .
The subspace dimensionality ranged from 2 to 16, and generally increased with higher frequencies ( Figure 6A-B). Higher dimensionality corresponds to the number of statistically separable networks operating at the same frequency. It is noteworthy that there is no pronounced "bump" in the theta range (~4-10 Hz).
Note that this measure is not the total dimensionality of the signal; it is the dimensionality of the subspace that differentiates narrowband from broadband activity. Normalizing these raw numbers to the total dimensionality of the signal (assessed as the rank of the corresponding data covariance matrix) revealed that most narrowband subspaces occupied around 8-10% of the total signal space ( Figure 7C). We investigated the dynamics within these subspaces by computing a volume-conduction-independent measure of phase synchronization (weighted phase lag index) between the top two components for each frequency and task session (note that GED eigenvectors are not constrained to orthogonality as with PCA, and thus within-frequency components can be arbitrarily strongly correlated as long as they remain linearly separable).
Synchronization strength varied between around .2 and .6 depending on the frequency, with strongest synchronization around theta and a smaller departure from the 1/f decay around the beta band ( Figure 6D-F). A repeated-measures ANOVA on session differences in the 7-12 Hz range indicated no main effect of task session (F(5,25)=1.3, p=.29).

Discussion
In this study, we explored multivariate LFP and multiunit data from three brain regions in awake behaving mice using a combination of established and novel multivariate analysis methods to decompose the data into multiple spatial-spectral-temporal modes. We found that these were stable within each animal, but variable across animals. These findings reveal a rich and multidimensional landscape of brain dynamics that highlight the complexity of on-going neural activity.

Feature-guided source separation identifies large-scale narrowband networks
There are several dimension-reduction methods that are regularly applied in neuroscience, including principal and independent components analyses, factor analyses, and Tucker decompositions (Cunningham and Yu, 2014) . It is often unclear which algorithms or which parameters are optimal (Cohen and Gulbinaite, 2014) , and different algorithms can give similar or divergent results (Cohen, 2017;Delorme et al., 2012) depending on their maximization objectives.
GED has several advantages, including that it (1) separates narrowband from broadband activity while holding constant behavioral, cognitive, and other factors; (2) reduces the impact of artifacts or non-brain sources that have a relatively wide frequency distribution; (3) is amenable to inferential statistical thresholding, whereas other decompositions are descriptive and thus selecting components for subsequent interrogation may be subjective or biased; (4) takes into account both spatial and temporal dynamics instead of only spatial or only temporal features; (5) has higher signal-to-noise ratio characteristics and is more accurate at recovering ground truth simulations compared to principal or independent components analyses (Cohen, 2017;de Cheveigné and Parra, 2014;Nikulin et al., 2011;) .
An important finding here is the discovery that a single frequency band can group multiple distinct but spatially overlapping networks. In typical univariate or bivariate analyses, the LFP from a single electrode is treated as an independent statistical unit, based on the implicit assumption that the volume of tissue recorded by an electrode contains only one functional circuit. But a more likely scenario is that each electrode records a mixture of signals from multiple local circuits in the scale of hundreds of microns to a few mm, particularly in the presence of local coherence (Lindén et al., 2011) . Thus, LFP is prone to the same kind of source mixing that affects MEG and EEG (Nunez and Srinivasan, 2006) , though to a lesser extent. This, however, is fortuitous for multichannel recordings, because it means that linear separation methods that have been established in the EEG community are likely to be fruitful in invasive recordings.
The high reproducibility across sessions within each animal (Morrow et al., 2020) , coupled with the low reproducibility across animals, suggests that the large-scale networks that manifest as coordinated LFP dynamics develop in idiosyncratic ways across different individuals. This result, of course, does not invalidate the standard neuroscience approach of targeting the same XYZ coordinates in different individuals and pooling the results together, but our findings highlight that the spatial topographies of larger networks may be unique across individuals, which should be taken into consideration in future studies.

The special role of theta in large-scale network formation
The theta frequency band, typically defined as 4-10 Hz in rodents in 4-8 Hz in humans, is widely implicated in a large range of cognitive processes, including spatial exploration, memory, motor function, and executive functioning. Clearly, there is no simple mapping of frequency band to cognitive process and indeed, even the same brain regions can generate multiple sources of theta independently (López-Madrona et al., 2020; , which may serve different cognitive functions (Mikulovic et al., 2018;Töllner et al., 2017) . In the rodent brain, theta is most robust in the hippocampus, but also synchronizes with independent theta generators in the medial prefrontal cortex (O'Neill et al., 2013;Sigurdsson and Duvarci, 2015) . Intracranial EEG studies in humans have confirmed that theta synchronization is widespread and linked to cognitive operations (Solomon et al., 2017) .
The theta band stood out in many of our analyses, for example by having relatively strong within-frequency, cross-component synchronization (Figure 6), sub-Gaussian kurtosis ( Figure 2-1), and roughly equal contribution from all three regions ( Figure 5). Additionally, theta-band networks appeared to have the most anatomically consistent topographies across animals (see the small peak around theta in Figure 4a). On the other hand, the subspace dimensionality of theta was not higher than other frequencies (Figure 6a-b), suggesting that the theta is important for computational reasons, and is not simply the dominant frequency in general.

LFP vs. multi-unit contributions to large-scale networks
It is perhaps unsurprising that the multiunits made relatively little statistical contribution to the narrowband components, considering that LFP samples a larger volume, has more signal complexity, and can be meaningfully separated into narrow frequency bands. On the other hand, the multiunits were recorded from the same electrodes, added unique information to the narrowband covariance matrices, and improved the overall separability of the narrowband components from broadband across most frequency ranges.
It is possible that LFP carries most of the inter-regional signaling (Yuste, 2015) , considering that LFP reflects a multitude of intra-and extracellular processes (Buzsáki et al., 2012;Reimann et al., 2013) that are modulated by population dynamics of excitatory and inhibitory cells (Mitzdorf, 1985) . It is also possible that spikes carry important information that is spatiotemporally targeted and sparse, and therefore make contributions at a spatial scale smaller than what we investigated. Indeed, the eigendecomposition will prefer larger patterns of covariance over patterns driven by a single data channel. On the other hand, LFP is generally considered a proxy of the local input to a circuit while spikes are considered a proxy of the output of the circuit. Nonetheless, multiunits and LFP are rarely incorporated into the same data matrix as we have done, so their relative contributions should be quantitatively evaluated rather than intuitively inferred.

Implications for novelty and memory
The main network characteristics we identified were not significantly different across the task sessions. This seems to suggest that these network dynamics reflect stable neural architectures as opposed to fluctuating cognitive states.
It is, however, possible that behavior modulates these network dynamics at a faster timescale than experiment sessions. Indeed, neural signatures of novelty processing may be transient, lasting only hundreds of ms (Ranganath and Rainer, 2003) or tens of seconds when first introduced to a novel environment (França et al., 2014) . For example, our camera tracking data (not reported here) revealed that animals tended to explore the objects for brief windows of time -sometimes only a few hundred ms. These windows may have been too brief for sufficient neural network estimation, and due to the novelty of the data analysis methods, we chose to focus on characterizing the neural networks using maximal data to ensure high data quality. This could be explored in future studies by ensuring that a particular behavior is expressed for a longer period of time. Distribution shape via kurtosis . Non-Gaussianity is considered an indicator of an information-rich signal. This comes from the central limit theorem, which leads to the assumption that random noise, and random linear mixtures of signals, will produce Gaussian distributions. We therefore quantified the kurtosis (4th statistical moment of a distribution; the kurtosis of a pure Gaussian distribution is 3) as a measure of the non-Gaussianity of the component time series. We computed kurtosis for the narrowband filtered signal and its amplitude envelope at each component.
Component time series kurtosis was computed as the 4th statistical moment of the component time series. We extracted kurtosis from both the real part of the narrowband signal and the amplitude envelope (extracted via the Hilbert transform). The amplitude