## Abstract

Information theoretic metrics have proven useful in quantifying the relationship between behaviorally relevant parameters and neuronal activity with relatively few assumptions. However, these metrics are typically applied to action potential (AP) recordings and were not designed for the slow timescales and variable amplitudes typical of functional fluorescence recordings (e.g., calcium imaging). The lack of research guidelines on how to apply and interpret these metrics with fluorescence traces means the neuroscience community has yet to realize the power of information theoretic metrics. Here, we used computational methods to create mock AP traces with known amounts of information. From these, we generated fluorescence traces and examined the ability of different information metrics to recover the known information values. We provide guidelines for how to use information metrics when applying them to functional fluorescence and demonstrate their appropriate application to GCaMP6f population recordings from mouse hippocampal neurons imaged during virtual navigation.

## Significance Statement

Functional fluorescence imaging and information theoretic quantification could provide a powerful new combination of tools to study neural correlates of behavior, but functional fluorescence signals represent altered versions of the underlying physiological events. Therefore, it is unclear whether or how information metrics can be applied to functional fluorescence imaging data. Here, we performed an in-depth simulation study to examine the application of the widely used bits per second and bits per action potential (AP) metrics of mutual information (MI) to functional fluorescence recordings. We provide guidelines for how to use information metrics when applying them to functional fluorescence and demonstrate their appropriate application to GCaMP6f population recordings from mouse hippocampal neurons imaged during virtual navigation.

## Introduction

Neurons encode parameters important for animal behavior, at least in part, through the rate of production of action potentials (APs). Evidence for this can be found from electrophysiological AP recordings of orientation tuning in the visual system (Hubel and Wiesel, 2009), chemical sensing in the olfactory system (Leveteau and MacLeod, 1966; Wachowiak and Shipley, 2006), and spatial encoding in the hippocampus (O’Keefe, 1976). Key to deciphering the neural code, therefore, is defining metrics to quantify the relationship between behavioral parameter spaces and a neuron’s spiking rate. Many metrics are used for this quantification, and they are often used to compare neural responses across conditions or in neurons with complex responses. The underlying assumptions of the different metrics then become important factors to consider when determining which one to use.

Information theory is growing in popularity in the neuroscience community, largely because it provides a means to quantify rate coding with relatively few assumptions. One useful information theoretic measure is mutual information (MI), which is typically measured in bits per unit time, and describes the increase in predictability of the neural response when behavioral parameters are known. Formally, MI is the information about one variable that can be extracted from another, such as the information about behavior that can be derived from observing neural activity. MI can be applied to neurons with widely varying response properties because it (1) is a nonlinear metric, not requiring the linearity assumptions of correlation metrics (Grubb and Thompson, 2003; Kropff et al., 2015; Hinman et al., 2016); (2) does not assume a response shape, as is typical with Gaussian field mapping metrics (Soo et al., 2011; Kraus et al., 2015; Tang, 2016) or metrics using exponential or polynomial curve fitting (Hinman et al., 2016); and (3) uses the full time trace or shape of the mean response profile, rather than defining receptive fields with thresholding (Niell and Stryker, 2008; Pastalkova et al., 2008; Harvey et al., 2009).

However, MI can be nontrivial to estimate from neural and behavioral recordings and its estimation is an ongoing area of research (Kraskov et al., 2004; Gao et al., 2017; Belghazi et al., 2018; Timme and Lapish, 2018).

Here, we focus on the most widely used estimator of MI in neuroscience, the SMGM estimator developed by Skaggs, McNaughton, Gothard, and Markus (Skaggs et al., 1993), although as a point of comparison, we also consider the binned estimator (Timme and Lapish, 2018) and a separate technique developed by Kraskov, Stogbauer, and Grassberger (KSG; Kraskov et al., 2004). The binned estimator estimates the joint probability distribution using a 2D histogram of neural response versus behavioral variable; this transforms continuous variables into discrete values (Timme and Lapish, 2018). KSG estimates MI by examining the distance between data-points in the neural activity-behavioral parameter space. The SMGM estimator, on the other hand, relies on the assumption that AP firing follows an inhomogeneous Poisson process. The SMGM estimator therefore requires binning of only the behavioral variable(s), in contrast to the binned estimator. The profile of firing rates versus behavioral variable is then used to estimate the MI.
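As an illustration of how the rate profile is turned into an information value, the following is a minimal Python sketch of the standard SMGM formula, in which the information rate is the occupancy-weighted sum of λ_i log2(λ_i/λ̄) over behavioral bins and the bits per AP value is that rate divided by the mean firing rate λ̄. The function name `smgm_information` and its arguments are hypothetical and are not part of the MATLAB toolbox; the sketch assumes the rate map and occupancy have already been binned.

```python
import math

def smgm_information(rate_map, occupancy):
    """SMGM information from a binned rate map (illustrative sketch).

    rate_map:  mean firing rate (Hz) in each bin of the behavioral variable
    occupancy: time spent (s) in each bin
    Returns (bits_per_second, bits_per_ap).
    """
    total_time = sum(occupancy)
    # Probability of occupying each behavioral bin
    p = [t / total_time for t in occupancy]
    # Occupancy-weighted mean firing rate
    mean_rate = sum(pi * r for pi, r in zip(p, rate_map))
    if mean_rate <= 0:
        return 0.0, 0.0
    # Bins with zero rate or zero occupancy contribute nothing (0 * log 0 := 0)
    bits_per_second = sum(
        pi * r * math.log2(r / mean_rate)
        for pi, r in zip(p, rate_map)
        if pi > 0 and r > 0
    )
    return bits_per_second, bits_per_second / mean_rate
```

For example, a neuron that fires at 8 Hz in one of four equally occupied bins and is silent elsewhere has a mean rate of 2 Hz and carries 4 bits per second, or 2 bits per AP, which equals log2 of the number of bins.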

The relative simplicity of the SMGM estimator has added to its popularity and widespread use in neuroscience applications for estimating behavioral information contained in single unit AP recordings. This metric has proven useful in quantifying rate coding in place cells (Knierim et al., 1995; Markus et al., 1995; Lee et al., 2006; Poucet and Sargolini, 2013), complex spatial responses of hippocampal interneurons (Frank et al., 2001; Wilent and Nitz, 2007), odor sequence cells (Allen et al., 2016), time cells (MacDonald et al., 2013), head direction cells (Stackman and Taube, 1998), speed cells (Fyhn et al., 2002), and face differential neurons (Nguyen et al., 2013, 2014), and has been used across multiple different species (Yartsev and Ulanovsky, 2013; Hazama and Tamura, 2019; Mankin et al., 2019). Furthermore, as a single neuron metric, it provides statistical power for comparisons. Thus, it has been used to quantify differences in rate coding across different brain regions (Simonnet and Brecht, 2019) and across experimental interventions such as lesions (Calton et al., 2003; Liu et al., 2004), inactivations (Huang et al., 2009; Koenig et al., 2011; Brandon et al., 2011; Hok et al., 2013), and applications of drugs (Robbe and Buzsáki, 2009; Newman et al., 2014). Further, it has been used to examine differences in encoding across different behaviors (Zinyuk et al., 2000; Park et al., 2011; Aronov and Tank, 2014) and disease states (Zhou et al., 2007; Gerrard et al., 2008; Fu et al., 2017). SMGM information is often normalized from measuring bits per unit time to instead measure bits per AP. This creates a measure sensitive only to the selectivity of a neuron, and not its average firing rate. Thus, SMGM is a powerful tool for measuring the neural code in electrophysiological recordings of APs.

The power of MI estimators has yet to be fully exploited by the neuroscience community. For example, the estimators have not yet been widely used to compare encoding properties of large numbers of genetically identified neurons, or to quantify information content of other discrete signaling events such as synaptic inputs; both of which are difficult to study using electrophysiological methods. *In vivo* imaging of functional indicators has emerged as an important tool, largely because it possesses these capabilities. For example, using fluorescent calcium indicators, the functional properties of large populations of neurons can be simultaneously recorded in rodents (Dombeck et al., 2007; Ziv et al., 2013; Stirman et al., 2016; Sheffield et al., 2017; Radvansky and Dombeck, 2018; Stringer et al., 2019), zebrafish (Ahrens et al., 2013), or invertebrates such as *Caenorhabditis elegans* (Nguyen et al., 2016) and *Drosophila* (Keller and Ahrens, 2015; Mann et al., 2017). Furthermore, *in vivo* imaging can assure the genetic identity of the recorded neurons (Khoshkhoo et al., 2017; Sheffield et al., 2017; Jing et al., 2018a,b,c) and can access subcellular structures, allowing for functional recordings from synapses and dendrites using different functional fluorescent indicators (Sheffield and Dombeck, 2015; Scholl et al., 2017; Sheffield et al., 2017; Jing et al., 2018d; Marvin et al., 2018, 2019; Adoff et al., 2021).

However, these indicators generate signals that are different from the underlying quantal events. For example, somatic calcium indicators reveal intensity variations that are correlated with somatic AP firing rates but are a smoothed and varying amplitude version of the AP train. This transformation from AP train to fluorescence trace is an active area of research (Dana et al., 2018; Greenberg et al., 2018; Éltes et al., 2019), but it is often approximated by convolving the AP train with a kernel, which defines the indicator’s response to a single AP. The shape of the kernel is a function of the indicator expression level, intracellular calcium buffering, amount of calcium influx, efflux rates, background fluorescence, resting calcium concentration, and other factors. When measured in pyramidal neurons, average kernels typically take the shape of a sharp increase in fluorescence followed by an exponential decay to baseline (Yaksi and Friedrich, 2006; Chen et al., 2013; Park et al., 2013; Dana et al., 2018; Pachitariu et al., 2018). Therefore, while functional fluorescence imaging and information theoretic quantification may prove to be a powerful new combination of tools to study neural correlates of behavior, it is critical to remember that functional fluorescence signals represent altered versions of the underlying physiological events.

Caution is then needed when applying information metrics to continuous functional fluorescence traces, yet the imaging community is already beginning to use information metrics, particularly SMGM. This metric has been applied to somatic calcium responses to compare the information content of the same neurons across different behavioral epochs (Heys and Dombeck, 2018), across different populations of neurons in different brain regions (Hainmueller and Bartos, 2018), across different genetically identified neural populations (Khoshkhoo et al., 2017), or to examine encoding by subcellular structures (Rashid et al., 2020), or to classify the significance of encoding particular parameters by individual neurons (Kinsky et al., 2018; Mau et al., 2018; Rashid et al., 2020).

However, it is essential to recognize that some of the assumptions underlying these information metrics are violated by functional fluorescence recordings. All three metrics (SMGM, KSG, and binned estimation) assume stationarity in the neural response, which is violated by the elongated time responses and relatively slow fluctuations of the fluorescence intensity of the reporters. Relative to spiking data, there is also a change in units: rather than AP counts, functional fluorescence traces are typically plotted in units of fluorescence change with respect to baseline (ΔF/F). One possible solution to these issues would be to deconvolve calcium traces to recover APs; however, deconvolution is an active area of research, and the accuracy of these methods has recently been questioned (Evans et al., 2019). Ideally, the calcium traces could be used directly to measure spiking information, without the need for such an intermediate, potentially error-inducing step.

Quantifying the effects of the above violations on measurements of information using functional fluorescence recordings with an analytical solution is particularly challenging with behaviorally modulated neural recording data. However, a more tractable means of quantifying the effects would be to use a simulation study to measure the induced biases and changes in measurement quality (Morris et al., 2019). This strategy makes use of pseudo-randomly generated AP traces and has the advantage that the ground truth parameters of the simulations are known, while variability because of behavior and other features can be incorporated (Cohen and Kohn, 2011; Climer et al., 2013, 2015; Østergaard et al., 2018).

To provide the field with guidelines for the use of information metrics applied to functional fluorescence recording data, we used computational simulation methods to create a library of 10,000 mock neurons whose spiking output carries an exact, known (ground-truth) amount of information about the animal’s spatial location in its environment. We used real behavioral data (available at https://doi.org/10.7910/DVN/SCQYKR) of spatial position over time from mice navigating in virtual linear tracks and then simulated the spatial firing patterns of the mock neurons using an inhomogeneous Poisson process framework (Brown et al., 2003; Paninski, 2004; Climer et al., 2013). We then simulated fluorescent calcium responses for each neuron in each session by convolving the AP trains with calcium kernels for different indicators, primarily GCaMP6f (Chen et al., 2013), and then we added noise. MI metrics (between spatial location and the neural signals) were then applied to the spiking or fluorescence traces to quantify the performance of the metrics for estimating information. We provide a user toolbox (found at https://github.com/DombeckLab/infoTheory), which consists of MATLAB functions to generate libraries of model neurons with known amounts of information, to generate spiking or fluorescence time-series from those model neurons, and to estimate neuron information from real or model spiking or fluorescence time-series datasets using the three metrics considered here (SMGM, binned estimator, KSG). We focused on testing the performance of the SMGM method, and then compared its performance to the binned estimation and KSG methods, which do not have the underlying Poisson assumption required for the SMGM approach. We also applied a deconvolution algorithm to test its performance. We then applied this analysis to real datasets of hippocampal neuron populations from mice navigating in virtual linear tracks. We quantified the spatial information content of the populations and then performed Bayesian decoding of mouse position from different information containing subsets of this population. Interestingly, we found that the population quantile with the lowest information values was still able to decode mouse position to the closest quarter of the track. Thus, we provide new findings about the neural code for space that were made possible by the information metrics and guidelines that we introduce here.

The SMGM method applied directly to the mean ΔF/F intensity map appeared to best recover the ground truth information. We provide guidelines for the use of the SMGM metric when applied to functional fluorescence recordings and demonstrate the appropriate application of these guidelines to GCaMP6f population recordings from hippocampal neurons in mice navigating virtual linear tracks.

## Materials and Methods

### Toolbox and data availability

We provide a user toolbox (freely available at https://github.com/DombeckLab/infoTheory), which consists of MATLAB functions to generate libraries of model neurons with known amounts of information, to generate spiking or fluorescence time-series from those model neurons, and to estimate neuron information from real or model spiking or fluorescence time-series datasets using the three metrics considered here (SMGM, binned estimator, KSG). This toolbox also contains tools to generate mock neurons using a binned distribution, avoiding the Poisson assumption of SMGM. Behavioral data used to generate the random traces is freely available at https://doi.org/10.7910/DVN/SCQYKR.

### Construction of AP trains with known ground truth information

To construct mock neurons with ground truth information, we adapted the differential form of the AP information, in bits per AP (Eq. 6). To create a rate map, we first selected an average firing rate and target ground truth information. The mean rate (*C*) was spread acceptably for further analysis.

### Extended Data Figure 2-1

Effect of recording density on information metrics. *A*, The mean error (top) and absolute error (bottom) between the ground truth information and the measured information. *B*, As *A*, but as a function of number of laps. *C*, *D*, As *A*, *B*, but for measurements from mock fluorescence traces. *E*, *F*, As *A*, *B*, but for the SMGM bits per AP metric. *G*, *H*, As *C*, *D*, but for the SMGM bits per AP metric.

### Extended Data Figure 2-2

Effects of applying the SMGM bits per second metric to fluorescence traces from different common functional indicators. Top, Information measured from mock traces using the SMGM bits per second metric.

### Extended Data Figure 3-1

Effects of applying the SMGM bits per AP metric to fluorescence traces from different common functional indicators. Top, Information measured from mock traces using the SMGM bits per AP metric.

### Extended Data Figure 3-2

Effect of number of bins on the SMGM metrics. *A*, The mean percentage error (top) and the SD (bottom) for the bits per second measure, with 2–60 bins (3-m track) on the *y*-axis. *B*, As *A*, for the bits per AP measure. *C*, The mean percentage error (top) and the SD (bottom) for the bits per second measure; number of bins is on the *y*-axis. *D*, As *C*, but for the bits per AP measure.

### Extended Data Figure 3-3

Effects of a sigmoid nonlinearity in the AP-to-fluorescence transformation. *A*, Nonlinearity applied to AP-to-fluorescence trace transformation. *B*, Information measured from AP data using the SMGM bits per second metric. *C*, Percentage error for the information measurements shown in *B*. *D*, Heat map of percentage error measurements shown in *C*. Black lines are 2 SDs, the white line is the mean. *E–G*, As *B–D*, but for the bits per AP measure.

### Extended Data Figure 4-1

Effect of changing the regularization coefficient in deconvolution (Vogelstein et al., 2010; Friedrich et al., 2017) on the measured information. *A*, The measured information using the fluorescence bits per second metric applied to 10,000 mock GCaMP6f traces using regularization coefficients between 0 and 3. *B*, As *A*, but for the fluorescence bits per AP metric.

### Extended Data Figure 4-2

The binned estimator applied to AP traces and then compared to ground truth information. Top, Information measured from mock AP traces versus ground truth information. The gray line is the unity line, the pink line is the best fit saturating exponential. Middle, Percentage error for the information measurements shown on top (same scale as shown in Figure 4). Bottom, Heat map of percentage error measurements shown in middle. *A*, The binned estimator applied to AP traces using uniform bins. *B*, The binned estimator applied to AP traces using equal occupancy bins.

### Extended Data Figure 4-3

Information for neurons with Gaussian rate maps. *A*, blue, Information measured from mock GCaMP6f traces using the SMGM bits per second metric. *B*, Percentage error for the information measurements shown in *A*. *C*, blue, Information measured from mock GCaMP6f traces using the SMGM bits per AP metric. *D*, Percentage error for the information measurements shown in *C*.

### Extended Data Figure 5-1

The SMGM estimators as applied to real AP data from a real spiking dataset from hippocampal neurons in mice running on a behavioral track (Chen et al., 2016; Grosmark and Buzsaki, 2016; Grossmark et al., 2016). *A*, Example real place cell. From top to bottom, Rat track position versus time, real AP raster, mock fluorescence calcium trace generated from real AP trace by convolving APs with GCaMP6f kernel and adding noise (green), and firing rate map. *C*, The SMGM bits per second metric applied to the real AP traces. *D*, The percentage difference for the SMGM bits per second metric applied to the real AP traces. *E*, Density plot for the data shown in *D*. *F–H*, As *C–E*, but for the bits per AP metric.

The rate maps were constructed by spline interpolating across five control points with two anchored at each end of the track, and taking the exponential for each point, and then normalizing by the numerically calculated integral (Fig. 1*A*,*D*). To create a map matching the target information, we began with a random spline. The “y” (relative rate) initial position of each node was chosen from a standard normal distribution and the initial “x” (track position) of the three center nodes was chosen uniformly. The nodes were then systematically moved using the MATLAB built in optimizer ‘fmincon’ with constraints preventing the crossing of the center nodes and keeping them on the track, and the ‘OptimalityTolerance’ option set to 0 (Fig. 1*A*). This was accomplished using the ‘genExpSpline’ function in the toolbox.

We then randomly selected behavioral traces (see below, Behavior) and concatenated sessions until a total time randomly chosen between 3 and 60 min was reached (Figs. 1*E*,*F*, 2*A*, 3*A*). This was accomplished using the ‘loadBehaviorT’ function in the toolbox. The track positions were normalized and used to build a conditional intensity function (CIF) from the rate function above. The CIF was normalized to match an expected mean rate over the entire session, and the MATLAB built-in ‘poissrnd’ function was used to generate AP times, sampled at 1 kHz. This was accomplished using the ‘genSpikeTrain’ function in the toolbox. Finally, the AP times were binned according to the counts within mock imaging frames sampled at 30 Hz.
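The spike-generation step can be sketched in Python as follows. This is an illustrative approximation rather than the toolbox's ‘genSpikeTrain’ implementation: the function names, the discretized rate map, and the use of Knuth's method in place of ‘poissrnd’ are all assumptions made for the example.

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw one Poisson count with mean lam (Knuth's method, suited to small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_ap_counts(position, rate_map, mean_rate_hz,
                       fs=1000, frame_rate=30, seed=0):
    """Inhomogeneous Poisson AP counts driven by position (illustrative sketch).

    position:     normalized track position in [0, 1) for each sample at fs Hz
    rate_map:     relative firing rate in each spatial bin (hypothetical discretization)
    mean_rate_hz: expected mean firing rate over the whole session
    Returns (counts per 1/fs sample, counts per imaging frame).
    """
    rng = random.Random(seed)
    n_bins = len(rate_map)
    # Conditional intensity: look up the relative rate at each position sample
    cif = [rate_map[min(int(x * n_bins), n_bins - 1)] for x in position]
    mean_cif = sum(cif) / len(cif)
    if mean_cif <= 0:
        raise ValueError("rate map is zero everywhere the animal visited")
    # Normalize so the session-wide expected rate equals mean_rate_hz
    scale = mean_rate_hz / mean_cif
    counts = [poisson_sample(c * scale / fs, rng) for c in cif]
    # Re-bin the high-rate counts into imaging frames
    n_frames = math.ceil(len(counts) * frame_rate / fs)
    frames = [0] * n_frames
    for i, c in enumerate(counts):
        frames[min(int(i * frame_rate / fs), n_frames - 1)] += c
    return counts, frames
```

Because the frame counts are sums of the 1-kHz counts, the total number of APs is preserved by the re-binning step.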

### Simulated ΔF/F traces

To construct the ΔF/F traces (Figs. 1*E*,*J*,*K*, 2*A*, 3*A*), we first created a single AP response kernel from the peak-normalized sum of two exponentials, where *t* is the time since the AP and *a* and *b* are chosen to minimize

The GCaMP6f, GCaMP6s, and jRGECO1a heights and rise and fall times were measured as responses to single APs *in vivo* (Kalko et al., 2011; Chen et al., 2013; Dana et al., 2019); other kernels (Fig. 2*H*; Extended Data Figs. 2-2, 3-1) were approximated from other experiments presented in the references (see Table 1).
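The convolution of AP counts with a single-AP kernel can be sketched in Python as below. The kernel form used here, a decaying exponential minus a faster rising exponential scaled to a chosen peak height, is an assumption standing in for the kernel equation, and the time constants and height in the usage example are placeholders rather than measured indicator values. The function names are hypothetical.

```python
import math
import random

def double_exp_kernel(tau_rise, tau_decay, height, frame_rate=30.0, duration=3.0):
    """Peak-normalized two-exponential kernel sampled at the imaging frame rate.

    Assumed form: exp(-t / tau_decay) - exp(-t / tau_rise), with tau_rise < tau_decay,
    rescaled so the maximum equals `height` (in units of dF/F).
    """
    n = int(duration * frame_rate)
    raw = [math.exp(-(i / frame_rate) / tau_decay) - math.exp(-(i / frame_rate) / tau_rise)
           for i in range(n)]
    peak = max(raw)
    return [height * v / peak for v in raw]

def simulate_fluorescence(ap_counts, kernel, noise_sd=0.0, seed=0):
    """Linear sum of one kernel per AP, plus Gaussian white noise."""
    rng = random.Random(seed)
    trace = [0.0] * len(ap_counts)
    for i, count in enumerate(ap_counts):
        if count == 0:
            continue
        for j, k in enumerate(kernel):
            if i + j >= len(trace):
                break  # kernel runs past the end of the recording
            trace[i + j] += count * k
    return [v + rng.gauss(0.0, noise_sd) for v in trace]

# Usage with placeholder parameters (not measured indicator values)
kernel = double_exp_kernel(tau_rise=0.05, tau_decay=0.4, height=0.2)
trace = simulate_fluorescence([0, 1, 0, 0, 2] + [0] * 95, kernel, noise_sd=0.15)
```

In this linear sketch, two APs in the same frame produce exactly twice the single-AP response.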

To define the width of the kernel (Figs. 2*L–N*, 3*K–M*), we considered the kernel as a low pass filtered version of the APs. If we normalize the filter to mean 1, it has the Fourier transform

White noise with a SD of 0.15

### Nonlinearity

In our linear simulations used throughout this work, the fluorescence kernels associated with a fast sequence of APs were approximated to sum linearly. In real cultured neurons, a summation nonlinearity has been observed such that sequences of APs do not generate a linear summation in ΔF/F (Dana et al., 2019). To simulate this nonlinearity, the

This equation was arrived at by fitting the measured responses in Dana et al. (2019; their Fig. 2*C*), which can be compared with the nonlinearity used here (Extended Data Fig. 3-3*A*).

### Deconvolution

Deconvolution was performed using the previously described FOOPSI algorithm (Vogelstein et al., 2010; Friedrich et al., 2017). The regularization coefficient was set at 0.02154, which maximized the correlation between the deconvolved trace and the true spike train in a random sample of 500 simulated traces; all other parameters were optimized for each trace. Because the example regularization coefficient provided by Friedrich et al. (2017) was 2.4, we also measured information values at 100 different values for the regularization coefficient between 0 and 3; this had little effect on the measured information (Extended Data Fig. 4-1).

### KSG estimator

The previously described second KSG estimator (Kraskov et al., 2004) was applied using the fifth nearest neighbor distance.

### Binned estimators

The binned MI estimators were used (Timme and Lapish, 2018). The activity trace was divided into 10 bins, either evenly across the span of the activity (uniform binned) or variably so the bins contained the same number of samples (occupancy binned). Position was similarly divided into 60 bins.
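The two binning schemes and the plug-in MI calculation can be sketched in Python as follows. This is a minimal illustration of a histogram-based MI estimate in bits per sample, not the toolbox code; the function names are hypothetical, and ties in the occupancy binning are broken by sort order, which is an assumption of this sketch.

```python
import math
from collections import Counter

def uniform_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width bins spanning the data range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def occupancy_bins(values, n_bins):
    """Assign values to n_bins bins holding (nearly) equal numbers of samples."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // len(values)
    return labels

def binned_mi(x_labels, y_labels):
    """Plug-in mutual information (bits per sample) from the joint 2D histogram."""
    n = len(x_labels)
    joint = Counter(zip(x_labels, y_labels))
    px = Counter(x_labels)
    py = Counter(y_labels)
    # Sum over occupied cells of p(x, y) * log2(p(x, y) / (p(x) p(y)))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())
```

With the settings described above, the activity trace would be passed through `uniform_bins(activity, 10)` or `occupancy_bins(activity, 10)` and position through `uniform_bins(position, 60)` before calling `binned_mi`.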

### Gaussian simulations

To compare the analytic approximation to our numerical method, the numerical techniques had to be applied to place cells with Gaussian rate maps. The same target information, firing rates, and behavior were used as for our original 10,000 simulations with spline rate maps. However, instead the rate map was chosen as a Gaussian with width

### Bayesian decoding

The Bayesian decoder used here (Fig. 5*G*,*H*) was adapted from a previously described method (Zhang et al., 1998). Decoding was performed on the likelihood that a significant transient occurred in a time frame, trained on the first 80% of the session and tested on the last 20%. The session was divided into

### Animals

Ten- to 12-week-old male C57BL/6 mice (20–30 g) were individually housed under a reverse 12/12 h light/dark cycle; all experiments were conducted during the dark phase. All experiments were approved by the Northwestern University Animal Care and Use Committee.

### Behavior

We used a previously described virtual reality set-up and task (Heys et al., 2014; Sheffield and Dombeck, 2015; Sheffield et al., 2017); some of the behavior sessions used here have previously appeared in these studies. Briefly, water-scheduled, head-fixed mice were trained to run on a cylindrical treadmill down a 3-m virtual track to receive a water reward (4 μl) at the end of the track and were subsequently teleported to the beginning of the track after a 1.5-s delay. Behavioral sessions were included if the animal ran at least 20 laps containing a continuous 40-cm run for which the velocity was over 7 cm/s during a 5- to 30-min session.

### Mouse surgery and virus injection

We performed population calcium imaging of CA1 neurons as described previously (Sheffield and Dombeck, 2015; Sheffield et al., 2017). Briefly, 30 nl of AAV1-Syn-GCaMP6f (University of Pennsylvania Vector Core, 1.5 × 10^{13} GC/ml) was injected through a small craniotomy over the right hippocampus (1.8 mm lateral, 2.3 mm caudal of bregma; 1.25 mm below the surface of the brain) under isoflurane (1–2%) anesthesia. Seven days later, a hippocampal window and head plate were implanted as described previously (Dombeck et al., 2010).

### Two-photon imaging

Imaging was performed as previously described (Sheffield and Dombeck, 2015; Sheffield et al., 2017). ScanImage 4 was used for microscope control and acquisition (Pologruto et al., 2003). Time series movies (1024 or 512 × 256 pixels) were acquired at 50 Hz. A Digidata1440A (Molecular Devices) with Clampex 10.3 synchronized position on the linear track, reward timing, and the timing of image frames.

### Image processing, region of interest (ROI) selection, and calcium transient analysis

Images were processed as previously described (Sheffield and Dombeck, 2015; Sheffield et al., 2017), with minor modifications. Briefly, rigid motion correction was performed using cross-correlation as in (Dombeck et al., 2010; Miri et al., 2011; Sheffield and Dombeck, 2015), but here using a fast Fourier transform approximation on the full video. ROIs were defined as previously described (Mukamel et al., 2009; μ = 0.6, 150 principal/independent components, SD threshold = 2.5, SD smoothing width = 1, area limits = 100–1200 pixels).

### Behavior analysis

The mean virtual track velocity was defined as the total virtual track distance covered during the session divided by the total duration of the session; slow and stop periods were included in this metric. All other analyses were restricted to long running periods, where the animal exceeded a virtual track velocity of 4 cm/s and ran continuously for at least 40 cm.

### Defining place fields

Place fields were defined by first creating the spatial fluorescence intensity map (

## Results

### The SMGM information metrics

Here, we review the derivation of the SMGM information metrics and the underlying assumptions. For illustrative purposes throughout this manuscript, we use the example of spatial encoding in which the firing pattern of neurons carries information about the animal’s location along a linear track; however, the derivations, equations, and conclusions generalize to encoded variables over other domains and dimensionalities.

Consider a random variable *X* representing the positions an animal might take, with *x* being its value measured at one time sample. The positions are subdivided into *N* spatial bins, such that *x* can take on the values

where

In the derivation of these metrics, there are two key assumptions that are violated by functional fluorescence recordings. First, the recordings do not follow Poisson statistics: instead of discrete counts of APs (*y*), the functional fluorescence traces consist of a continuous relative change in fluorescence (

### Building a ground truth library of 10,000 neurons with known values of information

To create a neuron with a known, ground truth information value, it was necessary to generate a continuous (i.e., infinitesimally small bins) rate map (*A*) to build a starting continuous map of the normalized instantaneous firing rate, *A*,*B*), in the end resulting in a mean error of 5.1 × 10^{−9} bits/AP and a mean absolute error of 1.5 × 10^{−7} bits/AP. The rate map at this convergence point was used for further analysis. This procedure was repeated to generate 10,000 mock neurons with a range of (known and ground truth) information values. Note that the value in Equation 5 cannot be higher than when all the APs arrive in one spatial bin; the rate in that bin is *C*). We chose a mean firing rate *C*). Example low (*D*.

These rate maps provided a basis for generating mock AP firing data (and functional fluorescence data, see below). Under real experimental conditions, recording duration and bin sizes are finite and animal occupancy maps (*E1*). This behavior, the average firing rate (*E2*), sampled at 1 kHz, from which AP times were generated assuming Poisson firing statistics (Fig. 1*E3*). An example mock of spiking in response to behavior for low (0.04 bits/AP) and mid (2 bits/AP) information neurons can be seen in Figure 1*F–H*. From these spiking responses, we then generated mock fluorescence traces by convolving the raster with a double-exponential kernel matching the rise and fall times for GCaMP6f (Chen et al., 2013; Fig. 1*E4*) and adding random Gaussian noise to model shot noise. Mock fluorescence traces for the two example neurons in Figure 1*F–H* can be seen in Figure 1*I*,*J*. The mock AP and fluorescence traces were used to create session mean spatial maps, of binned firing rate (*K*). By repeating this process, we built a large dataset of spiking and fluorescence traces, generated from our library of mock neurons with known amounts of information and using real animal spatial behavior. With tens of thousands of these mock neuron recordings, we could then assess the effects of many simulation parameters on the information values determined from the metrics including firing rate, session duration, fluorescence kernel shape, and ground truth information value.

### Quantification of the accuracy and precision of the SMGM bits per second metric using functional fluorescence recordings

We first applied the SMGM bits per second metric (*A* shows three mock neurons with ground truth information values of *B–D*), as a linear fit (*p* = 4.6 × 10^{−6} bits per second and *p* ≪ 0.01) explained nearly all the variance (*R*^{2} = 0.97), the average error was *A*,*B*) which has been previously well characterized (Treves and Panzeri, 1995), with average errors exceeding +10% for <6 min of recording, mean rate under 0.6 Hz, and under 11 trials. Thus, the SMGM bits per second metric (

We next discuss the changes to the SMGM bits per second metric (*E4*). If we discount the latter for a moment and focus on the scaling,

We applied a GCaMP6f-modeled kernel to the 10,000 mock AP traces to generate 10,000 mock fluorescence calcium traces. Figure 2*A* shows the fluorescence traces generated from three mock neurons with ground truth information values of *E–G*), as there was a clear scaling of the ground truth information and a consistent underestimation with a mean error of *p* = 0.07). The slope of this fit was *p* ≪ 0.01), which provides a measure of the scaling factor (*c*). This error was not corrected for with denser sampling: it remained consistent even at high firing rates and many trials (Extended Data Fig. 2-1*C*,*D*). In addition to this scaling effect caused by *c*, smoothing of the rate map could induce nonlinearity in the relationship between and *E* with a saturating exponential and compared the fits using a likelihood ratio test: the exponential did not significantly improve the fit (*p* = 0.76), which indicates that smoothing by the kernel does not induce significant nonlinearities. The factor *c* is dependent on the height and width (the integral) of the kernel and was measured here as 0.039 ± 0.0012 *E*) would be easy to correct for assuming the *c* factor, and therefore the kernel, were similar across all measured neurons. This point is considered further below in the Guidelines for application of information metrics to functional fluorescence imaging data. We conclude that ground truth information, as measured by the fluorescence SMGM bits per second metric (
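The scaling argument can be made concrete with a short derivation, sketched here under an idealization: the fluorescence map is taken to be an exact multiple of the AP rate map, ignoring kernel smoothing, delay, and noise. The symbols are introduced for illustration only: $p_i$ is the occupancy probability of spatial bin $i$, $\lambda_i$ the mean AP rate in that bin, $\bar{\lambda} = \sum_i p_i \lambda_i$ the overall mean rate, and $f_i = c\,\lambda_i$ the idealized fluorescence map with mean $\bar{f} = c\,\bar{\lambda}$.

$$
I^{f}_{\mathrm{sec}} = \sum_i p_i\, f_i \log_2\frac{f_i}{\bar{f}}
= \sum_i p_i\, c\lambda_i \log_2\frac{c\lambda_i}{c\bar{\lambda}}
= c \sum_i p_i\, \lambda_i \log_2\frac{\lambda_i}{\bar{\lambda}}
= c\, I_{\mathrm{sec}}
$$

Under this assumption the bits per second value measured from fluorescence is the AP value multiplied by *c*, so a *c* near 0.039 would correspond to a measured value of roughly 4% of the ground truth, expressed in units that are no longer bits per second in the AP sense.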

The amplitude (height) of the change in fluorescence can vary across indicators and conditions. The height of the kernel, given a constant kernel width, should linearly scale *c* and the error in estimating information with *I–K*), but that maintain the same shape and width (from the GCaMP6f kernel), and then measured the percent error in estimating information with *E–G*), the percent error in estimating information with *I–K*). However, as a function of the height of the kernel, the percent error (averaged over all ground truth information values) in estimating information with *p* ≪ 0.01, slope 20.7 ± 0.14%/*p* ≪ 0.01, *I*,*J*). Over the wide array of available functional fluorescent indicators in use today (Fig. 2*H*), this leads to differences in error that arise solely from differences in the transient height of the indicator used. For the indicators shown in Figure 2*H*, there is an average height of 0.603 ± 0.10 SD

The width of the kernel can vary widely across fluorescent indicators (Fig. 2*H*), with “faster” indicators boasting shorter rise and fall times. The combined effect of a longer rise and fall time is to smooth and delay the AP train; in other words, it acts as a causal low-pass filter. The cutoff period of this low-pass filter provides a measurement of the effective width of the kernel (see Materials and Methods). The effect of such differences in kernel shape on the error in estimating information with *E–K*), the percent error in estimating information with *N*). Interestingly, the percent error (averaged over all ground truth information values) in estimating information with *L*,*M*). The error increases up to a kernel width of ∼3 s, at which point it saturates at approximately −85% error. This arises from an interaction between changing the average value of the original AP trace and flattening the average fluorescence map (*B–G*. The resulting distributions, estimated *c* values, and mean and absolute errors can be seen in Extended Data Figure 2-2. In summary, we conclude that information, as measured by the fluorescence SMGM bits per second metric (

### Quantification of the accuracy and precision of the SMGM bits per AP metric using functional fluorescence recordings

The SMGM metric is commonly normalized by the mean rate to obtain a measurement in units of bits per AP. We thus applied the SMGM bits per AP metric (*A* shows three mock neurons with ground truth information values *B–D*), as a linear fit (*p* = 2.8e-184 bits per second and slope = 0.93 ± 0.0010, slope *p* ≪ 0.01) explained nearly all the variance (*R*^{2} = 0.99), the average error was −0.071 ± 0.23 bits/AP (3.2 ± 5.9% error) and the absolute error was 0.13 ± 0.21 bits per second (8.1 ± 9% error). However, the data were better fit with a saturating exponential (*p* ≪ 0.01) converging to 5.8 bits/AP as it approached the limit because of the finite bin count. There is a substantial positive bias for the lowest firing rates and smallest number of trials (Extended Data Fig. 2-1*E*,*F*), which has been previously well characterized (Treves and Panzeri, 1995). Thus, the SMGM bits per AP metric (
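The normalization by mean rate can be illustrated with a small sketch that computes both metrics from a binned rate map and an occupancy map. It assumes the standard form of the metric, in which the bits per second value is the occupancy-weighted sum of rate times the base-2 log of rate over mean rate; the function name `smgm_information` and its arguments are hypothetical and are not taken from the Toolbox.

```python
import numpy as np

def smgm_information(rate_map, occupancy):
    """Return (bits per second, bits per AP) for a binned map (sketch).

    rate_map  : mean rate (or mean intensity) in each spatial bin
    occupancy : time spent in each bin (normalized internally)
    """
    rate_map = np.asarray(rate_map, dtype=float)
    p = np.asarray(occupancy, dtype=float)
    p = p / p.sum()                            # occupancy probability
    mean_rate = np.sum(p * rate_map)           # overall mean rate
    ratio = rate_map / mean_rate
    terms = np.zeros_like(rate_map)
    nz = ratio > 0                             # bins with zero rate contribute 0
    terms[nz] = p[nz] * rate_map[nz] * np.log2(ratio[nz])
    bits_per_sec = terms.sum()
    bits_per_ap = bits_per_sec / mean_rate     # normalize by mean rate
    return bits_per_sec, bits_per_ap
```

Multiplying the input map by a constant changes the first returned value by that constant and leaves the second unchanged, which is why the normalized metric is insensitive to pure amplitude scaling of the map.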

We next discuss the changes needed to apply the SMGM bits per AP metric (

As discussed above, the fluorescence map (

We then applied the fluorescence SMGM bits per AP metric (*A* shows the fluorescence traces generated from three mock neurons with ground truth information of *E–G*). At low information values, there was little bias, but at higher information values the information recovered was substantially lower than the ground truth information. The mean resulting error was −0.38 ± 0.58 bits/AP (−9.7 ± 27.8%) and absolute error of 0.39 (12.9 ± 26.4%). This error was better fit with a saturating exponential than a linear fit (*p* ≪ 0.01), with the average error <5% up to ground truth information of 1.8 bits/AP and <10% up to 3.0 bits/AP. At ground truth information values higher than 3 bits/AP, the average error was −1.06 ± 0.595 (−22.5 ± 9.44%) and absolute error was 1.07 ± 0.589 bits/AP (22.6 ± 9.21%). This error persisted even with denser sampling: it remained consistent even at high firing rates and many trials (Extended Data Fig. 2-1*E*,*F*). Thus, the indicator induces relatively little error at lower information values (<3 bits/AP), but the smoothing effect of the kernel induces a nonlinear, negative bias to the estimator, particularly at ground truth information values over 3 bits/AP.

Although the height of the kernel can vary between different functional fluorescence indicators (Fig. 2*H*), these height variations linearly scale the fluorescence map. Thus, since *H–J*). Unlike for the SMGM bits per second metric, the percent error (averaged over all ground truth information values) in estimating information with *p* = 0.43), but a nonlinear dependence on ground truth information as in Figure 3*E–G*, with no significant difference in the parameters of the saturating exponential fit (*p* = 0.43). Thus, as expected, the percent error in estimating information with the SMGM bits per AP metric (

With little effect of kernel height on *K–M*). Similarly, as observed above for GCaMP6f and the varying kernel height examples (Fig. 3*E–J*), the percent error in estimating information with *E–G*). The percent error (averaged over all ground truth information values) showed a nonlinear response as a function of the width of the kernel (Fig. 3*K*,*L*), with a steep increase in error for kernel widths >∼1 s. Even for kernel widths <∼1 s, the percent error was strongly dependent on the ground truth information value, with steep increases in error for values more than ∼2.5–3 bits/AP (Fig. 3*M*). Thus, as the kernel gets wider, the negative bias appears at progressively lower information values. The resulting errors are thus larger for wider kernel indicators. For example, with a kernel width the same as GCaMP6s (2.54 s), the error exceeds −17% even at low (<0.25 bits/AP) information, with average errors of −0.86 ± 1.0 bits/AP (−31 ± 19% error) and absolute errors of 0.87 ± 1.0 bits/AP (32.6 ± 16% error). In contrast, with a kernel width the same as iGluSnFR (0.52 s), the average error exceeded 5% at 3 bits/AP and 10% at 3.7 bits/AP with a mean error of *H*, taking into account differences in *both* height and duration, we used the five kernels to generate mock fluorescence traces from the 10k neurons in Figure 3*B–G*. The resulting distributions, mean and absolute errors, and error thresholds can be seen in Extended Data Figure 3-1.

Since the known information values in our library of 10,000 mock neurons were determined using the SMGM metric, which includes the assumption that neuron firing follows an inhomogeneous Poisson process, we next investigated whether the biases observed between AP and fluorescence metrics (

In summary, we conclude that ground truth information, as measured by the fluorescence SMGM bits per AP metric (

### Nonlinearity introduces further biases

The results presented in the previous two sections rely on the approximation that *A*) to the 10,000 mock GCaMP6f time-series traces described above, based on the real behavior of GCaMP6f in cultured neurons (Dana et al., 2019; see Materials and Methods). While the resulting measurements (Extended Data Fig. 3-3) of ground truth information, as measured by the fluorescence SMGM metrics, are largely consistent with the results observed when using the linear assumption (Figs. 2, 3), some quantitative difference can be seen. Thus, even a relatively simple nonlinearity between

### Deconvolution may not be sufficient to eliminate biases

The framework presented here for comparing ground truth information with information measured with the SMGM metrics can be extended to test the efficacy of other strategies for extracting MI. In particular, a perfect AP inference method would alleviate the problems associated with applying the SMGM metrics to functional fluorescence recordings. To test the utility of such a strategy in measuring information, we applied a popular deconvolution algorithm, FOOPSI (Vogelstein et al., 2010; see Materials and Methods), to the same 10,000 mock GCaMP6f time-series traces described above. Importantly, this deconvolution algorithm (and other available algorithms) does not recover traces of relative spike probability or exact spike times, but instead produces sparse traces with arbitrary units that have non-zero values estimating the relative “intensity” of spike production over time (*d*). This signal can be thought of as a scaled estimate of the number of spikes per time bin, and thus the average intensity map will have some similar properties to the fluorescence intensity maps: we would expect the intensity maps from deconvolution to approximate the relative firing rate scaled by some factor *c*, which has arbitrary units.

We then measured information in these deconvolved *d*-traces using the SMGM metrics (*A*), we found a clear scaling of the ground truth information. The scaling factor was very small (*E*; 96.0% absolute error, rank-sum *p* ≪ 0.01; *c* = 0.0390). It is worth noting that the deconvolved trace *d* can be arbitrarily scaled, so in a sense this error is arbitrary. However, these are the results from the scaling chosen by a widely used deconvolution algorithm, and the large error emphasizes that the scale of *d* can have a large effect on the bits per second measure (

Assuming that the intensity maps of the deconvolved *d*-traces are scaled versions of the true rate maps, we could measure information using the SMGM bits per AP metric *B*). Compared with the SMGM bits per AP metric applied to fluorescence (*R*^{2} = 0.93). However, information measured with *p* ∼ 0) converging to a saturation value of 5.51 bits/AP (compared with 5.78 for *p* ≪ 0.01]. Thus, when comparing the recovery of ground truth information from functional fluorescence traces using either direct application of the SMGM metrics (

### The KSG and binned estimators are poor estimators of MI in functional fluorescence data

In addition to SMGM, the KSG and binned estimation metrics have been developed for estimating MI between variables. These other two metrics produce information measured in bits per second, so they are only comparable to the SMGM bits per second estimator (

We applied the KSG, binned estimator (uniform bins), and binned estimator (occupancy binned; Materials and Methods) to the same 10,000 mock GCaMP6f time-series traces and behavioral data used to assess the SMGM approach. These methods all behaved similarly when applied to our simulations (Fig. 4*C–E*), so they will be discussed together here. The information values measured by these techniques correlated with ground truth information in bits per second (*p* ∼ 0). The KSG and binned estimator methods overestimated the information at lower ground truth (*F*). This is in comparison to the 0.35 ± 0.59 *D*,*E*) were caused when the estimators were applied to fluorescence data (rather than simply a difference between the binned estimators, which do not rely on a Poisson firing assumption, and the ground truth information established using SMGM, which does rely on a Poisson firing assumption). We found the errors when applying the binned estimators to AP traces were relatively small [mean absolute error 2.72 ± 3.38 bits per second (41.1%) and 2.70 ± 3.33 bits per second (41.0%) for the uniform and occupancy-based binning, respectively; Extended Data Fig. 4-2]. Therefore, when comparing the recovery of ground truth information from functional fluorescence traces using either the SMGM metric
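For readers unfamiliar with binned estimation, the following is a minimal sketch of a plug-in MI estimate from a two-dimensional histogram of position and signal. It is a generic illustration, not the implementation described in Materials and Methods: the function name `binned_mi_bits` and the default bin counts are assumptions, and the result is in bits per sample, so it would need to be multiplied by the sampling rate to be expressed in bits per second.

```python
import numpy as np

def binned_mi_bits(x, y, n_bins_x=60, n_bins_y=4):
    """Plug-in mutual information (bits per sample) from uniform bins.

    x, y : paired samples, e.g., position and fluorescence
    n_bins_x, n_bins_y : hypothetical bin counts, chosen for illustration
    """
    joint, _, _ = np.histogram2d(x, y, bins=(n_bins_x, n_bins_y))
    pxy = joint / joint.sum()                  # joint probability
    px = pxy.sum(axis=1, keepdims=True)        # marginal of x, shape (nx, 1)
    py = pxy.sum(axis=0, keepdims=True)        # marginal of y, shape (1, ny)
    nz = pxy > 0                               # skip empty cells (0 log 0 = 0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))
```

Because this estimator makes no use of a rate map or a Poisson assumption, it treats a continuous fluorescence amplitude as just another variable to discretize, so its output depends directly on the binning choices.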

### An analytic approximation can reproduce some qualitative, but not quantitative, results of the numeric solutions

Some of the general features of the relationship between ground truth information and fluorescence SMGM metrics can also be seen using an analytic approximation. For example, if we approximate the rate map as a Gaussian firing field with mean rate

Similar to the numerical solution, the analytic approximation provided by these equations (Eqs. 9, 10) predicts that the fluorescence bits per second metric is dominated by a prefactor (Aτvλ^{−} in the analytical case), and that the fluorescence bits per AP metric saturates at larger information values. Our numerical solutions provide more accurate measures for the magnitude of these effects, and for the magnitude of information values themselves, given that they include the more accurate double exponential kernel, signal noise, and the realistic nonstationary speed, position and fluorescence signals. These quantitative differences can be seen in Extended Data Figure 4-3, where we directly compared this analytic approximation to our numerical approach by simulating 10,000 neurons with Gaussian rate maps (*c*, 0.041 vs 0.0036

### Guidelines for application of information metrics to functional fluorescence imaging data

Taken together, the above results suggest that across the information metrics applied directly to functional fluorescence traces, the SMGM metrics provide the most reliable and interpretable information measurements. We thus suggest the following guidelines for use and interpretation of the SMGM metrics as applied to fluorescence MI metrics (

The SMGM bits per second metric (

#### Guideline 1

First, we note that if experimental measurements reveal small and acceptable variations in *c* across the neurons of interest, then the information values derived from

Under the assumption of a consistent kernel, approximations for *c* for common indicators can be found in Extended Data Figure 2-2.

#### Guideline 2

Further, given small variations in *c* across the neurons of interest, the ratio of

#### Guideline 3

The metric can still be useful even if experimental measurements reveal large and unacceptable variations in *c* across the neurons of interest, or if experimental measurements of *c* do not exist. In such cases, since it is reasonable to assume that *c* is consistent in the same neuron over time, comparisons across the same neuron can provide meaningful insights by using a ratio of

Therefore, we conclude that with careful consideration of the (known or unknown) variability of the fluorescence response kernel (*c*),

The SMGM fluorescence bits per AP metric (*B–D*). Researchers could potentially optimize the recovery of ground truth information by appropriately selecting bin size for a particular indicator (see Extended Data Fig. 3-2*A*,*B*).

In practice, using GCaMP6f and the rodent spatial behavior and spatial bin sizes (5 cm) used here, our analysis suggests that *E–G*), since this is the point where the absolute error exceeds 10% [comparable to the mean absolute error when measuring information from AP data (8.4%)]. Equivalent thresholds for other common indicators are shown in Extended Data Figure 3-1. The error is exacerbated by slower indicators, and thus more accurate measurements of information will result from using the fastest, narrowest kernel indicators available, assuming signal-to-noise and detection efficiency are comparable across the different width indicators.

#### Guideline 4

We conclude that with careful consideration of the size of the spatial bins in relation to the spatial shift and smoothing induced by the indicator,

Previous research quantifying information in bits per AP using

### Example: application of information metrics to functional fluorescence imaging data from hippocampus during spatial behavior

In this section, we demonstrate use of the above guidelines for proper application and interpretation of the SMGM fluorescence MI metrics (

CA1 neurons expressing GCaMP6f (viral transfection, *Camk2a* promoter) were imaged with two-photon microscopy through a chronic imaging window during mouse navigation along a familiar 1D virtual linear track, as described previously (Fig. 5*A*,*B*; Dombeck et al., 2010; Sheffield et al., 2017; Radvansky and Dombeck, 2018). Eight fields of view from four mice were recorded in eight total sessions (recording duration 8.8 ± 1.3 min, number of traversals/session: 29 ± 2.5, 3.6 ± 0.3 laps/min, 3-m-long track). From these eight sessions, 1500 neurons were identified from our segmentation algorithm (see Materials and Methods), and analysis was restricted to the 964 neurons that displayed at least one calcium transient on at least 1/3 of the traversals during the session. Among these 964 neurons, 304 (31.5%) had significant place fields and were thus identified as place cells (see Materials and Methods), while the remaining 660 (68.5%) did not pass a place field test and were thus identified as non-place cells.

By applying Equation 7 (using 5-cm sized spatial bins), we found a continuum of spatial information values measured by the fluorescence SMGM bits per second metric (*C*). The units for *p* = 1.7e-63; Fig. 5*D–F*), although there was substantial overlap in information between the populations (see distributions in Fig. 5*D* and individual examples in Fig. 5*E*,*F*). This also allows for accurate division of the 964 neurons into three quantiles based on information values, which we use below for spatial location decoding.

By applying Equation 8 (using 5-cm sized spatial bins), we found a continuum of spatial information values measured by the fluorescence SMGM bits per AP metric (*C*). The units for *p* = 4.6e-21; Fig. 5*D–F*). This is consistent with mock fluorescence traces generated from real neuron AP datasets (Extended Data Fig. 5-1*B*).

As a demonstration of the usefulness of using information metrics to analyze large functional fluorescence population recordings, we explored the accuracy of decoding the animal’s track position using different subsets of neurons. We divided the 964 neurons into nine groups: all neurons, place cells, non-place cells, three quantiles based on the fluorescence SMGM bits per second metric, and three quantiles based on the fluorescence SMGM bits per AP metric. We then used a Bayesian decoder of the animals’ position (see Materials and Methods) separately for each of the nine neuron groups in each of the eight sessions (Fig. 5*G*,*H*). An individual session decoding example can be seen in Figure 5*G*. We quantified decoding accuracy using the absolute position decoding error (% of track), and pooled this measure across sessions for each neuron group (Fig. 5*H*). The means and standard errors for each group are: all neurons (7.33 ± 2.5%), place cells (6.97 ± 1.9%), non-place cells (20.9 ± 1.8%), SMGM bits per second Q1 (21.9 ± 1.5%), SMGM bits per second Q2 (13.2 ± 2.4%), SMGM bits per second Q3 (8.97 ± 2.4%), SMGM bits per AP Q1 (17.6 ± 2.7%), SMGM bits per AP Q2 (17.8 ± 3.1%), SMGM bits per AP Q3 (10.4 ± 3.0%). Interestingly, even the lowest quantile information groups still could be used to determine animal track location to within ∼1/5 of the track. This supports the idea that the hippocampal code for space is carried by a large population of active neurons (Meshulam et al., 2016), and not just by a select subpopulation with the highest information or most well-defined tuning curves. As could be expected, place cells encoded the position of the animal better than nonplace cells and better than the lowest quantile information groups (Holm–Bonferroni corrected rank-sum,
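The grouping and scoring steps used in this comparison can be sketched as below. This is a hedged illustration only: it covers splitting neurons into quantile groups by an information value and computing absolute decoding error as a percentage of track length, and it does not reproduce the Bayesian decoder itself, whose details are given in Materials and Methods. The function names and arguments are hypothetical.

```python
import numpy as np

def split_into_quantiles(info, n_groups=3):
    """Assign each neuron to a quantile group (0 = lowest information)."""
    info = np.asarray(info, dtype=float)
    # Interior quantile edges, e.g., the 1/3 and 2/3 quantiles for 3 groups.
    edges = np.quantile(info, np.linspace(0.0, 1.0, n_groups + 1)[1:-1])
    return np.digitize(info, edges)

def decoding_error_pct(true_pos, decoded_pos, track_len):
    """Mean absolute position decoding error as a percentage of the track."""
    err = np.abs(np.asarray(decoded_pos, dtype=float)
                 - np.asarray(true_pos, dtype=float))
    return 100.0 * err.mean() / track_len
```

A decoder would be fit separately on the neurons in each group, and `decoding_error_pct` would then be applied to its output for each session before pooling across sessions.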

## Discussion

Here, we performed an in-depth simulation study to examine the application of the SMGM bits per second and SMGM bits per AP metrics of MI to functional fluorescence recordings. Since these metrics were designed for AP recordings and since functional fluorescence recordings violate some of the assumptions that these metrics are based on, it was unclear whether and how the metrics could be used for functional fluorescence recordings. We created a library of 10,000 mock neurons whose AP output carried ground-truth amounts of information about the animal’s spatial location, and by using real behavioral recording data from mice navigating in virtual linear tracks, we simulated the spatial firing patterns of the mock neurons. We then simulated fluorescent calcium responses for each neuron in each session by convolving the AP trains with calcium kernels for different indicators, primarily GCaMP6f (although see Extended Data Figs. 2-2, 3-1 for results from other indicators), and then added noise.

We then derived fluorescence versions of the SMGM bits per second (

In our approach, the known information values in our library of 10,000 mock neurons were determined using the SMGM metric, which includes the assumption that neuron firing follows an inhomogeneous Poisson process. It is important to remember that the SMGM metric, which has been applied to spiking data extensively over the past few decades, requires the use of a Poisson estimate of spiking probability, i.e., the Poisson assumption is built into the original metric. In practice, even spiking data violate this and other assumptions of the SMGM metric since real neurons do not strictly follow Poisson statistics (for example, they can display neural hysteresis) and animal behavior is non-stationary. Here, we build on this existing framework and test whether it is possible to apply the metric to functional fluorescence datasets. Even so, the Poisson assumption could have contributed to some of the biases found when evaluating the fluorescence SMGM metrics with respect to ground truth information. We explored this potential source of bias further using two different analyses. First, in Extended Data Figure 4-2, we applied the binned estimators (which do not rely on a Poisson firing assumption) to AP traces and compared the estimated information to the ground truth information (which was established using the SMGM metric that does rely on a Poisson firing assumption). We found the errors to be relatively small, particularly in comparison to the errors induced by the binned estimators when applied to fluorescence traces (Fig. 4*D*,*E*). Second, in Extended Data Figure 5-1, we used a real spiking dataset from hippocampal neurons in mice running on a behavioral track (i.e., real spiking neurons that can deviate from Poisson firing) and generated mock fluorescence traces from the AP traces. When we compared the information measured from the AP traces to the fluorescence traces, we found biases that were largely consistent with those observed in Figures 2, 3 from our simulated mock neuron datasets. Taken together, these analyses indicate that any biases resulting from the Poisson assumption in the simulation procedure appear to be small, particularly with respect to the biases introduced when AP traces are transformed into functional fluorescence traces. Finally, in the Toolbox, we also include code to generate mock neurons using a binned distribution, avoiding the Poisson assumption of SMGM. Thus, users can further explore sources of bias using a different ground truth dataset.

Using our mock fluorescence traces, we also asked whether an AP estimation method could relieve the biases in the SMGM metrics. Applying the SMGM bits per second metric (*c* value for recovered versus ground truth information. When the SMGM bits per AP measure was applied (

Taken together, we find that the SMGM bits per AP metric can recover the MI between spiking and behavior well. The SMGM bits per second metric is scaled such that comparisons should be limited to within populations of well characterized neurons or to within-neuron comparisons, e.g., ratios of information across conditions. In general, researchers should use caution when applying measures developed for AP data to fluorescence recordings: there is no guarantee that the assumptions that support the measures hold for fluorescence data, and this can lead to biased results that are difficult to interpret.

## Acknowledgments

We thank B. Kath and members of the Dombeck Lab for comments on this manuscript and V. Jayaraman, R. Kerr, D. Kim, L. Looger, and K. Svoboda from the GENIE Project (Janelia, Howard Hughes Medical Institute) for GCaMP6.

## Footnotes

The authors declare no competing financial interests.

This work was supported by The McKnight Foundation, Northwestern University, The Chicago Biomedical Consortium with support from the Searle Funds at The Chicago Community Trust, National Institutes of Health Grants R01MH101297 and T32AG020506, and the National Science Foundation Grant CRCNS1516235.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.