Abstract
Animals and humans rapidly detect specific features of sounds, but the time courses of the underlying neural response for different stimulus categories is largely unknown. Furthermore, the intricate functional organization of auditory information processing pathways is poorly understood. Here, we computed neuronal response latencies from simultaneously recorded spike trains and local field potentials (LFPs) along the first two stages of cortical sound processing, primary auditory cortex (A1) and lateral belt (LB), of awake, behaving macaques. Two types of response latencies were measured for spike trains as well as LFPs: (1) onset latency, time-locked to onset of external auditory stimuli; and (2) selection latency, time taken from stimulus onset to a selective response to a specific stimulus category. Trial-by-trial LFP onset latencies predominantly reflecting synaptic input arrival typically preceded spike onset latencies, assumed to be representative of neuronal output indicating that both areas may receive input environmental signals and relay the information to the next stage. In A1, simple sounds, such as pure tones (PTs), yielded shorter spike onset latencies compared to complex sounds, such as monkey vocalizations (“Coos”). This trend was reversed in LB, indicating a hierarchical functional organization of auditory cortex in the macaque. LFP selection latencies in A1 were always shorter than those in LB for both PT and Coo reflecting the serial arrival of stimulus-specific information in these areas. Thus, chronometry on spike-LFP signals revealed some of the effective neural circuitry underlying complex sound discrimination.
Significance Statement
Primary auditory cortex (A1) and lateral belt (LB) areas are key subdivisions of auditory cortex. A1 plays crucial role in processing of simple stimuli such as pure tones (PTs), whereas LB for processing of complex sounds. Both areas receive direct inputs from medial geniculate nucleus (MGN) and have recurrent connections. Nonetheless, the functional connectivity patterns between these subdivisions while processing different sound categories are poorly understood. Using simultaneous spike-LFP recordings, our study reveals that information about the presence of stimuli in the environment arrives concurrently in A1 and LB; however, the information related to neuronal discrimination may arrive at different times, indicating that both parallel and serial information transmission pathways exist and their presence is guided by the context of the task.
Introduction
Simple auditory stimuli such as pure tones (PTs) are represented as tonotopic maps in primary auditory cortex (A1; Hind, 1953; Merzenich et al., 1976; Romani et al., 1982; Morel et al., 1993), whereas belt areas, lateral and medial to the core, while still showing cochleotopic organization, process more complex features of sounds (Muscari et al., 1990; Rauschecker et al., 1995; Tian and Rauschecker, 2004; Recanzone, 2008; Kuśmierek and Rauschecker, 2009; Niwa et al., 2013; Kikuchi et al., 2014). The core is primarily defined based on the thalamic connections from the ventral division of the medial geniculate nucleus (MGN) and reciprocally connected with the adjacent subdivisions of the belt (Hackett, 2011; Scott et al., 2015). Thus, the functional organization of complex sounds in core and belt can be hypothesized to follow a serial processing stream, from core to belt, somewhat analogous to V1 and the V2/V4/MT areas of the visual system (Tian et al., 2013). At the same time, direct inputs from the MGN to these brain areas point to parallel processing pathways (Rauschecker et al., 1997), which continue further downstream (Sheline et al., 2010). Finally, demands of a task, such as sound localization, categorization, and discrimination, can also govern the serial versus parallel characterization of processing (Ahveninen et al., 2006, 2013; Bizley and Cohen, 2013).
Chronometry of input and output related processing events in candidate brain areas is a useful technique for functional network identification (Kreiman et al., 2006; Nielsen et al., 2006; Monosov et al., 2008; Banerjee et al., 2010, 2012). While neuronal spike discharge is used as a measure of output processing in a putative brain area (Krüger and Becker, 1991; Middlebrooks et al., 1994; Nawrot et al., 2003; Buzsáki et al., 2012), local field potentials (LFPs) may carry information about the inputs coupled with local neuronal processing that need not be input specific, in a particular brain area (Gusnard et al., 2001; Nielsen et al., 2006; Buzsáki et al., 2012) and by extending this principle to multiple brain areas, aspects of the functional circuitry underlying behavior can be revealed (Hung et al., 2005; Banerjee et al., 2012; Fig. 1). Conceptually, shorter latencies in one area compared to another reflect faster processing and greater relevance of the former brain area and thus indicate more efficient neuronal coding (Gawne et al., 1996; Van Rullen and Thorpe, 2001; Bendor and Wang, 2008). Additionally, the timing of input versus output of information processing in an area can be used to infer the role of this area in processing of a particular type of signal as well as the functional pathways involved in processing of the signal (DiCarlo and Maunsell, 2005).
A, Simultaneous recordings from two arbitrary brain areas 1 and 2. On the right, we illustrate the definition of onset latencies (OLs) and selection latencies (SLs) by plotting the spike response (left panel) and LFP (right panel) from each recording site. OL is computed using an event as model 1 in the AccLLR framework (Banerjee et al., 2010) and prestimulus baseline as model 2 (see Eq. 1). SL is computed using PT stimulus as model 1 and Coo as model 2. B, The effective network architectures inferred from different OL values. The solid lines reflect the effective network connections, whereas the dotted lines indicate a less likely connection that can be inferred from latency measures.
Spike trains and LFPs in auditory core exhibit comparable frequency tuning (Kayser et al., 2007). On the other hand, there is evidence suggesting that the cochleotopic organization of belt areas is less precise, as observed in spike-LFP responses (Guo et al., 2012). Hence, identifying the temporal markers of inputs and outputs involved in information processing in auditory core and belt across single units and populations can help reveal the functional specificity of the respective areas. Extending this line of reasoning, Camalier and colleagues computed neuronal onset latencies at different locations along the auditory cortical pathways and reported that dorsal stream locations have shorter latencies, whereas the ventral locations exhibit increasingly longer latencies as one proceeds from lower to higher-order processing (Camalier et al., 2012). This result conforms with human studies using magnetoencephalography and transcranial magnetic stimulation (Ahveninen et al., 2006, 2013) as well as with other monkey studies (Scott et al., 2011; Kuśmierek and Rauschecker, 2014). Kikuchi et al. (2014) reported that PT-related spike onset latencies were longer in lateral belt (LB) than in auditory core, which is consistent with the notion that auditory core is at a lower hierarchical level within cortex than LB. However, do the two areas receive information about stimulus presence concurrently? Furthermore, are the finer features that allow discrimination of one signal from another represented in the neural codes hierarchically?
To address these questions, we recorded spike and LFP responses simultaneously from A1 and LB of two adult macaques while they performed an auditory Go/No-go discrimination task. We computed trial-by-trial onset Latency, time locked to stimulus onset, and selection latency, the earliest time at which neural responses between PTs and Coos significantly differ. Computing these measures in different subdivisions of auditory cortex, we could tease out the functional network mechanisms involved in sound processing and discrimination.
Materials and Methods
Animal preparations and behavioral task
Two adult male rhesus macaques (Macaca mulatta, weighing 7.5–11.5 kg) participated in this study. Animal care and all procedures were conducted in accordance with the National Institutes of Health guidelines and were approved by the Georgetown University Animal Care and Use Committee. Animals were prepared for chronic awake electrophysiological recordings under aseptic conditions. Each animal was anesthetized and a head post and recording chamber were attached to the dorsal surface of the skull with a guidance of MRI obtained with a 3T scanner (0.5 mm voxel size, Siemens Tim Trio). The recording sites in this study cover the auditory core (A1) and the auditory LB region [the middle lateral (ML) and anterolateral (AL)]. We followed identical methods for assigning the recording sites to either A1 or LB as described in Kikuchi et al. (2014).
Electrophysiological experiments were conducted in a single-walled acoustic chamber (Industrial Acoustics Company) installed with foam isolation elements (AAP3, Acoustical Solutions). The animal sat in a monkey chair with its head fixed, facing a speaker located one meter directly in front of it in a darkened room. The animal was trained to perform an auditory discrimination task, in which a single positive stimulus (S+), consisting of a 300-ms pink-noise burst (PNB), was pseudo-randomly interspersed among negative stimuli (S-), consisting of all other stimuli, for 20% of the trials. The animal initiated a trial by holding a lever for 500 ms, triggering the presentation of one of the acoustic stimuli, was required to release the lever within a 500-ms response window after the offset of the S+ to get a water reward (∼0.2 ml) that followed by a 500-ms intertrial interval (ITI). Lever release in response to S- prolonged the 500-ms ITI by 1 s (timeout). The average interstimulus interval was 2.3 ± 0.45 s (mean ± SD). The detailed procedures for the animal preparations, behavioral task, and data collection were the same as those described in Kikuchi et al. (2014).
Sound preparation and stimuli
The sound wave form signals were sent from the CORTEX dual-computer system through a 12-bit D/A converter (CIO-DAS1602/12, ComputerBoards), and then amplified, attenuated, and delivered through a free-field loudspeaker (Reveal 6, Tannoy) with a flat (±3 dB) frequency response from 63 Hz to 51 kHz.
The monkey vocalizations (“Coo” calls) were recorded in Morgan Island using a directional microphone (ME66 with K6 powering module, Sennheiser, frequency response at 40–20,000 Hz ± 2.5 dB) with a solid-state portable recorder (PMD670, Marantz Professional) at a sampling rate of 48 kHz (Laboratory of Neuropsychology, National Institute of Mental Health). PTs and PNBs were created using Adobe Audition 1.5 at a sampling rate of 48 kHz (32 bit). All stimuli had a 300-ms fixed duration, including the monkey vocalizations, gated with a 5-ms rise/fall linear ramp. The stimuli were normalized across all stimuli by recording the stimuli played through the stimulus presentation system and filtering the recorded signal on the basis of Japanese macaque audiograms (Jackson et al., 1999), and using the maximum root mean square (RMS) amplitude during a sliding window of 200-ms duration and presented at ∼70-dB SPL. Details of the sound equalization method were described by Kuśmierek and Rauschecker (2009).
The positive stimulus was a pink noise, a response to which led to a reward, whereas the negative stimuli were made up of both PTs and Coo vocalizations. A stimulus set comprised of 10 PTs and 10 pitch-matched Coos, in which the fundamental frequency (F0) of the Coo was match to the corresponding frequency of PT using the pitch-shift function in Adobe Audition 1.5 (Fig. 2). The frequency of PTs and the F0 of the Coos ranged from G3 (196 Hz) to C#8 (4435 Hz) in six semitone steps. In each recording session, the stimuli were presented in pseudorandom order with at least 15 trials per stimulus.
Experimental design, Go/No-go task. A, Monkey waits during a rest period with hands on a lever and attends to the stimuli (PT, Coo, or pink noise; presentation time, 300 ms). To obtain a water reward, the monkey must release the lever when pink noise is presented. The next trial starts 600 ms after the previous stimulus onset. B, The spectrogram (time, frequency, and power) of PT and Coo stimuli. The frequency of PTs matches the fundamental frequency of the Coo.
Data collection and preprocessing
Multiple guide tubes carrying up to 4 tungsten microelectrodes (0.5–3.0 MΩ, epoxylite insulation, FHC) was lowered into the target cortical sites identified on the MRI scans. Each electrode was independently advanced using a remote-controlled hydraulic, four-channel customized multidrive system (NAN-SYS-4, Plexon. Inc.). For the spike trains, raw signals were filtered with a bandpass of 150–8000 Hz, further amplified, and then digitized at 40 kHz. For the LFP, the raw voltage traces were filtered between 0.7 and 500 Hz, amplified, and digitized at 1 kHz. For further analyses, the LFP data were low-pass filtered at 100 Hz. Time stamps for stimulus presentation timings, behavioral response, and reward delivery were sent through DOS-CORTEX dual computer system (CIO-DAS1602/12, CIO-DIO24, ComputerBoards). Spikes were sorted by real-time acquisition programs using template matching and Principal Component Analysis (PCA) methods (RASPUTIN, Plexon). We focused on trials in which simultaneous spike-LFP recordings were obtained from both monkeys in both core and LB areas. Overall, we accumulated data from 29 sessions in Monkey1 and 27 sessions from Monkey2, for a total of 56 sessions, where a session was defined as a group of trials for which we obtained simultaneous spike train recordings from one neuron in A1 and one in LB. Two sessions may have different single cells (spike-sorted) but the same LFP representation. We aggregated all 23 fundamental frequencies presented to the monkeys under a single category called PT trials. Similarly, all F0-matched monkey calls were categorized as Coos. This enhanced the statistical power of our analytical framework but did not adversely affect the main goals of the study. Hence, to increase the statistical power of our analysis, we chose to categorize all PT trials as one block and the F0-matched Coo trials as a different block. Firing rates reported in Figure 3 were computed from binary spike rasters by applying Gaussian smoothing with a 10-ms window on the averaged peristimulus histogram (PSTH) with a bin size of 1 ms. The mean evoked LFP wave form was calculated by averaging LFPs across trials.
One representative session from each monkey, where simultaneous recordings from two areas spikes and LFP could be obtained. First row indicates spike rasters (cyan and magenta dots) and firing rates (blue and red) computed using Gaussian smoothing (10-ms window) for PT (cyan/blue) and Coo (magenta/red) stimuli. The second row depicts the trial-by-trial LFP waveforms using the same color code as for spikes. The averaged LFP responses are plotted in blue and red. The spike-LFP responses in two auditory cortical areas A1 and LB were recorded during the same session in each monkey.
Trial-averaged latency analysis
Histograms of binary spiking events were computed using 1-ms bins and were convolved with growth-decay functions (Thompson et al., 1996; Monosov et al., 2008) to compute continuous spike density functions (SDFs). Time constants for growth phase,
and for decay phase,
were used to compute the spike density fuctions following Thompson et al. (1996). A ms-by-ms t test was applied to the two SDFs either within different temporal segments of the same trial (for onset) or between trials from different conditions (for discrimination) to obtain the onset and selection latencies, respectively, over an entire session (Fig. 4). As LFPs are continuous signals, the raw LFP traces (band-passed between 0 and 200 Hz) were used to compute onset and selection latencies. Pairwise Wilcoxon rank-sum tests were performed to establish significant effects. We report the statistical analysis performed on data pooled from both monkeys and set a threshold of p = 0.01 for estimating significance. We set the threshold to this slightly conservative value since there were a large number of trials in each session that were available for the trial-by-trial analysis (see below).
A, Estimation of trial-averaged onset and selection latency using ms-by-ms t test on raw time series. For spikes, the binary time series was transformed to a SDF by convolving single trial spike trains with assymetric exponential functions having different growth and decay time constants, 1 and 20 ms, respectively, following Thompson et al. (1996). A ms-by-ms t test was performed on the distribution SDFs in a given session from different conditions (see text for details). For onset, prestimulus rest period was used to compute the SDF. Analyses were performed across both monkeys and for each monkey individually. Error bars were plotted at 95% significance levels. B, Estimation of trial-averaged onset and selection latency from AccLLR distributions using ms-by-ms t test; p = 0.01 was chosen as threshold for significance.
Trial-by-trial AccLLR analysis
Spike trains and LFPs follow different statistical properties and hence the estimation of single-trial latencies from these two signals requires a unified framework (Banerjee et al., 2010, 2012). AccLLR addresses this issue and computes spike-LFP latencies trial-by-trial (Fig. 5). AccLLR is a model-based framework that requires two competing models of observations. We have used time-varying firing rate models for spiking (inhomogeneous Poisson process) and time-varying continuous means and standard deviations (Gaussian process) for LFP signals. For further discussion on different kinds of models, see Banerjee et al. (2010, 2012). Once the model parameters (time-dependent firing rate for spikes and mean and standard deviation for LFP) are computed from a set of training trials, the likelihood that the time series for a test trial (binary spike trains for spikes, continuous wave form for LFP) belongs to model 1 or model 2 can be computed. Finally, raw spike trains and continuous LFPs can be transformed into the space of accumulated log-likelihood ratios by first calculating likelihood ratios (LRs)(1)
where
is the data point at which LR is computed. To compute the LRs, we use the leave-one-out principle. The trial at which LR was computed does not contribute to obtaining the model parameters. The rest of the trials are used in model development. This was done to minimize the bias of any particular model.
A, Estimation of trial-by-trial onset and selection latency using AccLLR. We follow the same pattern of presentation as in Figure 4 and report the group-level analysis and individual monkey analysis. Error bars were computed at 95% significance levels by pooling all trials and sessions in a monkey. See text for details of Methods. B, Estimation of trial-by-trial onset and selection latency using AccLLR on pooling all trial information within a stimulus category to a single trial in each session and setting accuracy to 100%. Error bars were computed at 95% significance levels.
By integrating the natural logarithm of
over time we obtain accumulated log-likelihood ratios
, which follow a drift-diffusion process (Gold and Shadlen, 2001; Eckhoff et al., 2008; Banerjee et al., 2010). Thus, the difference in statistical properties of spike trains and LFPs become inconsequential in the space of AccLLRs, which unifies these measurements. Latencies are computed setting bounds specific to a model (1 or 2) of AccLLRs (Fig. 6).
Decoding performance using AccLLR. A, Setting up the bounds of accumulation is an integral part of AccLLR analysis. The probability of correct detection varies with where the bound is set for both spikes and LFPs. Furthermore, the onset latency also varies with the selection of thresholds and, consequently, with the probability of correct detection. Optimal onset latency detection is defined when the threshold for the false positive rate for prestimulus data (null) equals or exceeds the detection of true positives from event (poststimulus period) data. For selection latency, there are three possibilities: PT, Coo, or don’t know (baseline). Here, the optimal threshold was chosen when the probability of correct detection matched the probability of don’t knows from the rest period (null) data. B, Error rates of decoding from spikes (1st column) and LFPs (2nd column). Error rates for onset (1st row) and selection latency (2nd row) are also shown in matching color codes. Note that y-axis is error, lower error indicates better performance.
An important aspect of the AccLLR framework is that it sets the bounds on the accumulation of integrated log-likelihood ratios, ordinarily done using the sequential probability ratio test (SPRT; Wald and Wolfowitz, 1947). Under this framework, accumulated log-likelihood ratios obtained using Equation 1 reaches a decision threshold after “sufficient” information has been collected. Alternatively, information is sufficient to make a decision when a certain threshold is reached. At an asymptotic limit, a mathematical relationship connecting the location of bounds of AccLLR accumulation to false positive and false negative rates can be expressed (Wald and Wolfowitz, 1947).
For the purpose of decoding latencies within a biologically relevant time, we chose a data-driven approach to set the bounds on AccLLR accumulation (Banerjee et al., 2010). For a given poststimulus event as model 1, there are two possibilities for detection within a finite time, viz, whether the event is correctly detected (true positive) or no detection is possible (false negative). On the other hand, for the prestimulus baseline (null) as model 2, either correct (true negatives) or incorrect (false positives) assignment is made. For setting a bound for onset detection, we chose an optimum threshold for which the false positive rate for null data equals or exceeds the detection of the true positive rate on event data. For setting bounds for selection latency detection, we first computed the AccLLRs for a “null” period (prestimulus baseline), 300 ms from the start of a trial. There are three potential outcomes: AccLLR reaches (1) an upper threshold corresponding to hit rate for model 1; (2) a lower threshold corresponding to hit rate for model 2; and (3) does not reach either threshold (“don’t know”). Again, the threshold for detecting model 1 was chosen at an optimal point where the probability of don’t knows exceeds the hit rate for model 1. Similarly, the threshold for detecting model 2 was chosen at an optimal point where the probability of don’t knows exceeds the hit rate for model 2 (Fig. 6). For further details, see Banerjee et al. (2012).
While decoding latencies at the level of single trial brings us close to revealing the true nature of neural processing occurring at a realistic time scale, nonetheless, the process of choosing a threshold is impacted by speed-accuracy trade off, meaning a lower threshold can make detection faster while increasing the false positives, and on the other hand a higher threshold can increase accuracy but also increase the onset and selection latencies. Hence, to check that the consistency of latency results are extended to situations where accuracy is set at 100%, we pooled all log-likelihood ratios from all trials within a session to create a pseudo-trial. Accumulated log-likelihood ratios were computed on this trial for each detection context, onset and selection. AccLLR threshold for onset detection was chosen to be the maximum AccLLR reached by a null trial. Similarly, AccLLR threshold for selection of one category of stimulus (model 1) was determined by the maximum reached by AccLLR from the second category (model 2).
Results
Spike-LFP recordings were obtained simultaneously from two brain areas, A1 in the auditory core and LB. Our recordings in LB came from two subdivisions, AL and ML belt areas. Here, we were interested in trials where simultaneous spike-LFP recordings were obtained from both monkeys in both A1 and the LB areas. We accumulated 29 sessions in Monkey1 and 27 sessions in Monkey2, totaling 56 sessions in which simultaneous spike-LFP recordings were obtained in A1 and LB. We computed the onset latency of the neural response of either spike or LFP using the method of accumulated log-likelihood ratios (AccLLR; for details, see Banerjee et al., 2010). According to this framework, for single/multiunit spiking activity, the baseline can be the background firing rate during the prestimulus period. Analogously, in the case of LFP, the baseline can be the distribution of voltage traces during the prestimulus period. We computed the timing of information processing events from trial-by-trial spike-LFP data (for further details of the method, see Banerjee et al., 2010, 2012).
Monkeys performed the auditory discrimination task in a Go/No-go setup illustrated in Figure 2. Monkeys were trained to discriminate different kinds of sounds (all negative, or No-go cues) from a pink noise stimulus (positive, or Go cue), which, when responded to, resulted in a water reward. Onset latency and selection latencies were computed from spike-LFP responses. Onset latency characterized the boundaries of a processing stage required for encoding the presence of sound in the environment, thereby a measure of stimulus-related processing. On the other hand, selection latency characterized the boundaries of a processing stage involved in coding the presence of a specific sound in the environment, hence yielding a measure of stimulus-specific processing. Figure 3 illustrates an example recording session in each monkey. In Monkey1, we observed a transient increase in spike frequency around the beginning and end of the stimulus in A1, whereas we saw sustained spiking responses in LB. Simultaneously, a difference in LFP waveforms is observed during stimulus presentation for the two stimulus categories. In Monkey2, we observed sustained firing in A1 following a transient rise of spike rate at stimulus onset. Furthermore, LFP differences were observed primarily between two stimulus categories in a period following the termination of stimulus presentation. These examples illustrate the diversity and complexity of spike/LFP responses across different recording sessions in both A1 and LB.
Chronometry on spike-LFP responses
We computed neuronal response latencies for onset and discrimination using two approaches: a traditional trial-averaged approach and single-trial AccLLR analysis (Banerjee et al., 2010) of spike-LFP data. The former gives a broad summary of the results, and the latter helps in addressing the between-trial variability in neural signals and gives a more consistent account of neuronal information processing. In the first approach, a ms-by-ms t test (Monosov et al., 2008) was used to compute trial-averaged measures of latencies. This approach is a standard one, used by most investigators. In the second approach, the AccLLR framework was used to compute trial-by-trial latencies of onset and selection (Banerjee et al., 2012). Additionally, we applied ms-by-ms t test on trial-by-trial AccLLR distributions for each session to compute the trial averaged latencies. In both cases, simultaneously collected data from two brain regions (A1 and LB) were used. We report statistics performed over all sessions from two monkeys in both the text (p values) and Table 1 (mean and SEM).
Mean neuronal latencies for onset and selection of auditory stimulus with SEM reported in parentheses
Trial-averaged latencies from raw data
Ms-by-ms t test was applied to raw LFP traces and spike distribution functions (for details, see Materials and Methods) to extract spike-LFP latencies as followed by an earlier study (Monosov et al., 2008). Analyses of the combined data from both monkeys are presented in Figure 4A, Table 1 (results from each monkey are also presented separately in Fig. 4) for a sample size of 56 sessions (29 for Monkey1 and 27 for Monkey2). Note that we rounded latencies to whole numbers for reporting group averages and that p = 0.01 was chosen as the threshold in pairwise t tests used to evaluate the statistical significance of both trial-averaged and AccLLR analysis across stimuli categories and brain areas. For PTs, mean LFP onset latency in A1 (45 ms) was not significantly different (p = 0.22) from mean LFP onset latency in LB (57 ms). The same was true not true for Coo (p = 0.001, mean 29 ms in A1, 45 ms in LB). However, the stimulus-specific differences between LFP onset latencies (i.e., between PT and Coo) were significant in A1 (p < 0.01) but not in LB (p = 0.02).
Mean spike onset latencies for PT were 87 ms in A1 and 103 ms in LB. Mean spike onset latencies for Coo were 63 ms in A1 and 83 ms in LB. No stimulus-specific differences were observed for spike onset latencies (PT vs Coo) in either A1 (p = 0.1) or LB (p = 0.25). However, spike-LFP onset latency differences in A1 and LB were significant for both stimuli (PT and Coo; p < 0.01 for all four comparisons). These results suggest that inputs arrive at A1 and LB from a shared source and that there is considerable parallel processing across the two areas.
Trial-averaged selection latencies are computed from ensembles of trials belonging to two stimulus categories. That is, one selection latency value is defined for two stimulus categories for each session of recording. LFP selection latencies in A1 and LB were not significantly different (p = 0.44, mean 111 ms in A1 and 121 ms in LB; detailed statistics are presented in Table 1). Similarly, spike selection latencies across A1 and LB were also not significantly different (p = 0.41, mean 140 ms in A1 and 164 ms in LB).
Spike-LFP selection latency differences were not significant in LB (p < 0.01) and A1 (p = 0.03) with the threshold level set at p = 0.01. Thus, hierarchical stimulus processing from A1 to LB cannot be inferred from this analysis. A largely similar pattern of results was observed when the analysis was repeated for each monkey separately (Fig. 4A).
Trial-averaged latencies from AccLLR estimates
AccLLR at the level of single trials is a probabilistic method dependent on accumulation evidences. Inherently, it has a “slowness” incorporated to it which is further dependent on signal-to-noise ratios. Hence to get a sense of the speed-accuracy trade-off that affects AccLLR analysis we applied ms-by-ms t tests on distribution of AccLLRs to compute the trial-averaged latencies (Fig. 4B; Table 1). This will give an estimate of latencies that can be achieved with maximum accuracy using AccLLR analysis for the ensemble of trials and sessions.
Mean LFP onset latency for PT was 60 ms in A1 and 57 ms in LB and was not significantly different (p = 0.61). Mean LFP onset latency for Coo was 51 ms in A1 and 55 ms in LB and was not significantly different (p = 0.09). The stimulus-specific differences between LFP onset latencies (i.e., between PT and Coo) were not significant in A1 (p = 0.06) and LB (p = 0.02).
Mean spike onset latencies for PT was 67 ms in A1 and 87 ms in LB, which were not statistically significant (p = 0.73). Mean spike onset latencies for Coo were 74 ms in A1 and 94 ms in LB, which were not significantly different (p = 0.62). We did not find any stimulus-specific differences for spike onset latencies (PT vs Coo) in A1 (p = 0.45) and LB (p = 0.47). Spike-LFP latencies were significant different for Coo in A1 (p = 0.01) but not for PT (p = 0.58). Spike-LFP latencies were not significantly different in LB for both Coo (p = 0.03) and PT (p = 0.02) stimuli.
LFP selection latencies in A1 (155 ms) and LB (170 ms) were not significantly different (p = 0.25). Analogously, spike selection latencies in A1 (89 ms) and LB (106 ms) were not significantly different (p = 0.24). No significant differences were found between spike-LFP selection latencies in A1 (p = 0.06) and LB (p = 0.25).
Trial-by-trial AccLLR analysis
AccLLR (Banerjee et al., 2010) was used to compute trial-by-trial onset and selection latencies (Fig. 5; Table 1). Spike and LFP data were transformed to AccLLR space using inhomogeneous Poisson models for spikes and Gaussian models for LFP. Latencies were computed by setting decision bounds on AccLLR time series. For onset latency estimation, model 1 was applied to spike/LFP data during the stimulation period and model 2 to prestimulus baseline; for selection latency, model 1 was applied to PT trials and model 2 to Coo trials (for details, see Materials and Methods). Latencies were estimated using the leave-one-out rule, where model parameters were estimated from all other trials leaving aside the one for which the latency was being computed. Two major advantages of using the AccLLR method were that we could reduce the variability observed in the trial-averaged analysis and that the stimulus-specific selection latencies could be computed trial-by-trial. On the other hand, a definition of single selection latency encompasses at least two trial categories for trial-averaged analysis. The AccLLR analysis had orders of magnitude higher sample sizes than those in the trial-averaged analysis (Table 1). Theoretically, unlike the raw data, AccLLRs from both spike and LFP follow the same statistical distribution (for details, see Materials and Methods), hence spike-LFP comparisons are quantitatively valid. The mean and SEM for onset and selection latencies are reported in Table 1.
The mean LFP onset latencies in A1 and LB for PT stimuli were nearly identical (35 ms in A1, 36 ms in LB, p = 0.10). On the other hand, the mean LFP onset latency for Coos differed significantly in the two areas (31 ms in A1, 39 ms in LB, p < 0.01). Spike onset latencies differed significantly between A1 and LB for PTs (52 ms in A1, 92 ms in LB, p < 0.01). For Coos, the difference in spike onset latencies between A1 and LB is small but significant (69 ms in A1, 66 ms in LB, p < 0.01). Together, these results suggest that processing relatively simpler stimuli like PT can be supported by A1, whereas more complex stimuli such as Coo require resources of a higher order area such as LB. LFP onset latencies always preceded spike onset latencies in each area and each stimulus category (p < 0.0001).
Interestingly, for either type of stimulus, LFP selection latencies were always shorter in A1 than in LB (for PT, means of 113 ms in A1 vs 161 ms in LB, p < 0.01; for Coo, 111 ms in A1, 167 ms in LB, p < 0.01), whereas spike selection latencies were always shorter in LB than in A1. For PT, spike selection latency was 187 ms in A1 and 163 ms in LB, p < 0.01; and for Coo, 178 ms in A1 vs 155 ms in LB, p < 0.01. Most interestingly, for Coo the LFP selection latency (167 ms) lagged the spike selection latency (155 ms) significantly (p < 0.01).
Estimates from pooled trials with 100% accuracy
To evaluate whether the pattern of results holds in a scenario where detection accuracy is 100% (thus taking into consideration the effects of speed-accuracy trade-off), we pooled all trials in a session to create a single trial in the log-likelihood space. Details of procedures of how thresholds were selected are described in Materials and Methods.
The LFP mean onset latencies for PT was very similar in A1 and LB (Table 1), a difference of 8 ms, which was not significant (p = 0.03). A similar pattern followed for Coo (p = 0.26). A1 seems to have lower LFP onset latency for PT (26 ms) compared to Coo (38 ms), but the effect was weak (p = 0.01). In LB, the LFP mean onset latencies were identical for PT (30 ms) and Coo (31 ms; p = 0.87). A similar pattern followed for mean spike onset latencies, and as well was observed for LFP. When spike-LFP latencies were compared except in A1 for PT where spike-LFP latencies were not different (p = 0.12), LFP latencies typically precede spike latencies.
Mean selection latencies for LFP were much lower than that obtained with single trial measures however the main pattern of LFP selection latencies being lower in A1 compared to LB was consistent (p < 0.0001). The mean spike onset latencies were in close proximity and none of the comparisons was significant at p = 0.01. Even the spike-LFP latency differences were not significant for individual selection contexts, for PT in A1 (p = 0.33), Coo in A1 (p = 0.1), PT in LB (0.60), and Coo in LB (p = 0.93).
Decoding performance
An important requirement in any decoding analysis framework is to control for the false positives and false negatives while setting thresholds for category distinction. In principle, the AccLLR test is optimal (Wald and Wolfowitz, 1947). Under conditions in which sufficient information is available or after infinite accumulation, the number of times any threshold is crossed is circumscribed by type 1 and type 2 errors. However, we are interested in latencies which would be biophysically relevant and computed using comparable statistical constraints on spike trains and LFP data. Detection of latencies within a finite time is constrained by a trade-off between accuracy and early detection (Fig. 6A). Hence, we have chosen a data-driven approach to set the optimal thresholds for AccLLR accumulation, details of which are provided in Materials and Methods (Fig. 6). Trial-by-trial onset and selection latency decoding performance were significantly worse than the chance level in most sessions (Fig. 6B). Error rates for most LFP sessions were below the chance level (probability of target detection is achieved by random selection) for both onset and selection. For the onset latency, there are only two detection scenarios, whether the signal can be classified as category 1 (the pattern of spike/LFP response to a stimulus) or category 2 (the animal is alert but not hearing any sound). Hence, the probability of detection by chance is 0.5. For selection latency, the probability of detection by chance is 0.67 since there are three possibilities in a given datum (PT, Coo, or prestimulus baseline). Figure 6B unambiguously illustrates that error rates for selection latency detection from spikes and LFPs were mostly lower than chance level indicating superior performance of the AccLLR technique. Typically, recording sites with good onset detection also yielded superior selection detection and decoding from LFPs were more reliable with more consistent error rates over sessions.
Discussion
Using two measures, onset latency for detecting the presence of sound in the environment and selection latency for identifying stimulus-specific neural codes in primate auditory cortical areas we aim to characterize the functional pathways of underlying information processing. We observed a trend in which LFP onset/selection latencies were shorter than spike onset/selection latencies by applying ms-by-ms t test on time series data. However, the trial-averaged techniques do not allow the measure of stimulus-specific selection latencies since a distribution of PT trials is used to identify the time of selection from a distribution of Coo trials.
AccLLR analysis of our data refined the statistical significance of the trends and helped to mathematically define stimulus-specific selection latencies. In a trial-averaged analysis using the t test on raw data as well as AccLLRs, a single numerical value of selection latency was obtained for all trials within a session and by construction across two stimulus categories. Hence, not surprisingly, latencies computed by AccLLR exhibited variability that were orders of magnitude smaller than trial-averaged tests. Both trial-averaged analysis and AccLLR at the level of single trials as well as accuracy matched pooled trials yielded similar values for LFP onset latencies across A1 and LB. This reinforces the view that areas A1 and LB may process simple stimuli in parallel. Except in case of A1 and PT stimulus, all three onset scenarios had LFP latencies preceding spike latency when accuracy was matched. Proximity of spike and LFP latency typically indicates a central role of the recorded brain area in neuronal processing. Thus, our observations highlight area A1’s dominant role in coding PTs, whereas coding of complex stimulus such as Coo and in areas higher order than A1 are more mixed in nature. Selection latencies for each trial category can be only obtained from AccLLR analysis. Shorter LFP selection latencies for A1 than LB suggest information arrival in auditory brain areas can occur hierarchically. Interestingly both single-trial decoding as well as performance matched pooled trial analysis showed non-significant differences between LFP and Spike selection latency in LB; in particular the performance matched analysis revealed that LFP selection latencies had a trend of preceding spike selection latency thus reflecting a greater involvement of higher order LB area in neuronal stimulus discrimination.
There is a substantial literature on subdivisions of auditory cortical areas and their roles in processing complex sounds (Romani et al., 1982; Rauschecker et al., 1995; Eggermont, 1998; Bendor and Wang, 2008; Ghazanfar et al., 2008; Recanzone, 2008; Kuśmierek and Rauschecker, 2009; Bandyopadhyay et al., 2010; Kikuchi et al., 2010, 2014; Camalier et al., 2012; Sundberg et al., 2012; Niwa et al., 2013). In this study, we investigated one such complex sound, viz., a Coo, that can be represented spectro-temporally as containing higher harmonics of a specific fundamental frequency (Fig. 2), as opposed to a simple sound consisting of a single frequency. The animals were trained to respond to a stimulus that had no periodic temporal structure (pink noise), but that required them to allocate equivalent levels of attention to both simple and complex sounds (PTs and Coos, respectively). A traditional, trial-averaged analysis of the data indicated that the spike-onset latency for the PT was shorter in A1 than in LB (Kikuchi et al., 2014). However, there was a minimal difference in latency between A1 and LB for Coo sounds, a finding that may seem surprising from the perspective of serial hierarchical information processing. We argue that an effective way to tease out the entire processing architecture is to look at simultaneous measurements of inputs and output of a brain area using both spike and LFP recordings. We showed that stimulus-specific spike and LFP responses are present in A1 and LB, as found in previous studies (Ghazanfar et al., 2005, 2008). We then compared single-trial latencies from spike trains and LFPs at the same electrode and across different electrodes. This presents a unique way to extract the local functional connectivity in auditory cortex underlying complex sound processing.
Latency comparison has been used previously to estimate functional neural circuitry underlying complex tasks (DiCarlo and Maunsell, 2005; Hung et al., 2005; Buschman and Miller, 2007; Monosov et al., 2008). The key methodological innovation in the current paper is employing the AccLLR framework, which allows single-trial decoding of latencies from spike/LFP data (Fig. 5). Using AccLLR, we were able to evaluate latencies statistically within one session as well as compare them across sessions and thereby enhance the statistical power of our results. A somewhat similar approach based on the computation of a “surprise index” was proposed earlier by Hanes and colleagues (Hanes et al., 1995). For comparison, we also performed the latency analysis by applying the commonly used method employing a ms-by-ms rank sum test (Fig. 4). Comparison of Figures 4, 5 (AccLLR results) illustrate a dramatic improvement in statistical significance of results for the trial-by-trial analysis. The trial-by-trial analyses as well as pooled trial analysis (accuracy matched) confirm the pattern of results reported by Kikuchi et al. (2014): spike onset latencies were shorter in A1 than in LB for PTs but close to each other for Coos. Error rates from decoded LFPs were higher than corresponding spike-analysis sessions, although across sessions decoding was better than chance, indicating the robustness of the information contained in LFPs. Robust decoding using LFPs was also reported in earlier studies (Hung et al., 2005; Markowitz et al., 2011; Bansal et al., 2012).
Functional neural circuitry underlying auditory processing
A central aim of the current study was to compare latencies of spike and LFP responses in two different contexts, at onset and during neuronal selection. Latencies were compared across stimuli (PT vs Coo) to investigate the stimulus-specific components. A key result from AccLLR analyses (both trial-by-trial and performance matched) was the nearly identical LFP onset latency in A1 for PTs and Coos and the very similar onset latencies in LB for these two stimulus categories (Figs. 4, 5). If we consider LFPs to be coupled more to inputs, the information related to the presence of an auditory stimulus in the environment arrives at both brain areas simultaneously. Previous studies demonstrated that A1 and LB receive inputs in parallel from subcortical structures, which may be the reason that there is little difference in LFP onset latencies across the two areas (Rauschecker et al., 1997; de la Mothe et al., 2006). In the case of sensory areas, where feed-forward connections dominate, relative spike latency can indicate a putative area’s contribution to information processing (VanRullen et al., 2005). In our findings, spike onset latency was usually longer than LFP onset latency in agreement with previous studies in sensory areas (Eggermont, 1998; Sundberg et al., 2012). We observed that the spike onset latency computed from trial-averaged data are shorter in A1 than in LB for PTs but not for Coos. The also followed this trend. Interestingly, spike onset latency for Coo in LB was shorter than spike onset latency for PT using both single trial and performance matched AccLLR analyses (although a clear trend was observed in the latter analysis that matched trial-by-trial results, the latency differences were not significantly different). This validates the view that the auditory cortex is organized into lower-order sensory areas (e.g., A1), relevant for coding simple features such as fundamental frequencies, and (relatively) higher-order LB areas for coding more complex auditory features (Rauschecker et al., 1995; Kikuchi et al., 2010). On the other hand, spike onset latencies for Coo in A1 and LB were not significantly different. This suggests that complex signals require more distributed resources for processing. The aforementioned findings were replicated when statistical analysis was applied to the data from each monkey individually (Fig. 5).
An important point to note here is that the single-trial latencies detected by AccLLR analysis are typically longer than trial-averaged latencies or ones obtained from pooling all trials and setting detection accuracy to 100%. In an earlier stimulus onset latency detection study, Banerjee et al. (2010) showed that latencies computed from trial-averaged AccLLRs can decrease by 15 ms at the expense of an increase in false-alarm rates. In our study, only the LFP onset latencies were very close among trial-averaged, pooled trial AccLLR and single trial AccLLR results. For spike onset latencies, the differences were maximal between trial-averaged and AccLLR measures and same pattern was followed in latency distributions from pooled trials. This indicates that LFPs may have the least variability in recording the presence of an auditory stimulus, and such tight time-locking is most likely due to the subcortical nature of the stimulus processing before it arrives in A1.
Area-specific properties in processing differences between stimuli can be investigated using selection latencies. Shorter LFP selection latencies in A1 compared to LB may reflect the hierarchical organization of these areas vis-à-vis stimulus-specific processing, e.g., dissociating simple (PT) from complex (Coo; Fig. 5). For both trial-averaged and trial-by-trial analysis, spike selection latency in LB was shorter than spike selection latencies in A1, indicating a stronger role of LB in processing stimulus-specific features. Combining this finding with the results from the onset latency analysis, we can dissociate the function of the two brain areas in computing different components of information processing in an environmental signal, i.e., just the presence of sound versus the detailed features of that sound. We did not observe a stimulus-specific difference, PT compared to Coo, in LFP selection latency in the two areas (p = 0.57 in A1, p = 0.01 in LB, latencies reported in Table 1). The effect was robust when the analysis was performed in individual monkeys as well as when latencies were computed by pooling all trials and applying the AccLLR framework (Fig. 5), although it was not present in the trial-averaged analysis from raw time series (Fig. 4). We thus conclude that at least some stimulus-specific information arrives serially in these two brain areas, contrary to what we observed for LFP onset latency. An alternative possibility is that the lower-order auditory area A1 receives feedback projections from LB or other higher-order areas. Spike selection latency in A1 was longer than the LFP selection latency when both trial-averaged and trial-by-trial analyses were performed on individual monkey data as well as on the population data. When detection threshold was set at 100% in pooled trials this difference in spike-LFP selection latencies in A1 was not observed.
On the other hand, spike selection latency in LB was comparable to the LFP selection latency, although there is a slight variability in this result when one examines the data on individual monkeys (Fig. 5). Monkey1 exhibited the general trend of spike selection latency being longer than LFP selection latency, just as in the case of onset latencies. However, Monkey2 showed slightly shorter spike selection latencies than LFP selection latencies in LB (Fig. 5). Earlier research has established that A1 and LB have strong reciprocal connections (de la Mothe et al., 2006; Hackett, 2011). Together, these data raise the possibility that LB has a top-down preparatory role for selection-related processing, whereas A1 is primarily involved in bottom-up gating of sensory signals.
Future directions and limitations
Our study provides a design-analysis framework to support neurophysiological findings that could help address questions related to functional networks at both local area-specific scales and global interareal scales. Such studies would shed light on task-specific network mechanisms underlying complex behavior. One limitation of the current study is that it ignores the information about the endogenous neural states present in ongoing oscillations and how such processes affect extrinsic stimulus driven processing. A recent study has shown that neuronal areas separated across large distances whose activities are coherent may also exhibit lower latencies in information processing using AccLLR (Wong et al., 2016). The same framework could also be adapted to detect the timing of oscillatory-response onsets and phase differences from the electrical activity of nearby and distant populations. Finally, AccLLR can be applied to macroscopic neural recordings such as electroencephalogram (EEG), intracranial EEG, and magnetoencephalogram (MEG) to estimate network mechanisms and thereby inform a wider research community.
Acknowledgments
Acknowledgements: A.B. thanks Prof. Neeraj Jain (National Brain Research Center, India) for helpful comments. We thank Dr. Brian Scott for useful comments and for a detailed reading of an earlier version of this paper.
Footnotes
The authors declare no competing financial interests.
This work was supported by the Department of Biotechnology, Government of India Grants BT/RLF/Re-entry/31/2011 and BT/07/IYBA/2013 (to A.B.), by the National Institutes of Health/National Institute of Neurological Disorders and Stroke Grant R01 NS052494 and the National Science Foundation Grant PIRE OISE-0730255 (to J.P.R.), and by the Intramural Research Programs of National Institute of Mental Health (M.M. and Y.K.) and National Institute on Deafness and Other Communication Disorders (A.B., B.H., and Y.K.).
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.