Abstract
Responding to a stimulus requires transforming an internal sensory representation into an internal motor representation. Where and how this sensory-motor transformation occurs is a matter of vigorous debate. Here, we trained male and female mice in a whisker detection go/no-go task in which they learned to respond (lick) following a transient whisker deflection. Using single unit recordings, we quantified sensory-related, motor-related, and choice-related activities in whisker primary somatosensory cortex (S1), whisker region of primary motor cortex (wMC), and anterior lateral motor cortex (ALM), three regions that have been proposed to be critical for the sensory-motor transformation in whisker detection. We observed strong sensory encoding in S1 and wMC, with enhanced encoding in wMC, and a lack of sensory encoding in ALM. We observed strong motor encoding in all three regions, yet largest in wMC and ALM. We observed the earliest choice probability in wMC, despite earliest sensory responses in S1. Based on the criteria of having both strong sensory and motor representations and early choice probability, we identify whisker motor cortex as the cortical region most directly related to the sensory-motor transformation. Our data support a model of sensory encoding originating in S1, sensory amplification and sensory-motor transformation occurring within wMC, and motor signals emerging in ALM after the sensory-motor transformation.
Significance Statement
This study addresses the fundamental question of where within the neocortex a sensory stimulus representation transforms into a motor response representation during stimulus detection. We recorded and analyzed single unit activity of three cortical regions during a passive whisker detection Go/NoGo task in mice. Using quantitative assessments of sensory, motor, and choice encoding across these regions, we showed that a cortical region traditionally defined as whisker motor cortex is most directly related to the transformation process. In addition, our study shows how sensory and motor signals are amplified and propagated throughout cortex. These findings open up new directions to studying the cellular and circuit mechanisms of sensory-motor transformations.
Introduction
To accomplish goal-directed behavior, the brain selects task-relevant stimuli and outputs the appropriate motor responses. A crucial component of this process is the transformation of an internal representation of a sensory stimulus into an internal representation of a motor response. Identifying where this occurs is an essential first step in developing mechanistic understandings of this process. Correlates of sensory-motor transformations in neocortex have been identified in non-human primates (Kim and Shadlen, 1999; Shadlen and Newsome, 2001; de Lafuente and Romo, 2006; Siegel et al., 2015). More recent efforts are now underway to study sensory-motor transformations in mouse neocortex (Matyas et al., 2010; Guo et al., 2014; Zagha et al., 2015; Goard et al., 2016; Yang et al., 2016; Le Merre et al., 2018; Pho et al., 2018; Mayrhofer et al., 2019; Aruljothi et al., 2020; Salkoff et al., 2020), which benefits from less neocortical arealization and the application of novel genetic and physiological tools. However, despite these efforts, there is still no agreement on the location of the sensory-motor transformation.
In this study, we use two major criteria for localizing the site of transformation in mouse neocortex. Our first criterion is the coexistence of robust sensory and motor representations. This has been elegantly demonstrated in the primate lateral intraparietal (LIP) cortex during a visual discrimination task; early in the decision process LIP neurons encode sensory stimulus strength whereas late in the decision process this activity converges to the anticipated response (Roitman and Shadlen, 2002). Regions with only sensory or only motor representation could be upstream or downstream, respectively, of the transformation process, but cannot mediate the transformation.
Our second criterion is early and robust choice probability (Britten et al., 1996; de Lafuente and Romo, 2006; Crapse and Basso, 2015). Choice probability is a measure of the relationship between neural activity and a behavioral response, independent of stimulus content (Britten et al., 1996). For identical stimulus and behavioral conditions, choice probability is significant only after the initiation of the transformation process. Notable primate studies using multisite recordings during sensory-motor task performance compared the onset and magnitude of choice probability across multiple cortical regions (de Lafuente and Romo, 2006; Siegel et al., 2015). Regions showing early and robust choice probability are more likely to be initiating the transformation; conversely, regions showing late choice probability are likely reflecting transformations that occurred elsewhere.
We studied a sensory-motor transformation in the context of sensory detection, in which mice learned to respond (lick) following the presence of a transient whisker deflection stimulus. In a recent study using widefield calcium imaging of dorsal neocortex, we identified the following regions as potentially contributing to the transformation process by expressing robust activity between stimulus onset and response: whisker representation of primary somatosensory cortex (S1), whisker region of primary motor cortex (wMC), and anterior lateral motor cortex (ALM; Aruljothi et al., 2020). Previous studies of similar sensory-motor pairings (whisker stimulus→lick) provide partial support for the transformation occurring within each region. S1 shows robust sensory encoding (Stüttgen and Schwarz, 2008; O'Connor et al., 2010; Wang et al., 2012; Sachidhanandam et al., 2013), can evoke motor responses (Matyas et al., 2010), and displays significant choice probability (Sachidhanandam et al., 2013; Yang et al., 2016; Chen et al., 2016; Kwon et al., 2016). wMC shows robust sensory and motor encoding (Ferezou et al., 2007; Huber et al., 2012; Zagha et al., 2015) and displays neural dynamics consistent with linking a sensory stimulus to a motor response (Zagha et al., 2015). ALM shows robust motor encoding (Li et al., 2015; Chen et al., 2017) and displays neural dynamics consistent with motor planning (Inagaki et al., 2018). Moreover, acute perturbation of all three regions impairs whisker detection (Huber et al., 2012; Guo et al., 2014; Zagha et al., 2015; Yang et al., 2016). However, previous studies have not compared sensory-related, motor-related, and choice-related content across all three regions in the same task. Moreover, it is critical that such studies are conducted with sufficient temporal resolution to determine the precise timing of these signals in each region.
In this study, we measured single unit spiking activity in S1, wMC, and ALM during a whisker detection task. Based on analyses of sensory and motor encoding and choice probability, we find that activity in wMC is most correlated with a sensory-motor transformation process.
Materials and Methods
Subjects
Animals and experiments were approved by the IACUC of University of California, Riverside. Both male and female, adult mice were used in the experiments, of C57BL/6J or BALB/c backgrounds (age: mean ± standard deviation (SD): 145 ± 45 d old at the time of recording experiments). The mice were kept in a 12/12 h light/dark cycle, and the experiments were conducted predominantly during the light cycle.
Surgery
Mice were anesthetized using an induction of ketamine (100 mg/kg) and xylazine (10 mg/kg) and maintained under isoflurane (1–2%) anesthesia. A 10 × 10 mm portion of the scalp was removed and a lightweight metal headpost was attached to the skull using cyanoacrylate glue. The headpost includes an 8 × 8 mm central window, leaving the skull over dorsal cortex exposed. The exposed skull was sealed with a thin layer of cyanoacrylate glue and covered with silicone gel. Mice were treated with meloxicam (0.3 mg/kg) and enrofloxacin (5 mg/kg) on the day of the surgery and for two additional days after the surgery. After recovery from surgery for a minimum of 3 d, water restriction was initiated, and the mice were introduced to the behavioral task.
Behavior
MATLAB software and Arduino boards were used to control the behavioral task flow. The mice were head-fixed in the setup during a behavioral session. Piezoelectric benders with attached paddles were placed within the whisker fields bilaterally. One side was assigned as target and the other as distractor at the onset of training and remained consistent throughout training and recording. The location of the paddles was in the mid-ventral whisker fields (targeting D2/E2-D3/E3 whiskers), with movement in the caudal direction of 1 mm for our largest stimuli. A voltage generator (Thorlabs) was used to drive the piezo benders. Whisker deflections were triangular waves of 2–200 ms. Amplitude and velocity of deflections were varied to operate within the dynamic range of each mouse’s psychometric curve. In any single recording session, two stimulus amplitudes were applied: one near the saturation of the psychometric curve and one 2× or 4× lower within the dynamic psychometric range. For each session, equal strength stimuli were presented for target and distractor trials. Licking responses were detected by an infrared beam break circuit positioned immediately in front of a central lickport. Reward was ∼5 μl of water. Mice were trained in three stages, progressing from (1) classical condition to (2) operant conditioning to (3) the full task with punishment for incorrect responses (for training details and learning trajectories, see Aruljothi et al., 2020). Intertrial intervals (ITIs) varied from 6 to 10.5 s, drawn from a decreasing exponential distribution, to correct for an expectation hazard function and thereby minimize a timing strategy (Elithorn and Lawrence, 1955). Trial types consisted of target trials (deflection of the target paddle), distractor trials (deflection of the distractor paddle) or catch trials (no stimulus). The initial percentages of each trial-type were set as follows: target trials 15%, distractor trials 60%, catch trials 25%. Immediately following stimulus onset was a lockout period of 200 (46 sessions) or 300 ms (eight sessions). Licking during the lockout period resulted in aborting the current trial. Following the lockout period was a 1-s response window. Responses within the response window following target stimuli (hits) were rewarded with a fluid reward. Withholding (not responding) on a distractor trial was rewarded with a shortened ITI (1.4- to 3.1-s distribution) and subsequent target trial. In expert mice, the percentage of target and distractor trials were similar across each session. All licking outside the posttarget response window (including during the ITI) was punished by resetting the ITI. Behavioral sessions typically lasted between 1 and 2 h, which included 200–400 trials. Mouse weights were maintained above 85% of their initial weights by either receiving all the water from task or receiving additional water and wet food after the task.
Engagement period
For behavioral and recording analyses, the trials were truncated to engaged periods using a gap of 60 s as a disengagement criterion. For sessions with more than one engaged period, the longest continuous bout was used for further analyses. Sessions without continuous engagement for 10 min were excluded from further analysis. We also excluded the trials in which the mice responded prematurely (licking during the lockout period).
Behavioral analysis
For behavioral metrics, hit rate was obtained by dividing the number of hits by the total number of target deflection trials. Spontaneous rate was obtained by dividing the number of catch trials containing spontaneous licking during the equivalent response window by the total number of catch trials. For the sessions that did not contain catch trials (n = 10 out of 54 sessions), the 1-s prestimulus response rate was used as a replacement of the spontaneous rate. For the purpose of d-prime calculations, response rates of 0 and 1 were estimated at 0.01 and 0.99, respectively. Behavioral d-prime and criterion were calculated as following (Swets, 1961):
where is the inverse ϕ function which outputs the z score of the input rates. Electrophysiological recordings were conducted immediately after the mice reached expert status. For behavioral performance measures during the electrophysiological recording sessions, see Figure 1D.
Electrophysiology
All of the recordings were obtained from 19 mice. On average, three sessions were recorded from each mouse (range 1–9). Craniotomies and durotomies of <0.5 mm in diameter were established on the day of recording, under isoflurane anesthesia. After 30–60 min postsurgery, mice were tested in the behavioral task without electrode implantation to ensure recovery to normal behavior. Upon evidence of normal expert behavior, a silicon probe (Neuronexus A1x16-Poly2-5 mm-50s-177) was advanced into the brain using a Narishige micro-manipulator under stereoscope guidance. Recording sites were targeted to the barrel field of S1, wMC, and ALM, based on the functional mapping studies of Aruljothi et al. (2020). Precise coordinates (mm, from bregma): S1 3.2–3.7 lateral, 1–1.5 posterior; wMC 0.5–1.5 lateral, 1–2 anterior; ALM 1–2 lateral, 2–2.5 anterior. We positioned the recording sites to target layer 5, from 500 to 1000 μm below the pial surface (midpoint of the silicon probe recording sites, mean ± SD, S1: 650 ± 68 μm; wMC: 647 ± 145 μm; ALM: 692 ± 84 μm).
Whisker alignment for S1 recordings was verified by two methods. First, after electrode implantation we verified correct alignment by hand mapping of several individual whiskers and observing LFP responses. Second, we only included sessions with clear peaks in the combined multiunit poststimulus time histogram (peak response >1.4× above baseline within 40 ms poststimulus). In contrast, inclusion of wMC and ALM sessions were based solely on anatomic location. For wMC, we targeted our recordings to the subregion that displays the earliest onset sensory responses (Matyas et al., 2010), which correlates with anatomic projection sites from whisker primary somatosensory (barrel) cortex at the transition zone between agranular medial and agranular lateral cortices (Smith and Alloway, 2013).
Electrophysiology preprocessing and spike sorting
Neuralynx software was used for data acquisition and spike sorting. Electrophysiological signals were sampled at 32 kHz; wideband signals were bandpass filtered from 0.1 to 8000 Hz, and signals for spike sorting were additionally high pass filtered at 600–6000 Hz. Putative spikes were identified as threshold crossings over 20–40 μV, set at the beginning of each recording session to be well isolated from baseline noise. Spike sorting and clustering was done offline using KlustaKwik algorithm in SpikeSort3D software. The clusters were further manually inspected and merged based on the similarity of waveform and cluster location in peaks and valleys feature space; clusters indicative of movement artifacts (non-spike waveform, equal amplitude in all channels) were removed. We used isolation distance (ID) and L ratio to verify cluster quality (Schmitzer-Torbert et al., 2005; mean ± SEM: ID 15.6180 ± 0.7378, L ratio 0.2312 ± 0.0172). Clusters were rejected if the spike rate was lower than 0.1 Hz. The number of units included in each recording session (mean ± SD): S1: 18 ± 5, wMC: 26 ± 4, ALM: 24 ± 4. Further data analyses were conducted using MATLAB software (MathWorks). Spike times were binned within 5 ms non-overlapping bins. Reaction time (RT) binning in Figure 2 used the following bins: (ms) sensory: fast RT 201–249, medium RT 250–330, slow RT 335–547; sensory-motor: fast RT 322–374, medium RT 374–447, slow RT 459–982; motor: fast RT 301–326, medium RT 329–399, slow RT 407–1240.
Sensory encoding
Sensory encoding was quantified using a neurometric approach based on signal detection theory that enables the direct comparison of neural performance to behavioral performance (Britten et al., 1992; Stüttgen and Schwarz, 2008) Figure 3. For this analysis, the target and distractor trials were used regardless of their outcome (hits, misses, false alarms, and correct rejections). Data presented for sensory encoding used the larger of the two stimuli for neurometric and psychometric comparisons. “Stimulus present” data were spike counts within 100 ms immediately poststimulus; “stimulus absent” data were spike counts within three consecutive 100-ms epochs prestimulus onset. Distributions based on single trials were compared using receiver operating characteristic (ROC) curve, by plotting the cumulative distribution function of each distribution against each other. The area under the ROC (AU-ROC) was converted to neurometric d-prime as following (Simpson and Fitter, 1973):
AU-ROC was bounded by 0.003 and 1–0.003 for population encoding, to ensure the output of real numbers.
Combining units
In Figure 4, sensory encoding was calculated not only for single units, but also for different combinations of units. In Figure 4B, set 3, the spikes were summed together for each 5-ms bin across all the units recorded in a session. This results in a single multiunit per session, for which sensory encoding was calculated similar to single units. In Figure 4B, set 4, the spikes of all units in each region were combined across all sessions. To equalize the number of trials across sessions, the sessions with less trials had their trials duplicated and appended to the original trials to match the trial number of the session with the most trials; for the sessions with the trial numbers not a common divisor of the trial number in the longest session, the trials were randomly sampled (with replacement) from these sessions accordingly and added to that session to fill in. The sensory encoding for these combined units and trials were calculated similar to the previous cases.
Random sampling
In addition to combining all units from each session or region, we ran additional analyses to assess encoding for random sets of units (Fig. 4C,D). We randomly selected units to be added sequentially and computed d-prime values for each group, with group size spanning 1 to total number of units per region. We permutated this ordering and d-prime calculation 300 times and plotted the mean ± SD curve in Figure 4C. For the purpose of neurometric-psychometric comparison, we transposed the data by creating a histogram in d-prime bins (bin width of 0.02 spanning 0–4.5), with the entries (dependent variable) as the neuronal pool size. Mean ± SD for the neuronal pool size needed to achieve a specific d-prime is plotted in Figure 4D.
Sensory-motor alignment
A common method used to assess sensory and motor content is to determine the temporal alignment of neural activity to stimulus and response onsets (Mountcastle et al., 1975; Hanes and Schall, 1996; Romo et al., 2002). To quantify sensory and motor content, we used a similar neurometric approach as described above. Because the motor alignment requires responding to a stimulus, we only considered hit trials in this analysis. For sensory alignment, we used the same 100-ms poststimulus window as for sensory encoding. For motor alignment, we used a 100-ms preresponse window. Both conditions were compared with the same prestimulus baseline as described above.
Latency estimation
Latencies of activation after the stimulus onset was estimated by using a 20-ms sliding window (75% overlap) poststimulus, comparing to a prestimulus baseline, for all target trials. Baseline activity was the average activity in 20-ms sliding windows (75% overlap), during the 1-s prestimulus epoch. We excluded the first 10 ms after the stimulus onset because of possible contamination with stimulus artifacts.
Choice probability
Choice probability was calculated as the separation of neural activity on hit versus miss trials. All spikes from each session were combined to increase spike density for comparisons. To ensure an adequate number of trial types and ensure valid comparisons: (1) we calculated the hit rate for small and large amplitude stimuli separately (2) if the difference between those was below 15%, trials from both types of stimuli were pooled together (3) if the difference was above 15%, the stimulus type with larger number of hits and misses were considered (4) all sessions with fewer than five misses were removed. The total number of trials used for each session are (max/mean/min) 89/24.9/5 (hits) and 102/22.7/5 (misses). We used the AU-ROC method along sliding time windows to calculate choice probability as the separation between spiking distributions on hit versus miss trials. The duration of the sliding window was set to 50 ms with 90% overlap.
General statistics
We used permutation statistics for comparing sensory-motor variance and slope differences (10,000 repetitions). We shuffled the units between the conditions (for instance, S1 and wMC sensory encoding d-primes), and we pooled two new putative sets and calculated the difference in variable of interest (for instance, variance). Then we assessed the position of the actual variance difference among these 10,000 repetitions and we reported the p value as the proportion of the repetitions above the actual variance (two-sided calculation). For each repetition, the shuffling was done by randomly sampling from each condition, with replacement, then mixing the samples. The new putative sets were set to have equal numbers of samples from each condition (half of each condition’s initial size). For comparing random sampling results, Cohen’s d was used by dividing mean difference of the two groups by their pooled standard deviation. For calculating significant choice probability within each region across sessions, for each time window, we calculated a one sample t test between the reported choice probability and chance level (0.5; p = 0.01). For comparing choice probability amplitudes across regions, we used unpaired t tests with an α level of 0.01. For latency estimation, paired t test was used (between each 20-ms window and baseline) with α level of 0.05. Data are presented as mean ± SEM unless otherwise indicated.
Results
Behavioral task and electrophysiological recordings
Head-fixed mice were trained to perform a whisker detection go/no-go task in which they learned to lick a lickport following a transient whisker deflection in one whisker field (target) to obtain a fluid reward (Fig. 1A). Stimuli were piezo-controlled caudal deflections of a paddle contacting multiple whiskers. We imposed a minimum lockout period of 200 ms between stimulus onset and response window to separate sensory from motor encoding. Trials were aborted if any responses occurred during the lockout period. Target trials were interleaved with two other trial types: distractor trials, in which there was a transient deflection of the same amplitude in the opposite whisker field, and catch trials, in which there was no stimulus deflection (Fig. 1B,C). Mice were considered expert in this task once they achieved a detection d-prime (separation between hit rate and spontaneous rate) >1 for three consecutive days. Electrophysiological recordings were conducted in expert mice while performing the detection task. For the recording sessions included in this study, the behavioral performance measures: hit rate 88.0 ± 2.3%; spontaneous rate 15.7 ± 1.4%; d-prime 2.6 ± 0.1 (n = 54 sessions from n = 19 mice; Fig. 1D).
Based on a concurrent widefield calcium imaging study (Aruljothi et al., 2020), we targeted our electrophysiological recordings to three cortical regions contralateral to the target whisker field: whisker representation of S1, wMC, and ALM. Each of these regions were significantly active poststimulus and preresponse (Aruljothi et al., 2020), and therefore may contribute to the sensory-motor transformation process. We used silicon probes with contact sites spanning layer 5 to record multiple single units in each region (S1: 445 units, 25 sessions, eight mice; wMC: 424 units, 16 sessions, nine mice; ALM: 315 units, 13 sessions, eight mice). To establish the functional hierarchy of these regions, we calculated poststimulus response latency for each session. Latency measurements are consistent with the functional ordering of S1→wMC→ALM (mean ± SD: S1 30 ± 15 ms, wMC 48 ± 28 ms, ALM 95 ± 43 ms; ANOVA F(2,51) = 23.56, p < 0.01; Tukey’s post hoc comparison: S1-ALM p < 0.01, wMC-ALM p < 0.01, S1-wMC p = 0.11).
Justification of “sensory” and “motor” temporal windows
Next, we quantified the sensory and motor content within each cortical region. We used spiking activity within specific time windows to assess putative sensory content (100 ms poststimulus onset) and putative motor content (100 ms preresponse onset). To justify our time windows, we present three example neurons in Figure 2 with robust sensory, sensory-motor, and motor context, respectively. The sensory unit shows robust alignment to the stimulus onset as a sharp peak in the average spiking activity across trials (Fig. 2A, left column). In contrast, this unit lacks a sharp peak in average activity when aligned to the response (Fig. 2A, right column). This is further apparent when grouping the trials based on RTs (Fig. 2A, bottom row): peak activity levels overlap regardless of RT when aligned to stimulus onset, whereas peak activity levels vary according to RT when aligned to the response. Note that this time-locked sensory activity occurs within the first 100 ms poststimulus. On the other hand, the motor unit shows prominent alignment to the response (Fig. 2C, right column) with activity that is delayed when aligned to the stimulus onset (Fig. 2C, left column). In further contrast with the sensory unit, peak activity levels in the motor unit overlap when aligned to the response but not to the stimulus onset (Fig. 2C, bottom row). Note that this time-locked motor activity peaks within the last 100-ms preresponse. The sensory-motor unit shows a mixture of both features, with sharp, transient activity aligned to the stimulus followed by activity that is sustained until the response (Fig. 2B).
Single unit and population sensory encoding across cortical regions
We show in Figure 3A the average spiking activity for an example unit from target-aligned S1, contralateral to the target whisker field. On target trials (Fig. 3A, blue), this unit displayed a prominent increase in spiking immediately after stimulus presentation, followed by a lower level of persistent activity. Spiking activity on distractor trials (Fig. 3A, black), in contrast, appeared only slightly elevated from prestimulus levels. In order to quantify stimulus encoding, we used the neurometric d-prime approach (Fig. 3B–E), which accounts for single trial variability and allows for comparisons between neuronal performance and behavioral performance (Britten et al., 1992). We compared trial by trial distributions of prestimulus and poststimulus spiking activities (Fig. 3B,C). For the poststimulus condition, we included spikes in the first 100 ms poststimulus onset. We calculated the d-prime value of each unit from area under the curve (AUC) of the ROC function (AU-ROC) between the prestimulus and poststimulus distributions (Fig. 3D). For this analysis, a d-prime greater than zero indicates higher spiking activity poststimulus compared with prestimulus. Plotting the d-prime values for all units in this S1 recording session (Fig. 3E) shows target versus distractor stimulus encoding across the population. As shown in this example session, target stimulus encoding is highly variable yet positively skewed across these units, whereas distractor stimulus encoding is considerably more restricted.
Shown in Figure 4A is target and distractor stimulus encoding for S1, wMC, and ALM across all recorded neurons, indexed to the average behavioral performance of the mice during the corresponding recording sessions. Both S1 and wMC neurons showed prominent target stimulus encoding across their populations. ALM neurons, in contrast, showed minimal target stimulus encoding. We analyzed these data with both single unit and population approaches (Fig. 4B–D). First, we compared the group means of single unit target encoding across these three regions (Fig. 4B, set 2). We found mean target stimulus encoding to be significantly higher for S1 and wMC compared with ALM, and, interestingly, for wMC to be significantly higher than S1 (S1: 0.23 ± 0.02; wMC: 0.37 ± 0.03; ALM: 0.03 ± 0.01; ANOVA F(2,1181) = 47.036, p < 0.01; Tukey’s post hoc comparison: S1-ALM p < 0.01, wMC-ALM p < 0.01, wMC-S1 p < 0.01; effect size: wMC 61% larger than S1). Additionally, we compared the summed spiking from multiple neurons in each trial [summed within each recording session (Fig. 4B, set 3) and summed across all units within each region (Fig. 4B, set 4)]. When combined across each population, the neurometric d-prime for S1 and wMC, but not ALM, outperformed the behavioral d-prime.
To quantify population coding within each region, we randomly sampled different numbers of units in each region and plotted the resulting neurometric target d-prime values (see Materials and Methods; Fig. 4C). As the number of sampled units increased, the target d-prime increased well beyond behavioral performance for S1 and wMC, but not for ALM. Furthermore, this trend rose faster for wMC than for S1 (Fig. 4C). To perform neurometric-psychometric comparisons, we first transformed the data across axes (Fig. 4D). This allowed us to assess, for each region, the mean and variance in neural pool size that outputs each d-prime value. Next, we determined the neural pool size needed to match the average behavioral d-prime values from the same sessions. Both S1 and wMC populations were able to match behavioral performance (Fig. 4D, red arrows), whereas the ALM population was not. Moreover, fewer units were required to match behavioral performance for wMC compared with S1 (mean ± SD: S1 95 ± 44 units; wMC 52 ± 25 units; Cohen’s d = 1.2).
We also used an additional method to quantify sensory encoding. Instead of using a fixed 100-ms window, we replicated the above analyses for a 20-ms window of peak sensory encoding for each recording session. The peak window analyses also demonstrated larger single unit d-prime values in wMC compared with S1 (average d-prime: S1: 0.16, wMC: 0.22, p < 0.01, Tukey’s post hoc comparison) and fewer wMC neurons needed to match behavioral performance (mean ± SD: S1 199 ± 84 units; wMC 121 ± 56 units; Cohen’s d = 1.0). Altogether, these results demonstrate robust sensory encoding in S1 and wMC but not in ALM, with increased sensory encoding in wMC compared with S1.
Sensory and motor alignments across cortical regions
Next, we sought to assess the sensory versus motor alignment across these three cortical regions. For quantification, we used a similar neurometric d-prime method as above, yet for only hit trials and for both sensory and motor alignments (Fig. 5). For sensory alignment, we again analyzed spiking within 100 ms following stimulus onset; for motor alignment, we analyzed spiking within 100 ms preceding the RT (Fig. 5A). Because of our imposed lockout between stimulus onset and response window, these analysis epochs did not overlap. In Figure 5B, we plot the sensory versus motor alignment for each unit across all three regions. Interestingly, much of the variance of the S1 and wMC populations lies along the diagonal, indicating equal sensory and motor alignment in these regions. In contrast, the variance of the ALM population is largely along the x-axis, indicating predominant motor alignment. We quantified this by calculating a sensory-motor variance ratio: variance along sensory axis divided by variance along motor axis. Indeed, this variance ratio was similar for S1 and wMC, and both were significantly larger than ALM (variance ratio: S1 = 0.90, wMC = 0.86, ALM = 0.09; permutation statistics, S1-wMC, p = 0.79, S1-ALM, p < 0.01, wMC-ALM, p < 0.01). We also used the mean and variance of each alignment as measures of representation and compared these values within and between regions (Fig. 5C,D, respectively). Similar to the findings depicted in Figure 4, we found that sensory representation increased from S1 to wMC, then fell dramatically in ALM. Additionally, we found that motor representation increased from S1 to wMC and ALM, with similar means and variance in wMC and ALM.
The above analyses support the observation that S1 and wMC show both sensory-aligned and motor-aligned content, and therefore meet our first criterion for identifying the location of the sensory-motor transformation. ALM, in contrast, shows only motor-aligned content, which we interpret as being downstream of the transformation process.
Choice probability across cortical regions
To determine the temporal onset of activity related to the sensory-motor transformation, we calculated choice probability across time for each of the three regions. Choice probability quantifies the separation between hit and miss trials, thereby isolating response-related activity (Britten et al., 1996; de Lafuente and Romo, 2006; Crapse and Basso, 2015). According to our second criterion, the region with early and robust choice probability is most likely to initiate the sensory-motor transformation. For these analyses, we combined spikes from all units within each recording session to enhance spike density per comparison. Figure 6 shows the average spiking activity on hit and miss trials from three example sessions (Fig. 6A–C) and across all recording sessions (Fig. 6D–F). All panels show higher activity on hit trials during some portion of their poststimulus response, indicating positive choice probability. However, there are notable differences between regions. The S1 and wMC data show increased activity on hit trials immediately poststimulus and during the response window. However, the separation of hit and miss activity appears to be larger and more sustained for wMC. In contrast, the ALM data show poststimulus responses only on hit trials, which emerges gradually after stimulus onset.
We calculated choice probability in 50-ms sliding windows across sessions for each region (Fig. 7A). All three regions showed significant increases in choice probability poststimulus (one-sample t test, comparing to chance level at 50% and α level of 0.01; Fig. 7B, gray bars), indicating higher spiking rate on hit trials. Interestingly, S1 additionally showed significant negative choice probability prestimulus (Fig. 7B, purple bars), indicating that lower spike rates immediately before the stimulus onset predicts a hit response. For all three regions, significant poststimulus choice probability preceded the RT, which was always >200 ms because of our lockout period. However, significant poststimulus choice probability emerged earliest in wMC compared with S1 and ALM (S1 165 ms, n = 21 sessions; wMC 70 ms, n = 13 sessions; ALM: 175 ms, n = 9 sessions). Notably, choice probability latencies are not merely reflections of neural activity latencies of these regions (Fig. 7B, red bars); while stimulus response latency was earliest in S1, choice probability emerged earliest in wMC.
To compare amplitude and time course, in Figure 8, we overlay choice probability signals from all three regions. After stimulus onset, choice probability rose faster in wMC compared with S1 and ALM (Fig. 8A). We further assessed differences in choice probability magnitude in each time window by conducting pairwise comparisons between regions (Fig. 8B). Choice probability was significantly larger during the poststimulus lockout window in wMC compared with S1 and ALM (two-sample t test, α level of 0.01). Based on these analyses, wMC meets our second criterion for identifying the location of a sensory-motor transformation, in displaying early onset and robust choice probability.
Discussion
The focus of this study is to localize within neocortex the region most directly related to the sensory- motor transformation process. This was studied in a whisker detection task, in which mice were trained to respond to passive whisker deflections by licking a central lickport. Our recordings within the neocortex focused on three regions which have been identified in a recent calcium imaging study (Aruljothi et al., 2020) as potentially contributing to the transformation. Our analyses indicate wMC as the cortical region most directly related to the transformation processes based on having the strongest sensory encoding (Fig. 4), robust sensory and motor alignment (Fig. 5) and early and robust choice probability (Figs. 7, 8). Our findings are consistent with sensory integration occurring between S1 and wMC, sensory-motor transformation occurring within wMC, followed by the propagation of motor signals in ALM.
Choice encoding initiating downstream of primary sensory cortices has been demonstrated in studies of non-human primates (Romo et al., 2002; de Lafuente and Romo, 2006; Siegel et al., 2015) and studies of visual detection/discrimination in mouse (Goard et al., 2016; Pho et al., 2018; Salkoff et al., 2020). Our study is also consistent with this finding. However, our study and other studies of the mouse whisker system show significant choice encoding in S1 as well (Sachidhanandam et al., 2013; Kwon et al., 2016; Yang et al., 2016; Aruljothi et al., 2020). Choice encoding in S1 consistently occurs “late,” after the initial feedforward sensory peak activity (Sachidhanandam et al., 2013; Fig. 8). Our findings do not support S1 as initiating the sensory-motor transformation (Figs. 7, 8). We consider two possible causes for S1 choice encoding. First, S1 choice encoding may reflect feedback from choice signals originating in higher order cortices, such as wMC or S2 (Kwon et al., 2016; Yang et al., 2016), as has been described in non-human primates (Siegel et al., 2015). Alternatively, S1 choice probability measurements may not relate to choice encoding at all, but instead reflect re-afferent signals related to the behavioral response sequence. In a related study of the same task, we found whisking to increase ∼100 ms after stimulus onset, which preceded the onset of licking by ∼100 ms (Aruljothi et al., 2020). We report here significant choice probability in S1 at 160 ms, 60 ms after the onset of whisking. Since whisking is largely absent on miss trials (Aruljothi et al., 2020), re-afferent signals likely contribute to measures of S1 choice probability. In contrast, we find significant choice probability in wMC at 70 ms, 30 ms before the onset of whisking. Additionally, we find significant choice probability in ALM at 175 ms, 25 ms before the onset of licking. These neural and behavioral temporal latencies are consistent with the choice-related signals in wMC and ALM initiating the whisking and licking response sequence, respectively. wMC is a frontal region traditionally studied in the context of whisking initiation and modulation (Carvell et al., 1996; Kleinfeld et al., 1999; Hill et al., 2011). However, it is now certain that wMC has additional functions related to whisker sensory processing. wMC receives whisker sensory inputs (Farkas et al., 1999; Kleinfeld et al., 2002; Ferezou et al., 2007; Chakrabarti et al., 2008). In one study, sensory representations in wMC better matched perceptual reports than sensory representations in S1 (Fassihi et al., 2017). wMC may also mediate sensory selection, by attenuating the propagation of distractor stimuli (Aruljothi et al., 2020). The current study proposes an additional function of sensory-motor transformation, potentially mediated by winner-take-all dynamics in converting a transient, sensory stimulus into a sustained, motor response (Zagha et al., 2015).
We recognize that it is highly unlikely that the sensory-motor transformation occurs exclusively within neocortex. In particular, we suspect that, in our task, interactions between neocortex and striatum are essential for action selection and initiation (Frank, 2011). The question then is, what are the specific contributions of wMC to the transformation process? First, we propose that wMC contributes to sensory integration. An unexpected finding in this study is that sensory encoding is enhanced in wMC compared with S1. This finding is based on a larger average neurometric d-prime of wMC neurons and fewer wMC neurons required for the neurometric d-prime to match the psychometric d-prime of the same behavioral sessions (Fig. 4). This enhancement may occur by summing the spiking activity of random sets of S1 neurons, as simulated in our pooling analysis. Thus, a general function of wMC may be to integrate whisker sensory responses. The integrated sensory representations within wMC, rather than S1, may reflect the “decision variables” that ultimately drive behavior (Gold and Shadlen, 2007).
Less clear, however, are the contributions of wMC to response initiation. Previous studies that have suppressed this region and neighboring regions during sensory-motor tasks have reported variable effects on hit rates, but significant increases in false alarm rates (Narayanan and Laubach, 2006; Huber et al., 2012; Zagha et al., 2015; Goard et al., 2016; Kamigaki and Dan, 2017) or non-significant trends toward increased false alarm rates (Le Merre et al., 2018; Mayrhofer et al., 2019). In contrast to wMC suppression, S1 suppression consistently results in reduced hit rates (O'Connor et al., 2010; Miyashita and Feldman, 2013; Zagha et al., 2015; Le Merre et al., 2018; Mayrhofer et al., 2019). Thus, for wMC, we note an apparent contradiction between our neural recording data and these causal studies. We report strong positive choice probability, suggesting that wMC promotes response initiation. Yet these causal studies suggest that wMC suppresses response initiation. Resolving these contradictory findings is an important focus of future research.
Acknowledgments
Acknowledgements: We thank Dr. Hongdian Yang, Krista Marrero, and Krithiga Aruljothi for many helpful discussions throughout the project.
Footnotes
The authors declare no competing financial interests.
This work was supported by the Whitehall Foundation Research Grant 2017-05-71 (to E.Z.) and the National Institutes of Health Grant R01NS107599 (to E.Z.).
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.