Abstract
Impulsive individuals excessively discount the value of delayed rewards, and this is thought to reflect deficits in brain regions critical for impulse control such as the anterior cingulate cortex (ACC). Delay discounting (DD) is an established measure of cognitive impulsivity, referring to the devaluation of rewards delayed in time. This study used male Wistar rats performing a DD task to test the hypothesis that neural activity states in ACC ensembles encode strategies that guide decision-making. Optogenetic silencing of ACC neurons exclusively increased impulsive choices at the 8 s delay by increasing the number of consecutive low-value, immediate choices. In contrast to shorter delays where animals preferred the delay option, no immediate or delay preference was detected at 8 s. These data suggest that ACC was critical for decisions requiring more deliberation between choice options. To address the role of ACC in this process, large-scale multiple single-unit recordings were performed and revealed that 4 and 8 s delays were associated with procedural versus deliberative neural encoding mechanisms, respectively. The 4 and 8 s delay differed in encoding of strategy corresponding to immediate and delay run termination. Specifically, neural ensemble states at 4 s were relatively stable throughout the choice but exhibited temporal evolution in state space during the choice epoch that resembled ramping during the 8 s delay. Collectively, these findings indicate that ensemble states in ACC facilitate strategies that guide decision-making, and impulsivity increases with disruptions of deliberative encoding mechanisms.
Significance Statement
Investigating the neural basis of impulsivity has broad implications, from understanding decision-making to treating psychopathology. The role of the anterior cingulate cortex (ACC) in mediating impulsivity during delay discounting (DD) remains unclear; however, increasing evidence suggests a role in guiding decision-making strategies. Here we report that disrupting ACC activity when rats are required to use a deliberative decision-making strategy increases impulsivity. These findings provide evidence that ACC is uniquely important for guiding deliberative decision-making during DD.
Introduction
Delay discounting (DD) is a phenomenon that describes the tendency for temporally delayed rewards to lose value, and the extent of devaluation is an indication of cognitive impulsivity (Wittmann and Paulus, 2008). DD is an evolutionarily conserved behavior that has been observed in every species examined to date (Vanderveldt and Green, 2017). Importantly, extreme DD (i.e., high impulsivity) is a characteristic feature of several psychiatric disorders (see Peterson et al., 2015 for review). Therefore, identifying neural activity patterns that underlie decisions during DD has broad implications, from understanding decision-making to treating psychopathology.
Using strategies to guide decision-making can reduce impulsivity (Linsenbardt et al., 2017; Sweis et al., 2018). Strategy refers to a set of rules used to guide behavior during decision-making and can be identified by differences in choice patterns, such as alternating between choice options (Rich and Shapiro, 2009; Bissonette and Roesch, 2015; Hasz and Redish, 2018). Decisions influenced by DD are hypothesized to arise from the interaction of procedural and deliberative decision-making strategies, each of which have their own computational processes and neural correlates (Van Der Meer et al., 2012; McLaughlin and Redish, 2023). In sum, we define deliberative decision-making here as an effortful process to compare the value of available options, whereas procedural decision-making employs rules or strategies that limit the need for deliberation.
At both the psychological and neural level, deliberation has been conceptualized as a process where evidence for the best outcome is accumulated and terminates once a decision is made (Gold and Shadlen, 2007; Papale et al., 2012). The accumulation of information has been hypothesized to be facilitated by “ramping” activity in single neurons, where firing rates progressively increase up to the decision point (Maoz et al., 2013; Jahans-Price et al., 2014; Waskom and Kiani, 2018). In a DD task, when deliberating between immediate and delayed options, the value of the delayed option needs to be considered at the time in the future when it will be received (J. Peters et al., 2010; Kurth-Nelson et al., 2012; Papale et al., 2012). Deliberation therefore requires cognitive effort, unlike procedural decision-making which is guided by pre-established evidence (for reviews, see Van Der Meer et al., 2012; McLaughlin et al., 2021). In tasks that have a repeated, consistent trial structure, a procedural decision-making strategy is advantageous as it requires less cognitive effort by limiting the need to evaluate the options during each trial (Brouwer et al., 2010; McLaughlin and Redish, 2023). Therefore, determining how strategies impact impulsivity can provide insight into the mechanisms that underlie decision-making during DD.
Impaired function of the PFC is thought to contribute to high impulsivity, and, supporting this view, rats with altered anterior cingulate cortex (ACC) function and neurochemistry are both highly impulsive and lack behavioral correlates of strategy (Engleman et al., 2006; Beckwith and Czachowski, 2014; Linsenbardt and Lapish, 2015; Linsenbardt et al., 2017, 2019; De Falco et al., 2021; Timme et al., 2022; McLaughlin and Redish, 2023). However, of the studies that have examined ACC function during DD, mixed effects are observed (Cardinal et al., 2001; Churchwell et al., 2009; Loos et al., 2010; Feja and Koch, 2014; Sonntag et al., 2014; McLaughlin and Redish, 2023). These discrepancies may be attributable to differences in task parameters (e.g., delay duration) or the hypotheses about the role of the ACC in DD that guide the study.
Several studies have assessed the role of the ACC in value encoding as well as more abstract processes during DD. While value representations are observed in ACC, this region likely contributes more abstract processes during decision-making in DD (Laskowski et al., 2016; Sackett et al., 2019). In line with this view, neural correlates of switching between procedural or deliberative decision-making, initiation of deliberation, and the need to change strategy have been observed in ACC (G. J. Peters et al., 2013; Schuck et al., 2015; Powell and Redish, 2016; Schmidt et al., 2019; McLaughlin and Redish, 2023). Together, these data motivate the need to identify computations performed by ACC during DD to understand its role in impulsive decision-making.
In this study, we investigate the hypothesis that neural activity in ACC plays a critical role in limiting impulsive decisions via strategy selection when cognitive effort is required. To test this hypothesis, optogenetics and multiple single-unit neural recordings were accomplished in male rats performing an adjusting amount DD task.
Materials and Methods
Animals
Male Wistar rats were purchased from Envigo for optogenetic inhibition of ACC (n = 8) and awake-behaving electrophysiology in ACC (n = 10). Animals were acclimated for 3 d following arrival to the vivarium. A 12 h reverse light/dark cycle with lights off at 7:00 A.M. was utilized. Following acclimation, animals were single housed and given at least a week prior to testing. Animals were at least 70 d of age prior to testing and had ad lib access to food and water prior to food restriction/habituation. Animals were food restricted to 85% of their starting free-feeding weight and maintained under this condition throughout all experiments except immediately prior to and up to 7 d after surgery. All procedures were approved by the IUPUI School of Science Institutional Animal Care and Use Committee and were in accordance with the National Institutes of Health Guidelines for the Care and Use of Laboratory Animals.
Operant apparatuses
Eight standard one-compartment operant boxes (20.3 cm × 15.9 cm × 21.3 cm; Med Associates) inside of sound-attenuating chambers (ENV-018M; Med Associates) were used for both optogenetic and electrophysiology experiments in the habituation and shaping protocols. Each box contained left and right retractable levers on one wall, left and right stimulus lights positioned immediately above each lever, and an easily accessible pellet hopper positioned between these left and right devices. The opposite wall contained a house light and a tone generator (2,900 Hz) on the topmost position. One custom-built operant box (21.6 cm × 25.7 cm × 52.0 cm) was used to accommodate all electrophysiological experiments. Dimensions, stimuli (including house and cue lights), and retractable levers were all positioned to replicate the conditions of the standard operant boxes as closely as possible. The floor bars of the custom-built box were made of wood polls rather than metal and all metal components of the box were covered in a powder coating to reduce artifacts. In addition, two of the eight standard operant chambers were modified for optogenetic inhibition (see below, Stimulation and recording equipment, for additional information).
Behavioral procedures
Following single housing, animals were handled daily for a week. Animals were then habituated to the operant chambers and completed pretraining in the same manner described in previous work from our group (Linsenbardt et al., 2017) prior to beginning the DD task. The within-session adjusting amount DD procedure was a modified version of the procedure performed by Linsenbardt et al. (2017), which was adapted from Oberlin and Grahame (2009). Before beginning the DD task, the immediate and delay levers were assigned for each animal during shaping. Choosing the delay lever always resulted in the delivery of six sucrose pellets following a delay (0, 1, 2, 4, 8, or 16 s), while the immediate (adjusting) lever dispenses 0–6 pellets with no delay. The number of pellets dispensed by the immediate lever (i-value) on the first trial of each session was three pellets. Each delayed choice increased and each immediate choice decreased i-value by one pellet on the subsequent trial.
The sequence of events for a single trial is depicted in Figure 1A, for additional detail (Linsenbardt et al., 2017). Briefly, the back wall contained a house light, which signified the start of a trial and remained illuminated for 10 s. Subsequently, levers extend for the initiation, retract (1 s), and re-extend for the choice epoch. Sucrose pellets were then dispensed into the hopper followed by an intertrial interval. Then, 20 mg sucrose pellets were used for electrophysiology and 45 mg sucrose pellets were used for optogenetic experiments.
“Forced trials” were implemented for the immediate and delay levers, where two consecutive responses on the same lever would result in a forced trial for the nonchosen lever on the next trial (Fig. 1B). If an animal did not lever press for the forced trial, the forced trial would be repeated until the lever was pressed. There was no effect of forced trials on i-value (Fig. 1B).
The session terminated either after 30 choice trials or 35 min for the optogenetic experiment (all delays) and for the 0, 1, and 2 s delays in the electrophysiology experiment (in the standard operant chambers). When animals were moved from the standard operant boxes to the custom operant box for the electrophysiological recordings (4 and 8 s delays), sessions terminated after either 40 choice trials or 45 min using 20 mg sucrose pellets to maximize number of trials obtained while recording. The delays were completed in ascending order (0, 1, 2, 4, 8, 16 s) with a day off in between the start of each new delay. Eight to twelve sessions were given at the 0 s delay and four sessions at the 1 and 2 s delay. Nine sessions were completed at the 4, 8, and 16 s delays to account for optogenetic manipulation (Table 1). Inhibition (Laser ON sessions) occurred every other session following the first two sessions in order to acclimate animals to the new delay. Electrophysiological recordings during the 4 and 8 s delay were obtained until a viable signal was no longer apparent. Magnitude discrimination was determined at the 0 s delay in the standard operant chambers using the 45 mg sucrose pellets before animals were accepted for surgery with an exclusion criterion of 80% (4.8 pellets) of the maximum reward value (six pellets) for optogenetic inhibition and 70% criterion for electrophysiological recordings (4.2 pellets). The average value of the immediate lever over the last 10 choice trials was determined for the last 3 d of the 0, 1, and 2 s delay and was used to determine the indifference point of each animal. Animals that were included then either received surgery for optogenetic or electrophysiology experiments (see below, Surgical preparation and implantation, for detail).
Indifference points for both optogenetic and electrophysiology experiments were obtained by taking the average of the last 10 choice trials of each session for a given delay. For optogenetic inhibition, during delays with optogenetic manipulation (i.e., the 4, 8, and 16 s delays), the last 10 trials of each day for each condition (Laser OFF, Laser ON) were used to determine an indifference point for Laser ON versus OFF conditions at the 4, 8, and 16 s delays. Sessions 1 and 2 were excluded for the Laser OFF condition, as animals were becoming familiar with the new delay and indifferences points were not yet stable. The last 10 trials were taken for each animal for Laser ON and OFF sessions following session two to obtain indifference points. The average of the last 10 trials were used for calculating indifference points for the Laser OFF and Laser ON sessions. The rate of discounting was determined using the Mazur hyperbolic model (Eq. 1; Mazur, 1987):
Surgical preparation and implantation
For all surgeries, animals were placed inside a flow box and anaesthetized with isoflurane gas (2% at 4 L/h) until sedated, at which point they were placed in a stereotaxic frame and maintained on 1–3% isoflurane for the duration of the surgery. Artificial tears were then applied. Subsequently, fur was shaved and the skin at the incision site was sanitized with three rounds of both 70% EtOH and betadine before applying a local anesthetic (Marcaine; 5 mg/kg, s.c.). An anti-inflammatory (Ketofen; 5 mg/kg dose, s.c.) and antibiotic (Cefazolin; 30 mg/kg, s.c.) were injected at the nape of the neck (anti-inflammatory and antibiotic) before beginning the incision. Once the skull was exposed and cleaned of blood, bregma-lambda coordinates were identified. Prior to any implantation (probe or optic fiber), four stainless steel anchoring screws were inserted. Following insertion of either Cambridge Probes or optic fibers, a two-compound dental cement was used to adhere implants to anchoring screws. Following completion of surgical procedures, animals were maintained in a clean heated cage before being returned to the vivarium.
Opsin virus delivery and implantation of optic fibers
Two syringe pumps (Pump 11 Elite; Harvard Apparatus) were attached to each arm of the stereotaxic frame and loaded with 2 μl Hamilton syringes (7002KH, Hamilton). Coordinates for ACC viral injections occurred at a 20° angle and were as follows: +3.2 mm AP, +2.0 mm ML, −5.2 mm DV from the bregma. Holes were drilled into the skull to allow the Hamilton syringes to penetrate the brain tissue. Animals then received bilateral injections of 0.65 μl at a flow rate of 0.2 μl/min of the inhibitory Adeno-associated virus (AAV-CaMKIIa-eArchT3.0-EYFP; K. Deisseroth via UNC Vector Core) followed by 10 min of diffusion before retracting the Hamilton syringes. Subsequently, animals received fiber implantation of dual fiber-optic cannulas with guiding sockets (DFC_200/245-0.37_3.3mm_GS1.4_FLT; Doric Lenses).
Electrophysiological probe implantation
A rectangular craniotomy was performed over the right hemisphere of ACC (AP: 2.8, ML: 0.3 from the bregma) followed by a durotomy and cleaning/hydration of the probe insertion site with a sterile saline solution. Additionally, two ground screws were placed above the cerebellum. A Cambridge NeuroTech F (n = 5), p (n = 4), or E-series (n = 1) 64-channel silicon probe on a movable drive (Cambridge NeuroTech) was lowered to the target site. Mobility of the movable drive was maintained with a coating of antibiotic ointment.
Stimulation and recording equipment
Optogenetic stimulation
A green (532 nm) laser (MGL-FN-532-300mW; Ultralasers) operated through Med Associates Programming via a TTL (Med Associates) was utilized for stimulation. From the fiber coupler, a mono patch cord (MFP_200/240/900-0.22_1m_FC-FC; Doric Lenses) was attached and traversed the sound-attenuating chambers terminating at the rotary joint (FRJ_1 × 1_FC-FC; Doric Lenses) which attached a Branching Fiberoptic Patchcord (BFP(2)_200/240/ARMO-0.22_0.5m_FCM-GS1.4; Doric Lenses) that was the terminal connection to the animal via guiding socket at the top of the animal's skull. Stimulation did not occur in pulses and remained on for the duration of the epoch to prevent rebound depolarization of cells. Stimulation at the tip of the fiber measured ∼21 mW resulting in predicted irradiance of ∼60 mW/mm2 at the fiber tip. Larger irradiance values were opted for in order to traverse the entire region of the ACC with only one fiber per hemisphere.
Optogenetic inhibition (Laser ON) occurred at one of two different epochs during the task for a given session (Epoch 1 inactivation or Epoch 2 inactivation; Table 1 and Fig. 1A). Epoch 1 stimulation occurred from the start of a given trial and terminated once an animal initiated the trial (Fig. 1A, top). Stimulation remained on if the animal omitted initiating the trial until a response on an initiation lever was made. Epoch 2 stimulation occurred as soon as the animal initiated a trial and terminated once a choice was made (Fig. 1A, top). Stimulation remained if the choice was omitted until a choice was made on subsequent trials. Stimulation occurred on the third, fifth, seventh, and ninth session/day of the 4, 8, and 16 s delays to control for carry over effects of the stimulation as well as to obtain indifference points for the Laser OFF condition. All animals received stimulation at both Epoch 1 and Epoch 2 in a cross-over design (Table 1) so that half the animals received Epoch 1 on the third and seventh day and Epoch 2 on the fifth and ninth day and the other half of animals received the opposite configuration. The virus was allowed to express for at least 3 weeks before beginning any optogenetic manipulation.
Electrophysiology equipment
Silicon probes were acquired from Cambridge NeuroTech and interfaced with Omnetics connectors (Omnetics). Silicon electrodes were mounted the day prior to surgery to Cambridge NeuroTech microdrives. An Intan RHD SPI cable (Intan) connected the headstage to a Doric Commutator (Doric Lenses) positioned above the operant apparatus. An Open Ephys (Open Ephys) acquisition system was used to collect all electrophysiological data. ANY-Maze (ANY-maze Behavioral tracking software) was used to collect all behavioral and locomotor data. ANY-maze locomotor data was synchronized with Open Ephys via an ANY-maze AMI connected to an Open Ephys ADC I/O board. Med PC behavioral events were also synchronized to the electrophysiological recordings via an Open Ephys ADC I/O board. Following sessions with diminished signal, electrodes were lowered 50 µm.
Immunohistochemistry, histology, and anatomical nomenclature
Histological verification of virus and optic fiber implant (Fig. 2A) or electrode placements (Fig. 4A) was conducted to exclude animals prior to analysis. The locations of each electrode at the end of the experiment are shown (Fig. 4A). Most of the recording sites and optogenetic expression ended up in what would be considered rat prelimbic cortex. However, each electrode was mounted on a hyperdrive and lowered when recording quality began to diminish. This corresponded to an average of 0.57 mm (range, 0.125–1.125 mm) of total travel in the D/V axis throughout the experiment. Extrapolating from the ending placement this puts the location of several electrodes in the ACC and possibly M2 according to the 6th Edition of the Paxinos and Watson atlas (Paxinos and Watson, 2007). Therefore, we have adopted the homologous nomenclature of ACC anatomy described in van Heukelum et al. (2020) and refer to the brain region we measure and actuate as ACC.
Optogenetics
Animals were perfused within 14 d after behavioral testing with 4% PFA after receiving an anesthetic dose of urethane (1.5–2.0 g/kg). Brains were then fixed in 4% PFA for 24 h before being placed in a 30% sucrose solution (24–72 h) and subsequently stored at −20°C until sliced 50 µm thick. To assess transduction of glutamatergic pyramidal cells within ACC, slices were mounted on gelatin subbed glass slides using an aqueous mounting medium (H-1000-10; Vectashield, Invitrogen). A florescence imaging scope (Nikon Eclipse 80i) was used to verify EYFP-tagged protein expression.
Electrophysiology
Animals were anesthetized with urethane (1.5–2.0 g/kg) and subsequently perfused following with 4% PFA after cessation of spinal reflexes. Following tissue extraction, brains were fixed in 4% PFA for 24 h and then transferred to a 30% sucrose solution for cryoprotection. Following our postfix procedures, tissue was stored at −80°C until tissue was sliced at 50 µm and stained for both GFAP and DAPI. Briefly, tissue sections were washed in phosphate-buffered saline (PBS) once. Following this, sections were washed in PBS and 0.1% Triton X-100. Sections were blocked in 1% normal goat serum. Following blocking, the primary antibody (GFAP; goat anti-chicken) was added and allowed to incubate while shaking for 24 h at 4°C. Tissue was washed three times in PBS and then the secondary antibody was added (Alexa Fluor 555; goat anti-chicken). Tissue was incubated and shook in a light-protected box for 2 h at room temperature. Tissue sections were subsequently handled under light-protective materials. Three additional washes in PBS were followed by the addition of DAPI which was allowed to incubate for 10 min at room temperature. Three additional washes in PBS followed. Sections were then mounted on gelatin subbed glass slides with anti-fade mounting medium (sc-516212 Santa Cruz Biotechnology) and imaged to confirm placement across the ACC. Sections were mounted on gelatin subbed glass slides and then imaged to confirm placement across the ACC.
Spike sorting
Putative neurons were organized into clusters by Kilosort 2 (Pachitariu et al., 2016). Following automatic spike sorting, supervised curation was performed in Phy2 (https://github.com/cortex-lab/phy). Specifically, it was ensured that the autocorrelograms contained no refractory violations, the waveforms were characteristic of an action potential, and the signal was minimally contaminated by any noise artifacts. Following qualitative characterization in Phy2, data were imported into MATLAB for subsequent analyses. A custom MATLAB routine was used to align spike trains to task events. Spike trains were smoothed using Gaussian convolution with a bin width of 200 ms and σ set to 10 ms.
Experimental design and statistical analysis
Data were analyzed using custom MATLAB routines in all experiments. To quantify decision-making latency, reaction times from choice latencies were transformed to ranks and graphed as mean ranks. This was done to limit the positive skew of reaction times. All significance a values were set at 0.05.
In the optogenetics experiment, choices were classified into four different types based off choice (immediate or delay) and i-value (low, i-value <4 and high, i-value >3). Distribution of the four choice types were analyzed using a probability density function (PDF) for Laser ON versus OFF conditions (Fig. 3) to determine how consecutive choices are made for immediate and delay choices with high versus low i-values. This assessed the ability of the animal to deviate from poor choices, such as choosing the immediate lever when the i-value is low or the delay lever with the i-value is high.
To better understand how firing rates differ for immediate and delay choices with high and low i-value, spike trains were aligned to the choice point comprised of an interval of 15 s prior to and 15 s after the choice (−15 to +15 s) and binned at 200 ms, resulting in 151 bins. Using the choice point-aligned spike trains, binned spike counts were smoothed using a moving average filter spanning five bins for each neuron. Firing rates were z-scored and the average firing rate for immediate and delay high/low i-value trials (low, i-value <4 and high, i-value >3) were calculated and plotted for the 4 and 8 s delays (Fig. 4B,C). ANOVAs and post hoc tests were then used to analyze differences between firing rates across trials differing by lever and i-value.
To better understand the changes in neural activity that underlie how a bias for an option may emerge and be abandoned over the course of a session, consecutive trials were analyzed in the electrophysiology animals. To quantity this, three or more consecutive choices on either lever was considered a “run.” Three consecutive choices were defined as the threshold since the animal continued to choose the same lever despite being exposed to the other lever on a forced trial. Runs were then subdivided by three different criteria. First, runs were split into Change or Fail-to-change runs based on the animal's choice on the fourth trial. When animals chose the opposite lever on the fourth trial, this was referred to as “Change.” When the animal stayed on the same lever for the fourth trial, this was referred to as “Fail-to-Change.” Second, runs were further stratified by the lever the animal chose on the first three trials (i.e., immediate, delay). Third, Trials 3 and 4 were selected for analysis (Fig. 5; see key). Each of the three criteria above resulted in eight different trial types to be analyzed at each delay.
To examine differences in neural activity that contribute to either Change or Fail-to-Change, the third and fourth trial of each type of run was first analyzed using a global principal component analysis (PCA). This PCA allowed us to visualize neural activity patterns that unfold over the entire population of neurons. It was performed separately for the 4 s (Fig. 5A,C) and 8 s delays (Fig. 5B,D) and included each dataset within a delay. For all PCAs, spike trains were aligned to the choice point comprising an interval of 19 s prior to and 1 s after the choice (−19 to +1 s) and binned at 200 ms, resulting in 101 bins. Using the choice point-aligned spike trains, binned spike counts were smoothed using a moving average filter spanning five bins for each neuron. Only neurons firing during the third and fourth trial of each type of run were included in the analysis (Table 2; Fig. 5, key) resulting in a time bin X neuron matrix for each of the eight trial types outlined above. Firing rates from each neuron (n = 581 4 s, n = 1,166 8 s; Table 2) were contained in columns and each of the 101 time bins in rows. The eight matrices were then concatenated and z-scored prior to running PCA. For example, the 4 s delay consisted of a matrix (808 × 581) which contained n = 581 neurons, where each row corresponded to firing rates at each time bin (101 time bins per trial type, 808 total) and each column corresponded to the firing rate of an individual neuron. PCA was then conducted to analyze neural activity across Trials 3 and 4 of each of the following runs: Immediate-Fail-to-Change, Delay-Fail-to-Change, Immediate-Change, and Delay-Change. The three most explanatory dimensions were chosen [top 3 principal components (PCs); Fig. 5A–D].
A separate PCA was performed to obtain single session trajectories which allowed differences in neural activity between runs to be further quantified and comparisons between the 4- and 8 s delays. PCA was conducted in a similar manner as described above but separate for each session. This PCA was used to obtain neural trajectories for each individual session that allowed analysis of changes in neural activity patterns from the third to fourth trial of each run (Fig. 5E,F). Each trajectory contained coordinates in PC space corresponding to the top 3 PCs at each time bin (101 × 3). The Euclidean distance was calculated between the third and fourth trial of each run using the trajectory coordinates derived from the top 3 PCs (Fig. 5E,F). Specifically, for each individual session, the pairwise Euclidean distance between the third and fourth trial for each run was calculated for the top three PCs. The diagonal of the pairwise distances was used as the distance metric. The temporal evolution of change in Euclidean distances prior to the choice (Fig. 5E,F) for runs were then subjected to statistical analyses via ANOVA and post hoc comparisons.
Results
There is an equal probability of choosing the immediate or delay option at the 8 s delay
As the delay increased, animals chose the immediate reward more often (Fig. 1C,D). Indifference points across all delays (0, 1, 2, 4, 8, 16 s) were assessed in optogenetic animals for the Laser OFF condition to evaluate the effect of delay on reward value without optogenetic manipulation. A Greenhouse–Geisser corrected RM one-way ANOVA indicated a significant effect of delay on indifference points (F(2.19,15.33) = 16.64, p = 0.0001; Fig. 1C). As delays increased, the indifference points decreased, indicating discounting. The Mazur hyperbolic DD curve was fit to indifference points to calculate the rate of discounting (k = −0.081).
While initially biased toward the delay lever, the strong preference for the delay lever decreased with increasing delays (Fig. 1D), further indicating discounting. The delay impacted the percent of immediate choices made during a session [Kruskal–Wallis (6,120) χ2 = 79.02; p > 0.0001]. The percentage of immediate choices made during a session increased as delays increased. Similar to previous reports from our lab, Wistar rats are equally likely to choose between the immediate and delay level at the 8 s delay (Linsenbardt et al., 2017). These results provide evidence that the preference for the delay lever is no longer observed at the 8 s delay. Collectively, these results indicate the presence of discounting in Wistar rats using this DD paradigm and that more deliberation may be required at the 8 s delay rather than relying on a preference for the delay lever to make decisions.
Optogenetic inhibition of ACC increases impulsive choices at an 8 s delay
Bilateral expression of ArchT in the ACC was present for all animals included in the analyses with expression throughout the ACC with some ventral spread (Fig. 2A). Indifference points for each condition (Laser ON vs Laser OFF) were calculated by averaging the last 10 trials of each session for each delay and a hyperbolic discounting function (Mazur, 1987) was fit to observations in the Laser ON and Laser OFF conditions for the 4, 8, and 16 s delays (curves not shown, Laser ON). To assess differences between Laser ON/OFF conditions an Extra-sum-of-squares F-test was used to determine whether one model accurately describes both conditions. One curve did not adequately fit both conditions (Extra-sum-of-squares F-test: F(1,46) = 10.46, p = 0.002), indicating differences in k-values between conditions. Lower AUC for the Laser ON than OFF was observed (paired samples t test, t(7) = 5.3, p = 0.001). Collectively these data indicate that inhibition of the ACC increases impulsive responding.
To further assess what drove the differences between the Laser ON/OFF conditions, indifference points were compared at 4, 8, and 16 s delays individually (Fig. 2B). Greenhouse–Geisser corrected two-way repeated-measures ANOVA revealed a main effect of delay (F(1.48,10.35) = 27.53, p = 0.0001) and main effect of Laser ON/OFF condition (F(1,7) = 27.53, p = 0.003; Fig. 2B). Post hoc comparisons were used to assess differences in impulsivity at each delay. No differences were observed between Laser ON versus Laser OFF at the 4 s (Holm–Šídák tests, p = 0.84) or 16 s (Holm–Šídák tests, p = 0.09) delays (Fig. 2B). At the 8 s delay, the indifference points for the Laser OFF condition was larger than Laser ON condition (Holm–Šídák test, p = 0.001; Fig. 2B). These results indicate that the indifference points between Laser ON/OFF conditions during the 8 s delay were the major factor in differences between impulsivity measures. The selective effect at the 8 s delay may be attributable to the equal probability of choosing either the delay or immediate lever (Fig. 1D).
The number of sucrose pellets earned during Laser ON and OFF sessions were assessed across the 4, 8, and 16 s delays to evaluate if increased impulsivity led to fewer rewards. The number of pellets earned during choice trials for Laser OFF and Laser ON sessions differed when collapsed across 4, 8, and 16 s delays (Wilcoxon rank sum tests, Z = −3.39, p = 0.0007; Fig. 2C), with Laser OFF sessions earning more pellets. When stratified by delay, the number of pellets earned for Laser ON and OFF sessions during choice trials differed for both 8 s (Bonferroni-corrected Wilcoxon rank sum tests, Z = −2.84, p = 0.01; Fig. 2C) and 16 s delays (Bonferroni-corrected Wilcoxon rank sum tests, Z = −2.55, p = 0.03; Fig. 2C) but not for the 4 s delay (Bonferroni-corrected Wilcoxon rank sum tests, Z = −0.73, p = 1; Fig. 2C). The number of pellets earned during the 8 and 16 s delays were greater for the Laser OFF sessions, indicating that the increase in impulsive choice driven by optogenetic inhibition of the ACC was disadvantageous.
To further assess whether forced trials impacted number of pellets earned and therefore whether the increase in impulsive choices was disadvantageous, number of pellets on both forced and choice trials were assessed for Laser ON and OFF sessions. When accounting for both choice and forced trials, the number of pellets earned on Laser ON and OFF sessions differed across the 4, 8, and 16 s delays (data not shown; Wilcoxon rank sum tests, Z = −1.98, p = 0.048). More pellets were earned during the Laser OFF session. Collectively, these results indicate that optogenetic inhibition of ACC increases impulsive choices, resulting in suboptimal decision-making.
We tested our hypothesis that ACC contributes to strategies that impact choice behavior that emerges across trials. To examine choice sequences, trials were split into four different types based on i-value (low, i-value <4 and high, i-value >3) and choice lever (Fig. 3). This assessed the ability of the animal to deviate from poor choices, such as continuing to press the immediate lever when the i-value was low. It was hypothesized that optogenetic inhibition would disrupt the ability to switch away from continuing to choose the immediate lever when i-value was low or the delay lever when i-value was high. For both the Laser ON/OFF conditions the distribution of consecutive choices was determined and a gamma distribution was fit to the data. The Laser ON and Laser OFF conditions did not differ in number of consecutive immediate choices when i-value was high (Wilcoxon rank sum tests, Z = −0.05, p = 0.96; Fig. 3B), delay choices when i-value was low (Wilcoxon rank sum tests, Z = −0.15, p = 0.88; Fig. 3C), or delay choices when i-value was high (Wilcoxon rank sum tests, Z = −0.60, p = 0.55; Fig. 3D). Laser ON conditions lengthened consecutive immediate choices when i-value was low during the 8 s delay (two-sample Kolmogorov–Smirnov, p < 0.0001, k = 0.29; Wilcoxon rank sum tests, Z = −2.25, p = 0.02; Fig. 3A), providing evidence that optogenetic inhibition disrupted ACC signals required to shift away from poor decisions. Collectively these results suggest that, at the 8 s delay, optogenetic inhibition of ACC increases impulsivity by impacting the animals’ ability to update decision-making, specifically deviating from choosing the immediate lever when the value is low.
Changes in mean firing rate do not explain effects of optogenetics on behavior
To determine why optogenetic inhibition of ACC increased impulsivity at the 8 s delay, male Wistar rats were unilaterally implanted with 64-channel silicone probes in ACC (see Materials and Methods; Fig. 4A), and recordings during the 4 and 8 s delay were analyzed. For both the 4 s (Fig. 4B) and 8 s (Fig. 4C) delays, a Greenhouse–Geisser corrected tree-way repeated-measures ANOVA was performed to evaluate the effect of time (151 bins), lever (immediate or delay), and i-value (high or low) on neural firing rates. There was a significant effect of time (F(4.61,6,102) = 125.09, p < 0.0001) but not i-value (F(0.031,40.68) = 0.81, p = 0.37) or lever (F(0.031,40.68) = 0.53, p = 0.47) on firing rates at the 4 s delay (Fig. 4B). There were significant effects of time (F(3.50,4,344.13) = 74.24, p < 0.0001) and lever (F(0.023,28.96) = 11.37, p = 0.0008) but not i-value (F(0.023,28.96) = 1.49, p = 0.22) on firing rates at the 8 s delay (Fig. 4C). However, at both the 4 and 8 s delays, significant interactions between time, lever, and i-value (4 s, F(4.61,6,102) = 9.18, p < 0.0001; 8 s, F(3.50,4,344.13) = 17.80, p < 0.0001) were detected. Firing rates varied depending on time, i-value, and choice levers for both 4 and 8 s delays.
Peaks in firing rates for immediate choices were shifted to earlier timepoints than delay choices, and higher firing rates were observed for immediate high i-value compared with low i-value choices following the choice point. Benjamini–Yekutieli false discovery rate (FDR)-corrected t tests at each timepoint were conducted to determine the timepoints at which firing rates for high versus low i-value differed for the immediate and delay lever. Differences between firing rates for high and low i-value choices on a given lever were more consistent during the 4 s (Fig. 4B) than the 8 s (Fig. 4C) delay. Notably, during the 8 s delay, differences in firing rates for immediate choices stratified by i-value were greatest following the choice point (Fig. 4C). The infrequency of differences prior to the choice at the 8 s delay was not consistent with the effects of optogenetics at this delay and therefore we analyzed ensemble activity in these recordings.
ACC networks shift from procedural encoding at 4 s to deliberative encoding at 8 s delay
Runs were defined as three or more consecutive choices on the same lever followed by either continuing the run on the fourth trial (Fail-to-Change run) or changing levers and ending the run on the fourth trial (Change run) for the immediate and delay levers (see Materials and Methods). To assess neural activity associated with decision-making, ensemble analyses were conducted on spike trains 19 s prior to and 1 s after the choice was made on a given trial type (Fig. 1A, bottom). Additionally, to better understand neural activity related to shifts in choices, the remainder of the analyses focused on Change or Fail-to-Change runs (see Materials and Methods; Figs. 1B, 5, key). Neural activity was assessed between the third and fourth choice trial in the run (Fig. 5) where the fourth choice could either be to continue to the run (Fail-to-Change) or to shift to the opposite lever (Change). PCA was performed across each trial type on neurons for the 4- and 8 s delays to obtain the neural trajectories of the third and fourth trial of runs. This enabled an assessment of whether neural signatures of continuing with the current run (IM/DEL Fail-Change) differed from those of choosing to abandon the current run on the fourth trial (IM/DEL Change).
PCA reveals qualitatively different trajectories in state space across 4 and 8 s delays
The top three PCs and the trajectories for the eight trial types corresponding to the third and fourth trial of each run are plotted for the 4 s (Fig. 5A,C) and the 8 s delays (Fig. 5B,D). The top 3 PCs explained 65% of the variance for both the 4- and 8 s delays. For the 4 s delay, each of the first three PCs clearly separated features of the task (Fig. 5A,C); where PC1 reflected neural activity patterns related to which lever the animal chose on the run (immediate vs delay; Fig. 5C1), PC2 separated if the animal changed their choice (Change vs Fail-to-Change; Fig. 5C2), and PC3 separated what choice the animal made on the fourth trial (immediate vs delay lever; Fig. 5C3). At the 8 s delay, PC1 still reflected the lever the run was taking place on (Fig. 5D1). However, PC2 less clearly reflected if the animal would change their choice (Fig. 5D2), and PC3 reflected DEL-change or DEL-Fail-to-Change sequences (Fig. 5D3). We also observed that each PC did not vary much in time during the 4 s delay (Fig. 5C) but exhibited ramping-like activity prior to the choice point during the 8 s delay (Fig. 5D). Collectively, these data indicate that the neural dynamics in ACC during decision-making differs across the delays of the DD task. Specifically, at the 4 s delay, the stability of the PCs in time (Fig. 5C) and clearly defined task features (Fig. 5A,C) are consistent with a procedural decision-making strategy where pre-established evidence is used to guide decisions. In contrast, at the 8 s delay, ramping-like activity in the neural trajectories prior to the choice (Fig. 5D) is more consistent with evidence accumulation during a deliberative strategy.
Euclidean distances are larger for change versus fail-to-change runs
To quantify differences in how ACC encodes Change versus Fail-to-Change trials at different delays, Euclidean distances between the trajectories from the third and fourth trials were calculated for each run using PCA from individual sessions (Fig. 5E,F; see Materials and Methods). The rationale for this approach is that updates to a decision-making strategy should require distinct neural activity patterns on the fourth trial. If the trajectories occupy similar PC space between trials (small Euclidean distance), this reflects little change in the neural activity patterns between the third and fourth trials, while large distances reflect larger changes in neural activity. We therefore hypothesized that larger distances between the third and fourth trial would be observed for Change trials compared with their Fail-to-Change counterparts. We also hypothesized that distance measures for Change runs would differ for the 4- and 8 s delay, based on the results of the optogenetics experiments. To test these hypotheses, we evaluated the change in Euclidean distance between trajectories for each run across time.
A Greenhouse–Geisser corrected three-way repeated-measures omnibus ANOVA was run on Euclidean distance between trajectories in PCA space to evaluate the effects of time (101 bins) and runs (IM and DEL, Change and Fail-to-Change) and delay (4- and 8 s). There was a significant effect of delay (F(0.016,0.35) = 23.41, p < 0.0001), runs (F(0.047,1.04) = 20.59, p < 0.0001), and time (F(1.57,34.53) = 6.83, p < 0.0001) on Euclidean distance measures. There was also a significant three-way interaction between time, runs, and delay (F(4.71,103.59) = 4.11, p = 0.001) indicating the temporal evolution of the runs differed across delays.
Euclidean distances differ more on DEL-change runs at the 8 s versus the 4 s delay
This interaction was further interrogated by post hoc comparisons to assess the differences between 4- and 8 s delays runs (Fig. 5E,F, insets). In support of our hypothesis, larger changes in Euclidean distance of DEL-Change and DEL-Fail-to-Change runs were observed during the 8 s delay when compared with the 4 s delay (Scheffe tests: p < 0.0001, DEL-Change; p = 0.007, DEL-Fail-to-Change). However, no differences in Euclidean distance were observed between delays for the IM-Change or IM-Fail-to-Change runs (Scheffe tests: p = 0.43, IM-Change; p = 0.07, IM-Fail-to-Change).
Euclidean distances are stable at the 4 s delay and dynamic at the 8 s delay
To further investigate our observation that distance metrics are generally larger for Change versus Fail-to-Change runs, we then evaluated whether runs differ within each delay. To assess the effects of time (101 timebins) and runs (IM and DEL, Change and Fail-to-Change) on Euclidean distances, a Greenhouse–Geisser corrected two-way repeated-measures ANOVA was run separately for each delay.
At the 4 s delay, there was a main effect of run (F(0.040,0.40) = 10.36, p = 0.0003) but no effect of time (F(1.34,13.36) = 1.77, p = 0.15) on Euclidean distance measures (Fig. 5E). However, the interaction between time by run was significant (F(4.00,40.08) = 2.90, p = 0.03), indicating that although differences were detected between run types, the runs were relatively stable across time with some run types varying more than others. Post hoc comparisons were used to assess the differences between Change and Fail-to-Change runs for the DEL and IM levers (Fig. 5E, inset). In support of our hypothesis, the Euclidean distance between the third and fourth trial was larger for DEL-Change than DEL-Fail-to-Change (Scheffe tests: p = 0.007) but not for IM-Change versus Fail-to-Change runs (Scheffe tests: p = 0.17). The lack of differences between the IM-Change and Fail-to-Change runs at the 4 s delay may partially explain the effects seen during the optogenetic experiment and further support our hypothesis that at the 4 s delay, decisions are more procedural and guided by the delay lever preference.
At the 8 s delay, there was a main effect of run (F(0.030,0.36) = 18.78, p < 0.0001) and time (F(1,12.04) = 10.28, p < 0.0001) on Euclidean distance measures as well as a significant time by run was interaction (F(3.01,36.13) = 7.43, p < 0.0001; Fig. 5F). These results suggest that the 8 s delay is more dynamic than the 4 s delay, which is consistent with the temporal evolution of the PCs (Fig. 5C,D). Further supporting our hypothesis that larger distances would be observed for Change compared with Fail-to-Change runs, post hoc comparisons indicated that the Euclidean distance between the third and fouth trial was larger for both DEL (Scheffe tests: p = 0.002) and IM (Scheffe tests: p = 0.004) Change compared with their respective Fail-to-Change counterparts (Fig. 5F, inset). These data suggest that at the 8 s delay, updating is required for both IM and DEL-Change runs (Fig. 5F). Notably, these data contrast the results observed at the 4 s delay (Fig. 5E), where large changes in neural activity patterns are only evident when switching away from the preferred lever.
Euclidean distances increase prior to choice on 8 s delay change runs
Benjamini–Yekutieli FDR-corrected t tests at each timepoint were conducted to determine the timepoints at which Euclidean distance differed between IM/DEL Change and Fail-to-Change runs during each delay. DEL-Change differed from DEL-Fail-to-Change more consistently during the 8 s delay (Fig. 5F) than the 4 s delay (Fig. 5E) and was especially pronounced leading up to the choice. IM-Change also consistently differed from IM-Fail-to-Change at the 8 s delay (Fig. 5F). These data further support our hypothesis that encoding of runs differ by delay and provides an explanation for why optogenetic inhibition had effects at the 8 s, but not 4 s, delay.
Reaction times at 8 s are consistent with deliberative decision-making
If deliberation was more prevalent at the 8 s than 4 s delay, then changes in response latencies should be observed between the third and fourth trials of runs. At the 4 s delay, animals increased choice latencies between the third and fourth trial for DEL-Change runs (Wilcoxon signed-rank test: Z = −3.74, p = 0.0002; Fig. 6A3) but not for Fail-to-Change (Wilcoxon signed-rank tests: DEL-Fail-to-Change, Z = −0.38, p = 0.70, Fig. 6A4; IM-Fail-to-Change, Z = −0.87, p = 0.38, Fig. 6A2) or IM-Change runs (Wilcoxon signed-rank tests: Z = 1.72, p = 0.09; Fig. 6A1). This indicates that latencies increase only when switching away from the preferred (delay; Fig. 1D) lever at this delay (Fig. 6A). In contrast, at the 8 s delay, choice latencies differed between the third and fourth trial for both the IM-Change (Wilcoxon signed-rank tests: Z = 2.74, p = 0.006; Fig. 6B1) and the DEL-Change (Wilcoxon signed-rank tests: Z = −3.83, p = 0.0001; Fig. 6B3) runs but not for either the IM (Wilcoxon signed-rank tests: Z = 0.09, p = 0.93; Fig. 6B2) or DEL (Wilcoxon signed-rank tests: Z = 0.12, p = 0.90; Fig. 6B4) Fail-to-Change runs (Fig. 6B). While the probability of choosing an immediate or delay choice is roughly equivalent at the 8 s delay (Fig. 1D), animals may still retain biases toward the delay lever that aide in deliberating between the two choice options. Responses were faster when going from IM→DEL and slower when going from DEL→IM and therefore consistent with the animals either switching to or away from their previously preferred option (delay lever) prior to the 8 s delay, which is consistent with deliberating an easy or difficult choice, respectively.
Discussion
The main finding of this study is that two different decision-making strategies are observed in this task; when a clear preference for the delay lever exists (4 s delay), ensembles in ACC reflect procedural decision-making that changes to a deliberative decision-making strategy when a choice preference no longer exists (8 s delay) and is reflected by differences in neural activity associated with runs (Fig. 5). This was supported by changes in reaction times when animals update decision-making strategies (Fig. 6). Optogenetic inhibition of ACC increased impulsive choices exclusively at the 8 s delay; more specifically, it increased the number of consecutive low-value choices. This suggests that disrupting ensemble activity required for deliberative encoding increases measures of impulsivity by preventing updates required to change runs. Collectively, these data indicate that ensemble activity in ACC shifts based on the cognitive demands of the task and impulsive choices emerge from failure to engage a deliberative decision-making strategy.
One explanation for animals failing to update behavior is that optogenetic inhibition of ACC disrupted deliberative processes that were more prevalent during the 8 s delay. Decisions between the immediate and delay choices are most difficult at the 8 s delay given that the number of immediate and delay choices are roughly equal (Linsenbardt et al., 2017; Fig. 1D). A number of studies have indicated that ACC is critically involved in difficult decisions requiring more deliberation and less deliberation occurs for difficult decisions when the ACC is inhibited (Papale et al., 2012; G. J. Peters et al., 2013; Peters and Smith, 2020; Schmidt et al., 2019).
The differences in ACC ensemble activity surrounding runs between delays were uncovered by examining PC spaces. When a clear preference for the delay lever exists (4 s delay; Fig. 1D), neural activity across PC spaces were relatively static leading up to the choice (Fig. 5C,E). This indicates that ACC neurons distribute the encoding of task features across the population of neurons in a static manner, which existed prior to optogenetic inactivation (10 s prior to the choice). This may explain the lack of effect of the optogenetic inhibition at the 4 s delay—any information that ACC might contribute to the decision at this delay preceded the inactivation. In addition, a procedural strategy provides an efficient way to arrive at a good decision by limiting the need for deliberation and using evidence that was previously established.
In contrast, at the 8 s delay, neural ensembles were more dynamic prior to the choice on Change runs (Fig. 5D,F), which also corresponded to the period of time that ACC was inhibited. These data are consistent with the view that, during DD, ACC plays a critical role in decision-making strategies (Powell and Redish, 2016). Our data support and extend this view by indicating that the ACC may be uniquely involved in decisions requiring deliberation.
The behavior of the trajectories in the PC spaces provides insight into how ACC might implement the computations that control decision-making at each delay. While the differences across each delay were robust, a limit to making inferences about these spaces is the shortcomings of PCA. While useful for dimensionality reduction, PCA may not be sufficient to capture dynamics that occur in a high dimensional space given the linear nature of the algorithm (Whiteway and Butts, 2019). Nonetheless, several important inferences about computation have been made using PCA that are supported by analysis tools better equipped to describes dynamics in a high-dimensional space (Durstewitz, 2017).
The differences in the PC spaces across delays suggest that each delay exhibits a different degree of stability. The neural trajectories for each of the trial types at the 4 s delay were restricted to a discrete region of state space (Fig. 5A) that were generally well separated from other trial types, suggesting the existence of several meta-stable states—collectively referred to as “multistability” (Brinkman et al., 2022). This suggests that multistability may be necessary for procedural decision-making. In line with this view, prior work from our group examined ACC neural dynamics in well-trained animals during a foraging-based decision-making task (Lapish et al., 2008; De Falco et al., 2019). These studies found that neural trajectories in ACC track task variables by moving through several meta-stable states that reflect discrete features of the task. This is reminiscent of the behavior of the neural trajectories in the current study at the 4 s delay where procedural encoding was observed.
Procedural encoding initially seemed to be present at the 8 s delay. However, at ∼12 s prior to the choice, PC spaces begin to evolve where PC2 and PC3 each begin to move upward (Fig. 5D2,D3). The change in the way neural trajectories move through state space at this time suggests a change in the dynamic properties of ACC networks. We speculate that this change reflects a bifurcation in the systems dynamics that is required to perform deliberation. After this time, neural trajectories seem to take linear paths that give way to rotational paths near the choice. This type of linear to rotational dynamics is reminiscent of that observed in prefrontal networks of the nonhuman primate, which has been suggested to correspond to decision commitment (Mante et al., 2013; Aoi et al., 2020).
Collectively, these observations form the basis of hypotheses that can be directly tested with modern techniques to reconstruct latent dynamics from neural recordings (Durstewitz et al., 2010; Balaguer-Ballester et al., 2011a,b; Durstewitz, 2017). Specifically, if a bifurcation exists and options are encoded via attractor dynamics, characterizing this will provide mechanistic insight into how deliberation is implemented in ACC networks. This would also be important to understand how breakdowns in the computations responsible for deliberation result in impaired decision-making. Identifying methods to repair these computations could be a powerful approach to reduce impulsivity and thereby improve treatment outcomes in several psychiatric disorders.
Footnotes
The authors declare no competing financial interests.
We thank Amanda Callahan for their assistance in conducting the optogenetic experiment. The work was supported by National Institutes of Health grants AA029409, P60-AA007611, and T32AA007462.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.