Excitatory and Inhibitory Signaling in the Nucleus Accumbens Encode Different Aspects of a Pavlovian Cue in Sign Tracking and Goal Tracking Rats

Abstract When a Pavlovian cue is presented separately from its associated reward, some animals will acquire a sign tracking (ST) response – approach and/or interaction with the cue – while others will acquire a goal tracking response – approach to the site of reward. We have previously shown that cue-evoked excitations in the nucleus accumbens (NAc) encode the vigor of both behaviors; in contrast, reward-related responses diverge over the course of training, possibly reflecting neurochemical differences between sign tracker and goal tracker individuals. However, a substantial subset of neurons in the NAc exhibit inhibitory, rather than excitatory, cue-evoked responses, and the evolution of their signaling during Pavlovian conditioning remains unknown. Using single-neuron recordings in behaving rats, we show that NAc neurons with cue-evoked inhibitions have distinct coding properties from neurons with cue-evoked excitations. Cue-evoked inhibitions become more numerous over the course of training and, like excitations, may encode the vigor of sign tracking and goal tracking behavior. However, the responses of cue-inhibited neurons do not evolve differently between sign tracker and goal tracker individuals. Moreover, cue-evoked inhibitions, unlike excitations, are insensitive to extinction of the cue-reward relationship. Finally, we show that cue-evoked excitations are greatly diminished by reward devaluation, while inhibitory cue responses are virtually unaffected. Overall, these findings converge with existing evidence that cue-excited neurons in NAc, but not cue-inhibited neurons, are profoundly sensitive to the same behavior variations that are often associated with changes in dopamine release.


Significance Statement
Many neurons in the nucleus accumbens (NAc) are excited by environmental cues that predict reward, while others are inhibited by such cues.Cue-excited neurons closely track animals' behavioral responses, and even encode a specific form of learning used by sign trackers (ST)individuals who tend to approach cueswho are prone to impulsivity and addiction-related behavior.Here, we show that cue-excited and cue-inhibited neurons seem to be involved in learning about cues, but only cue-excited neurons flexibly change their responses when cues no longer signal reward (extinction) or when the reward is no longer desired.These findings imply that different aspects of cue-reward learning are supported by different populations of NAc neurons defined by specific activity patterns.

Introduction
Cues that are repeatedly associated with reward can acquire incentive salience: the ability to elicit motivated approach and interaction (Berridge, 2004).Notably, people and nonhuman animals vary widely in the degree to which such cues influence their behavior.In the behavior known as sign tracking (ST), animals approach and interact with a reward-associated cue (e.g., an extended lever) even when it is not located at the site of reward (Hearst and Jenkins, 1974); in contrast, other animals display the behavior known as goal tracking (GT), in which they approach the location of reward delivery rather than the cue (Boakes, 1977).A predisposition toward ST has been linked with various forms of impulsivity, risk-taking, and addiction-related behaviors, including drug-seeking and relapse (Tomie et al., 2008;Saunders and Robinson, 2013).
A considerable body of literature suggests that ST and GT involve different profiles of neurophysiological and neurochemical activity.In particular, sign tracking, but not goal tracking, is dependent on dopamine in the nucleus accumbens (NAc; Saunders and Robinson, 2012), and the profile of NAc dopamine release in sign tracker, but not goal tracker, individuals appears to reflect a reward prediction error signal (Flagel et al., 2011) thought to be used for model-free learning (Clark et al., 2012;Huys et al., 2014).We have previously shown that neural signaling in the NAc also reflects this pattern: reward-evoked excitatory responses show a gradual decrease during learning in sign tracker but not goal tracker subjects (Gillis and Morrison, 2019).On the other hand, cue-evoked excitatory responses encode the vigor (probability, rapidity, and intensity) of both sign-tracking and goal-tracking behavior.This is consistent with earlier findings regarding the role of NAc signaling in cue-directed behavior in both Pavlovian and instrumental contexts (Day et al., 2006;McGinty et al., 2013;Morrison and Nicola, 2014;Morrison et al., 2017).
There is substantial evidence that NAc dopamine is necessary for invigorating approach to reward-related targets (Nicola et al., 2005;Chow et al., 2016), and that it does so by enhancing excitatory responses to cues that are paired with reward (du Hoffmann and Nicola, 2014).On the other hand, inhibitory responses in the NAc, which are exhibited by 20-30% of neurons (Morrison et al., 2017), are far less sensitive to dopaminergic manipulation: blockade of either D1 or D2 dopamine receptors in the NAc strongly attenuates cue-evoked excitations but not inhibitions (du Hoffmann and Nicola, 2014).Because sign tracker and goal tracker individuals are thought to have different patterns of dopamine release (Flagel et al., 2011), this led us to ask whether we would see different patterns of activity in cue-excited neurons (over 50% of the population; Morrison et al., 2017) versus cue-inhibited neurons in animals performing ST or GT behavior.We also wondered whether behavioral manipulations that are known to affect NAc dopamine release, such as extinction (Sunsay and Rebec, 2014) and reward devaluation (Aitken et al., 2016;Papageorgiou et al., 2016;Keiflin et al., 2019), would have a divergent impact on the two neuronal populations.
In order to address these questions, we combined an existing data set (Gillis and Morrison, 2019) with additional single-unit recordings in the NAc during the acquisition, maintenance, and extinction of sign tracking and goal tracking behavior.This allowed us to amass a large enough population of cue-inhibited neurons for analysis.Based on previous findings during instrumental tasks (Morrison et al., 2017), we hypothesized that cue-inhibited NAc neurons, like cue-excited neurons, would encode the vigor of sign tracking and goal tracking behavior; but that the reward responses of cue-inhibited neurons, unlike cue-excited neurons, would show little change over the course of learning in sign tracker individuals.Similarly, we hypothesized that manipulations such as cue extinction and/or reward devaluation would selectively impact cue-evoked excitatory responses, but not inhibitory responses.

Materials and Methods
All animal procedures were performed in accordance with the University of Pittsburgh animal care committee's regulations.

Subjects
Subjects were 11 male Long-Evans rats obtained from Charles River Laboratory.Rats weighed 275-325 g on arrival and were pair-housed until surgery (see below).Rats were maintained on a 12/12 h reverse light/dark cycle with lights on at 7 P.M.; all experiments were performed during the dark period.Rats were allowed to acclimate to the housing colony for at least 5 d and were then gently handled over at least two sessions before surgery.Food was provided ad libitum until 2 d before the start of behavioral training, at which time subjects were mildly food-restricted (15 g/d).Water was provided ad libitum throughout.Rats were weighed regularly and provided with extra food if necessary to ensure they maintained 90% of prerestriction body weight.

Electrode arrays and surgery
Electrode arrays were constructed in house and consisted of eight Teflon-insulated tungsten microwires (A-M Systems) arranged in either a circular pattern (1 mm in diameter) or loose bundle.Electrodes were hand cut to achieve an impedance of 90-110 KV.We implanted arrays bilaterally in a fixed position targeted at the NAc core (AP: 11.4; ML: 6 1.5; DV: À7.0-7.2 from dura).Rats were anesthetized using isoflurane (4% for induction, 1-2% for maintenance) and treated with ketoprofen (5 mg/kg) during surgery, then provided with either ketoprofen (5 mg/ kg, s.c.) or Tylenol in their drinking water for 3 d following surgery.Animals were allowed to recover for at least 7 d before the commencement of behavioral training.

Histology
Electrode sites were labeled by passing direct current through each electrode (75 mA for 10 s) while rats were deeply anesthetized with chloral hydrate or pentobarbital.Animals were then transcardially perfused with 0.9% saline followed by 10% buffered formalin.Brains were postfixed in formalin, then transferred to 30% sucrose for at least 3 d before being sectioned at 60 mm on a cryostat.Slices were stained with cresyl violet and electrode placements were confirmed using light microscopy.

Apparatus and behavior
All training and recording took place in the same standard operant chambers (Coulbourn Instruments), which were equipped with a house light, a speaker, and a pellet dispenser mounted above a food magazine recessed into the side wall.An infrared photodetector inside the magazine detected entries and exits.Either one or two retractable levers were present next to the food magazine (side counterbalanced between subjects), and a white cue light was mounted over each lever that was present.When two levers were present, only one was used for a particular subject.For some sessions, three nosepoke operanda were present on the opposite side of the chamber from the food magazine; these were inactive.The task was controlled by Coulbourn software (GraphicState 3.0 or 4.0).
Rats were trained on a Pavlovian conditioned approach (PCA) task as reported previously (Gillis and Morrison, 2019).Briefly, rats were initially trained to retrieve sucrose pellets (BioServ, 45 mg) from the food magazine over two daily sessions.Each session consisted of 50 pellets delivered on a variable time schedule averaging 60 s.During the second magazine training session, rats were habituated to the recording apparatus.Subsequently, rats were trained for 7 daily sessions on the PCA task, which consisted of 25 trials separated by an intertrial interval randomly selected from a truncated exponential distribution averaging 60 s.On each trial, the cue, consisting of an extended lever accompanied by a flashing cue light (5 Hz), was presented for 8 s.The cue was accompanied by an auditory signal (1 s, 500-Hz intermittent tone).After 8 s, the lever retracted, the cue light was extinguished, and a sucrose pellet was delivered to the food magazine.The rat was not required to perform any action for the reward to be delivered.
Neuronal recording took place on all 7 d of acquisition of the PCA task.After the completion of training, a subset of rats underwent recording during one to two extinction sessions.(When .1-2d had elapsed following the last training day, rats were given 1 d of retraining before extinction.)Extinction sessions were identical to the PCA task except that no rewards were delivered.Analyses shown are for the first extinction session only.
Another subset of rats underwent a reward devaluation procedure before the extinction session.After the completion of training, these rats were given alternating days of sucrose pellets delivered in the same operant box used for recording, followed by an injection of lithium chloride (LiCl; 0.6 M, 5 ml/kg, i.p.), or exposure to the recording context only (no sucrose pellets) followed by an injection of sterile saline (5 ml/kg).Sucrose pellets were delivered on a variable time schedule averaging 30 s; rats were initially given 40 pellets, then decreasing to 30, 20, or 10 in subsequent sessions based on the amount they actually consumed.Devaluation sessions continued until rats did not eat any pellets offered (generally four to five exposures).

Electrophysiology
Neural data were recorded using Plexon hardware and software.Rats were connected to a lightweight 16-channel headstage and either a motorized or nonmotorized commutator; both allowed freedom of movement throughout the operant box.Voltages were bandpass filtered between 220 Hz and 6 kHz, amplified 500Â, and digitized at 40 kHz.Spikes were stored in 1.4 ms segments and hand-sorted using principal component analysis and visual inspection of waveform features (Offline Sorter, Plexon).Units were analyzed only if they were .75mV, had a signal-to-noise ratio of .2:1,and had ,0.1% interspike intervals ,2 ms. Isolation of units was verified using autocorrelograms, as well as cross-correlograms for units recorded on the same electrode.

Analysis of behavior
We quantified the degree to which subjects engaged in sign tracking or goal tracking using a PCA index (Meyer et al., 2012).The PCA index is a composite of three ratios, the probability index, bias index, and latency index, which all vary from À1.0 to 11.0.In each case, negative numbers indicate a preference for interacting with the food magazine (goal tracking), and positive numbers indicate a preference for interacting with the lever (sign tracking).The probability index is calculated as P lever -P magazine , where P is probability of the indicated action.The bias index is calculated as (#lever press -#magazine entry)/ (#lever press 1 #magazine entry).The latency index is calculated as (magazine latencylever latency)/(cue length).For trials in which a behavior is not performed, the latency for that behavior is defined as the cue length (8 s).

Analysis of neural data
All analyses were performed using custom-written programs in MATLAB.See Table 1 for detailed statistics.Cue-excited and cue-inhibited neurons were identified as previously described (Morrison et al., 2017;Gillis and Morrison, 2019).Briefly, we defined a Poisson distribution approximating the baseline firing rate in the 1 s before cue onset.We identified cue-excited and cue-inhibited neurons by the presence of at least three consecutive 10-ms bins in which the firing rate respectively exceeded the upper or was less than the lower 99.9% confidence interval of the baseline distribution.The time window for each inhibition was determined by identifying at least three consecutive bins in which the firing rate returned to or exceeded either the baseline firing rate or the firing rate in the 1 s before the start of the inhibition, whichever was lower.If no "inhibition off" time was identified in the 500 ms following cue onset, the duration of the inhibition was defined as 500 ms.
If both excitatory and inhibitory responses were identified within 500 ms after cue onset, we examined the mean Z-score in 200 and 500 ms following cue onset.If both of these were positive, the neuron was categorized as cue-excited; if both were negative, the neuron was categorized as cue-inhibited.In the rare case that one was positive and one was negative, the Z-score with larger absolute value determined how the neuron was categorized.
Peristimulus time histograms (PSTHs) for individual neurons were calculated in 10-ms bins and are shown smoothed using a five-bin moving average.Population PSTHs were calculated in 10-ms bins and normalized by Z-scoring relative to baseline (1 s before cue onset) before averaging across neurons; the average activity was then smoothed using a five-bin moving average.
Analyses of individual neurons were performed using activity from the 500 ms following cue onset for cue-excited neurons, or in variable windows customized for each cue-inhibited neuron, as previously described (Morrison et al., 2017).The start of each custom window was the "inhibition on" time determined by the algorithm described above, and the end was either the "inhibition off" time determined by the algorithm or 500 ms, whichever was smaller.Among cells recorded on the last day of training, inhibition on times ranged from 10 to 180 ms (mean ¼ 62 ms).The earliest inhibition off time was 260 ms, although most were much greater (mean ¼ 443 ms).
In some cases, we used ROC (receiver operating characteristic) analysis to generate an index (e.g., vigor index) comparing two distributions of firing rates.For these indexes, which are derived from the area under an ROC curve, a value of 0.5 indicates that the two distributions are indistinguishable.We calculated a p-value for each index by randomly re-shuffling the data 1000 times (permutation test).
Within extinction sessions, we identified neurons that did or did not extinguish their cue-evoked responses using a one-way ANOVA with trial number as a continuous variable.This was applied to firing rates in the 500 ms following cue onset (cue-excited neurons) or within custom windows (cue-inhibited neurons).Extinguishing cells were defined by p-values , 0.01, along with a decreasing cue response (lower firing rate for cue-excited neurons, higher firing rate for cue-inhibited neurons).

Results
We used custom-built electrode arrays to record from the nucleus accumbens (NAc) core while rats (n ¼ 11) acquired a Pavlovian conditioned approach (PCA) task that typically elicits sign tracking (ST), goal tracking (GT), or a combination of these behaviors.ST behavior is represented by lever deflections, indicating interaction with the cue; GT behavior is represented by food magazine entries, indicating interaction with the site of reward.As we and others have done before (Meyer et al., 2012;Morrison et al., 2015;Gillis and Morrison, 2019), we quantified individuals' propensity for ST versus GT behavior using a composite PCA index (see Materials and Methods).The PCA index ranges from À1.0 (only GT behavior) to 11.0 (only ST behavior).By the end of training, subjects showed a range of ST and GT behavior, as illustrated by their average PCA index over the last 3 d of training (Fig. 1A); however, most subjects performed some degree of GT, resulting in a PCA index distribution that was negatively skewed.Therefore, we operationally defined "sign trackers" as those subjects with a PCA index greater than the population mean (Fig. 1A, arrowhead), and "goal trackers" as those with a PCA index less than the mean (n ¼ 5 sign trackers, n ¼ 6 goal trackers).This resulted in categorizing as sign trackers only those subjects having an appreciable amount of interaction with the lever.Over 7 d of training, the PCA index of sign trackers gradually increased, while the PCA index of goal trackers stayed relatively stable (Fig. 1B).

NAc excitations and inhibitions encode aspects of sign tracking and goal tracking behavior
We recorded from 169 neurons during the final (seventh) day of training, including 122 from a data set on which we have previously reported (Gillis and Morrison, 2019).Of these neurons, 79 (47%) exhibited cue-evoked excitatory responses (58 of which were analyzed in the prior report), and 47 (28%) exhibited cue-evoked inhibitory responses, similar proportions to those found in earlier studies (Morrison et al., 2017).Of the neurons with cue-evoked excitations, 29 were recorded in sign tracker individuals and 50 in goal trackers; of the neurons with cue-evoked inhibitions, 12 were recorded in sign trackers and 35 in goal trackers.The event-related activity of cue-excited and cue-inhibited neurons is summarized in Figure 1C-F: both excitations and inhibitions varied in intensity but were mainly phasic and brief, although inhibitions tended to be more sustained.Notably, most neurons with cue-evoked excitations also exhibited phasic excitatory responses to reward delivery (Fig. 1D), while most neurons with cue-evoked inhibitions, with a couple of prominent exceptions, exhibited phasic inhibitory responses to reward delivery (Fig. 1F).Histologic reconstruction (Fig. 2) shows that most recorded neurons were clearly in the NAc core, although a handful may have been on the border with shell.
We have previously shown that cue-evoked excitations encode information about the vigorthe speed and intensityof actions directed toward a reward-associated target in both an instrumental (Morrison et al., 2017) and Pavlovian (Gillis and Morrison, 2019) context.Notably, in the case of the PCA task, cue-evoked excitations encoded the vigor of both ST and GT behavior.In an instrumental context, cue-evoked inhibitions also encoded aspects of locomotor vigor, such as motion onset latency (Morrison et al., 2017).Therefore, we first asked whether cue-evoked inhibitions encode the vigor of ST and/or GT behavior during the PCA task.In Figure 3A-D, two example neurons recorded from the same subject on the last day of training demonstrate that both excitatory and inhibitory responses may be related to vigor.The cue-excited neuron shown in Figure 3A,B exhibits a stronger phasic response when the subsequent actions (in this case, magazine entries) are conducted with shorter latency.Similarly, the cue-inhibited neuron shown in Figure 3C,D exhibits a deeper, more complete inhibition when the subsequent magazine entry occurs with shorter latency.
In order to quantify vigor encoding across the neuronal population, we used ROC analysis to calculate a "vigor index" comparing neural activity on trials with short latency (,50th percentile) or long latency (!50th percentile) to first action (lever press or receptacle entry).A vigor index of 0.5 indicates no difference between neural responses in the two conditions.We previously reported that cue-excited NAc neurons encoded action latency as a population (Gillis and Morrison, 2019), indicated by a significant shift away from 0.5.In the current data set, the vigor index is no longer significantly shifted away from 0.5 (Fig. 3E; p ¼ 0.21, Wilcoxon signed-rank test), although the shift is still significant if only GT actions are considered (p ¼ 0.04; data not shown) or if a shorter time window (250 ms after cue onset) is analyzed (p ¼ 0.05; data not shown).Similarly, the vigor index for cue-evoked inhibitions is not significantly shifted away from 0.5 (Fig. 3F; p ¼ 0.18); however, a number of individual neurons have a vigor index ,0.5, indicating that they are more strongly inhibited before faster actions.Overall, this is consistent with the previously observed smaller, more variable contribution of vigor to activity in cue-inhibited neurons (Morrison et al., 2017).

Excitations and inhibitions in NAc evolve differently during acquisition of behavior
We next examined whether and how cue-related and reward-related excitations and inhibitions evolve over the course of training.We previously reported that cueevoked excitations in the NAc show little change over 7 d of training in both sign trackers and goal trackers (Gillis and Morrison, 2019), consistent with the observation that cue-evoked dopamine release changes only subtly, if at all, in outbred sign tracker rats (Flagel et al., 2011).In the current data set, among cue-excited neurons, we again found little difference between population average cueevoked activity on the first day versus the last day of training (Fig. 4A; p ¼ 0.72, Wilcoxon rank-sum test).Among cue-inhibited neurons, cue-evoked responses strengthened over the course of training in some time windows (Fig. 4B), although this difference was not significant when the magnitude of inhibition was calculated in customized time windows for each neuron (p ¼ 0.9).However, inhibitory responses became significantly better represented among the population, increasing from 15% to 28% of recorded neurons (Fig. 4C; The data shown in Figure 4D suggests some differences in the proportions of cue-excited and cue-inhibited neurons among sign tracker and goal tracker individuals.Indeed, sign trackers yielded a significantly higher proportion of cue-excited cells on the first day of training compared with goal trackers (p , 0.001, x 2 test).There was also a trend in this direction on the last day of training (p ¼ 0.08, x 2 test for "excited" vs "nonexcited" cells), although the proportions of the three cell types were not different overall (p ¼ 0.14).
As we have demonstrated previously (Gillis and Morrison, 2019), cue-evoked excitations were similar among sign tracker and goal tracker subjects on the last day of training (Fig. 5A); however, excitatory responses to reward differed markedly between the two (Fig. 5B), with sign trackers showing an attenuated response compared with goal trackers (p ¼ 0.004, Wilcoxon rank-sum test).This may be a downstream result of the difference in reward-evoked dopamine release between sign trackers and goal trackers (Flagel et al., 2011), which is thought to reflect differences between the groups in the use of a dopaminergic reward prediction error for model-free learning (Clark et al., 2012;Huys et al., 2014).This led us to ask whether these intergroup differences in dopamine dynamics might be reflected in cue-evoked and/or reward-evoked inhibitory responses in the NAc.On the contrary, we found little or no difference in inhibitory responses to either the cue (Fig. 5C) or, importantly, to reward (Fig. 5D), on the last day of training (reward response, p ¼ 0.3).This may be related to the relative insensitivity of cue-evoked inhibitions to dopaminergic manipulations (du Hoffmann and Nicola, 2014).
Cue-evoked excitations, but not inhibitions, are attenuated by extinction and reward devaluation We previously reported that about half of NAc cueevoked excitations "extinguish" in concert with behavior during an extinction session (Gillis and Morrison, 2019), although another subset retain their response to the cue even after behavior (sign tracking and/or goal tracking) is extinguished.Excitatory cue responses were slower to extinguish in sign trackers, consistent with the finding that sign tracking, compared with goal tracking, is relatively resistant to extinction (Ahrens et al., 2015).We reasoned that the attenuation of excitatory responses might be attributed, at least in part, to a reduction in NAc dopamine release in response to the extinguished cue (Sunsay and Rebec, 2014).Therefore, we wondered whether cue-evoked inhibitions, which are relatively insensitive to blockade of dopamine receptors (du Hoffmann and Nicola, 2014), might be more resistant to extinction than cue-evoked excitations.
The current data set includes 91 neurons recorded during an extinction session, including 63 that were cue-excited and 23 that were cue-inhibited.Two example cells, recorded from the same subject (a sign tracker) during the same extinction session, are shown in Figure 6A,B.The representative cue-excited cell (Fig. 6A) significantly decreases its firing rate over the course of 25 extinction trials (p , 0.001, one-way ANOVA), similar to the "extinguishing cells" that made up 46% of cue-excited neurons in a previous study (Gillis and Morrison, 2019), while the representative cue-inhibited cell (Fig. 6B) does not significantly decrease its response (p ¼ 0.18).This pattern was evident in the population as a whole: among cue-excited cells (Fig. 6C), the population average response significantly decreased over the course of extinction for both sign trackers and goal trackers (two-way ANOVA, main effect of trial number, F (1,1546) ¼ 125.5, p , 0.001).As previously observed, this decrease was slightly more gradual for sign trackers than goal trackers (Fig. 6E).Among cue-inhibited cells (Fig. 6D), on the other hand, inhibitions did not decrease in magnitude, instead remaining stable over the course of extinction (two-way ANOVA, main effect of trial number, F (1,546) ¼ 0, p ¼ 0.97).Notably, by the end of extinction, sign trackers' inhibitory responses were stronger than those of goal trackers (Fig. 6F; p ¼ 0.005, Wilcoxon rank-sum test).
Finally, we examined the responses of cue-excited and cue-inhibited neurons in the NAc to cues following reward devaluation, a manipulation that has been shown to reduce dopamine release in response to cues associated with the devalued outcome (Aitken et al., 2016;Papageorgiou et al., 2016).We and others have reported that, similar to extinction, sign tracking, compared with goal tracking, is relatively  insensitive to reward devaluation (Morrison et al., 2015;Keefer et al., 2020).On the other hand, some studies have shown that sign tracking can be diminished following reward devaluation under certain conditions, especially if the devaluation is conducted in the same context where the ST behavior is performed (Derman et al., 2018;Amaya et al., 2020).In the current experiment, we chose to perform a repeated "in-context" reward devaluation via taste aversion conditioning (similar to Amaya et al., 2020).Consistent with recent reports, we found that, under these conditions, both sign tracking and goal tracking responses were reduced to near zero (Fig. 7A).
We recorded from 46 neurons following reward devaluation; of these, 17 were algorithmically identified as cueexcited and 18 as cue-inhibited.Overall, we found that NAc excitatory responses to the cue were profoundly reduced during an extinction session preceded by reward devaluation (Fig. 7B) compared with an extinction session that was not (two-way ANOVA, devaluation status Â trial number, F (1,2072) ¼ 17.2, p , 0.001).Cue-evoked activity was significantly impacted by devaluation within the first five trials of the session (p , 0.001, Wilcoxon rank-sum test).Cue-evoked inhibitions, on the other hand, were not diminished following reward devaluation (Fig. 7C), just as they showed little to no effect of extinction (two-way ANOVA, devaluation status Â trial number, F (1,1121) ¼ 0, p ¼ 0.96).If anything, inhibitory responses became slightly more intense; however, they were not significantly different, over the first five trials, from inhibitory responses during an extinction session that was not preceded by reward devaluation (p ¼ 0.37, Wilcoxon rank-sum test).Likewise, population average activity showed that reward devaluation results in a profound decrease in cue-evoked excitation (Fig. 7D), but that cue-evoked inhibition remains virtually unchanged (Fig. 7E).Overall, NAc cue-evoked excitations closely tracked behavior as it diminished during extinction and/or reward devaluation, but cue-evoked inhibitions were uncoupled from behavior.

Discussion
The nucleus accumbens has often been thought of as a hub for integrating motivational information about a cue or context with cognitive information, such as goal selection, and promoting the appropriate motor response (Mogenson et al., 1980;Ikemoto and Panksepp, 1999;Nicola, 2010;Morrison et al., 2017).The activity of individual neurons in the NAc has been shown to encode information about the proximity and reward value of a cue that elicits approach, along with locomotor factors such as motion onset latency and movement speed (McGinty et al., 2013;Morrison and Nicola, 2014;Morrison et al., 2017).This is true in Pavlovian as well as instrumental tasks: in the context of the Pavlovian conditioned approach task used here, cue-evoked excitatory responses encode the vigor of both sign tracking and goal tracking (Gillis and Morrison, 2019), although the two behaviors are thought to result from different learning processes (Clark et al., 2012;Huys et al., 2014;Lesaint et al., 2014).
In the current study, we highlight key differences between the coding properties of NAc neurons with different signaling profiles: excitatory versus inhibitory responses to a cue that predicts reward.While studies have shown that ;50% of NAc neurons exhibit phasic excitations in response to a reward-associated cue, an additional 25-30% exhibit phasic inhibitions (Morrison et al., 2017).We were motivated by the observation that cue-evoked inhibitions, compared with excitations, are far less sensitive to dopaminergic manipulations such as dopamine receptor antagonists (du Hoffmann and Nicola, 2014).Therefore, we wondered whether inhibitions were less sensitive than excitations to behavioral differences thought to be related to different profiles of dopamine release (Flagel et al., 2011) sign tracking versus goal tracking, and/or behavioral manipulations, such as extinction and reward devaluation, that have been shown to affect dopamine release in the NAc (Sunsay and Rebec, 2014;Aitken et al., 2016;Papageorgiou et al., 2016) .
We found that NAc cue-evoked inhibitions become more prevalent over the course of training on a Pavlovian conditioned approach task, and that inhibitions, like excitations, may encode behavioral vigor.On the other hand, we found that behavioral differences that are associated with profound changes in NAc excitatory responses do not affect inhibitory responses in a similar way.For example, phasic inhibitions associated with reward delivery, unlike excitations, are not different between sign trackers and goal trackers.Moreover, cue-evoked inhibitions, unlike many cue-evoked excitations, are not affected by extinction of the association between cue and reward.Finally, using a reward devaluation procedure that abolishes both sign tracking and goal tracking behaviors (Amaya et al., 2020), we show that cue-evoked excitations become profoundly attenuated, whereas cueevoked inhibitions remain stable.Thus, as a population, cue-excited cells in the NAc closely track behavior; but cue-inhibited cells, at least once their response is established, do not.

NAc neural signaling during acquisition of conditioned behavior
Cue-responsive neurons in the NAc, including both cue-excited and cue-inhibited cells, were present from the very first day of training.As we have previously reported (Gillis and Morrison, 2019), cue-excited cells do not increase their cue-evoked response over the course of the first training session, although sign trackers (but not goal trackers) show a pronounced decrease in reward-evoked excitation.
The presence of cue-evoked excitations at the start of training is consistent with recent studies showing robust NAc dopamine release to novel cueseven when they are not (yet) associated with rewardwhich decreases as cues become familiar (Morrens et al., 2020;Kutlu et al., 2022).
We did not have a sufficient population of cue-inhibited neurons from sign trackers on the first day of training to evaluate whether their cue-evoked or reward-evoked inhibitions changed during early learning, although we did observe that at least some inhibitions were present from the earliest trials.Notably, however, unlike cue-evoked excitations (Gillis and Morrison, 2019), cue-evoked inhibitions increased in both intensity and representation across the population over the course of training.This implies that cue-inhibited NAc neurons exhibit plasticity related to the acquisition of ST and/or GT behaviors; this neural plasticity is unlikely to be related to changes in dopamine release, since inhibitions are largely insensitive to dopaminergic manipulation (du Hoffmann and Nicola, 2014).Recent evidence indicates that NAc cue-evoked excitations change in concert with behavior during instrumental learning in an NMDA receptor-dependent manner (Vega-Villar et al., 2019); additional experiments are needed to determine whether inhibitions also exhibit NMDAR-dependent plasticity.
We found that at least some cue-evoked inhibitions encode the vigor of subsequent ST and GT behavior, consistent with NAc cue-evoked inhibitions encoding such factors as motion onset latency during instrumental tasks (Morrison et al., 2017).The dual encoding of ST and GT behavior supports the view of the NAc as a hub for invigoration of behavior stemming from different forms of reward learning.Many authors have argued that ST is the result, at least in part, of a dopamine-dependent modelfree learning process, and that GT results from a separate, non-dopamine dependent process that utilizes the predictive qualities of the cue without transferring incentive salience from reward to cue (Clark et al., 2012;Lesaint et al., 2014Lesaint et al., , 2015)).The NAc appears to participate in promoting both types of motivated behavior via both excitatory and inhibitory responses to reward-associated cues.

Divergent relationships of excitatory and inhibitory NAc neural signals with behavior
In the current study, we did not directly measure or manipulate dopamine; however, others have observed changes in dopamine release specific to sign trackers, including enhanced NAc dopamine release in response to a reward-predictive cue over the course of training, along with a reduced response to reward delivery (Flagel et al., 2011).Goal trackers, in contrast, did not show major changes in NAc dopamine release during learning.If this is the case in the current subject population, then cue-excited neurons in the NAc might reflect these differences in dopamine release: their response to reward delivery is substantially reduced over the course of training among sign trackers, but not goal trackers, as we have reported previously (Gillis and Morrison, 2019).We found that cueinhibited cells in the NAc, on the other hand, do not follow this pattern: their reward-related responses do not diverge between sign tracker and goal tracker individuals.Thus, if there are differences in dopamine release between sign trackers and goal trackers, it does not seem to have a major impact on NAc inhibitory responses.This conclusion is in line with the observation that dopaminergic manipulations do not substantially impact NAc inhibitions, although they strongly affect excitations.Specifically, du Hoffmann and Nicola (2014) showed that blockade of either D1 or D2 receptors in the NAc led to a profound reduction in cueevoked excitations, but not inhibitions; indeed, in some cases, dopamine receptor antagonism resulted in an increase in inhibitory responding, apparently resulting from an unmasking of inhibition that had been obscured by excitatory responses.These results may also help explain our finding that NAc cue-evoked inhibitions, unlike excitations, are virtually unaffected by extinction of the cue-reward relationship.As we have previously shown (Gillis and Morrison, 2019), many neurons with cue-evoked excitations show a decrease in responding during extinction that corresponds closely to behavior.Notably, at the same time, a subset of neurons with cue-evoked excitations do not change their firing during extinction; we refer to these as "non-extinguishing" cells.In the current study, we found that nearly all cue-inhibited neurons may be categorized as non-extinguishing.Studies have shown that extinction of a Pavlovian conditioned cue correlates with reduced dopamine release in the NAc core (Sunsay and Rebec, 2014).which is likely a factor in the extinction of both behavior and NAc excitatory responses.In contrast, our data suggests that it does not affect cue-evoked inhibitory responses.Together with non-extinguishing excitatory responses, these signals might play a role in maintaining a latent association between cue and reward after extinction, contributing to such processes as spontaneous recovery and reinstatement (Todd et al., 2014).
Similarly, we found that NAc cue-evoked excitations are drastically reduced by a reward devaluation procedure, while cue-evoked inhibitions are virtually unaffected.Reward devaluation has been shown to reduce NAc dopamine release to a cue associated with the devalued reward (Aitken et al., 2016;Papageorgiou et al., 2016).Importantly, the suppression of cue-evoked excitatory signaling is apparent from the earliest trials following devaluation, and does not require the subject to experience any pairings of the cue with the devalued reward, similar to changes in behavior (Murray and Izquierdo, 2007;West and Carelli, 2016) and dopamine release (Aitken et al., 2016).Future experiments could investigate whether and when the devaluation effect on NAc excitatory cue encoding requires input from the orbitofrontal cortex (OFC) and/or basolateral amygdala; many studies have shown that these areas are essential parts of a neural circuit that links a cue to the new, lower value of its associated reward (Murray and Izquierdo, 2007;McDannald et al., 2014).
To our knowledge, only one other study has examined the impact of outcome devaluation on the activity of individual neurons in the NAc.West and Carelli (2016) trained rats on an instrumental task and used selective satiation to devalue the reward associated with a lever cue.They found that reward devaluation decreased cue representation, defined as the proportion of neurons exhibiting any cue response, in the NAc shell but not the NAc core.In the current study, we too saw only a small reduction in the number of cue-responsive neurons after reward devaluation.However, we also observed a large shift in the proportions of excitatory versus inhibitory responses: in our dataset, inhibitory responses became relatively overrepresented after reward devaluation.Surprisingly, West and Carelli (2016) also found little or no decrement in the magnitude of excitatory responses in the NAc core.There are several differences in approach that might contribute to the apparent discrepancy.Both the use of an instrumental task with multiple levers (i.e., multiple possible approach targets), and the use of a more temporary method, selective satiety, to devalue one of two possible rewards, might account for differences in cue encoding in the NAc core.Indeed, there is evidence that NAc signaling is not especially sensitive to short-term changes in cue-reward contingency (Morrison and Nicola, 2014;Morrison et al., 2017).Finally, the use of reinforcer devaluation performed outside of the task context might have contributed to relatively weak effects on signaling in the NAc core.
The behavioral effects of reward devaluation in the current study contrast with some of our earlier findings (Morrison et al., 2015;Rode et al., 2019), as well as those of others (Nasser et al., 2015;Patitucci et al., 2016;Vandaele et al., 2017), that ST behavior and/or sign tracker individuals are relatively insensitive to changes in reward value.These findings have been complicated by other reports (Derman et al., 2018;Amaya et al., 2020) showing that sign tracking can be sensitive to reward devaluation under certain circumstances.In particular, Amaya et al. (2020) reported that ST behavior was sensitive to devaluation that took place in the testing context, but not outside of it.In the current study, we used an in-context devaluation along with multiple (four to five) pairings of the reward with LiCl and found that this procedure profoundly reduced both ST and GT behavior.It is important to note that these results are not inconsistent with the idea that sign tracking is less sensitive to devaluation than goal tracking.Multiple groups have shown that GT behavior, in contrast to ST, is affected by even "weak" forms of reward devaluation, including devaluation that takes place in a different context (Morrison et al., 2015) and/or devaluation that is accomplished by selective satiation (Patitucci et al., 2016;Kochli et al., 2020;Keefer et al., 2022) rather than taste aversion conditioning.
Evidence from multiple studies indicates that sign tracking and goal tracking are the products of distinct, but partially overlapping, behavioral and neural processes.Within the NAc, ST and GT individuals show stark differences in dopamine dependence and patterns of dopamine release (Flagel et al., 2011;Saunders and Robinson, 2012), as well as differences in the encoding of reward over training (Gillis and Morrison, 2019), but these differences are observed in some neurons (cue-excited cells) and not others (cue-inhibited cells); moreover, cue-evoked responses seem to promote vigorous behavioral responding in both ST and GT modes.Further dissection of the neural circuits that support sign tracking, goal tracking, or both may shed light on their emerging relationship with drug abuse and addiction vulnerability.

Figure 1 .
Figure 1.Behavior and overview of NAc neural activity.A, PCA index for each subject calculated as an average over the last 3 d of training.Arrowhead, population mean.B, Average PCA index over 7 d of training for sign trackers (blue) and goal trackers (magenta).Error bars, SEM.C-F, Average normalized activity of neurons with an excitatory cue response (C, D; n ¼ 79) or an inhibitory cue response (E, F; n ¼ 49) around the time of cue onset (C, E) or reward delivery (D, F) on the last day of training.Activity is calculated in 10-ms bins with no smoothing.

Figure 2 .
Figure 2. Histologic reconstruction of recording locations.Panels are coronal sections(Paxinos and Watson, 2007) showing the location of electrode tips or approximate center of electrode bundles.Locations are derived from electrolytic lesions and/or electrode tracks.Numbers are distance in mm from bregma.Red, recording locations for subjects new to the current data set.All new locations represent the center of an electrode bundle.Black, recording locations for subjects fromGillis and Morrison (2019).There are disproportionately more black dots because of the use of circular arrays rather than electrode bundles.

Figure 3 .Figure 4 .Figure 5 .
Figure3.The vigor of sign tracking and goal tracking may be represented by both excitatory and inhibitory cue-evoked responses.A-D, Example of a neuron that shows a stronger excitatory response (A, B) and a neuron that shows a stronger inhibitory response (C, D) when the cue is followed by a magazine entry with faster (A, C) or slower (B, D) latency.Left panels, magazine latency .50thpercentile; right panels, magazine latency ,50th percentile.Trials are shown chronologically with the earliest trials on top.Blue dots, cue onset.Magenta triangles, magazine entry.E, F, Vigor index for latency to first action (lever press or magazine entry) for all cue-excited neurons (E; similar toGillis and Morrison, 2019; their Figure3E) and cue-inhibited neurons (F).Blue represents vigor index significantly different from 0.5 (p , 0.05, permutation test).Arrowhead indicates population median.

Figure 6 .
Figure6.Inhibitory cue-evoked responses, unlike excitatory responses, do not extinguish in concert with behavior.A, B, Representative examples of a cue-excited neuron that extinguishes its response during extinction (A) and a cue-inhibited neuron that does not (B).The two neurons were recorded from the same subject during the same session.Trials are shown chronologically with the earliest trial on top.Blue dots, cue onset.Cyan triangles, lever presses.C, D, Population average normalized activity aligned on cue onset among cue-excited cells (C) and cue-inhibited cells (D) during the first five trials (cyan/orange) or last five trials (blue/red) of an extinction session.E, F, Population average normalized activity among sign trackers (blue) and goal trackers (magenta) calculated in five-trial bins over the course of extinction for cue-excited cells (E) and cue-inhibited cells (F).Panel E is similar toGillis and Morrison (2019; their Figure 6E).For excitations, activity is from the first 1 s after cue onset; for inhibitions, activity is calculated in custom time windows for each cell.Error bars, SEM.Asterisk indicates activity is significantly different in sign trackers versus goal trackers (Wilcoxon rank-sum test, p , 0.05; dagger, p ¼ 0.08).

Figure 7 .
Figure 7. Excitatory cue responses, but not inhibitory cue responses, are sharply reduced following reward devaluation.A, Behavioral responses, averaged in five-trial bins, over the course of an extinction session that is preceded by reward devaluation (orange) or is not preceded by reward devaluation (blue).Solid lines, lever presses; dashed lines, magazine entries.Error bars, SEM.B, C, Population average normalized activity among cue-excited cells (B) or cue-inhibited cells (C), averaged in five-trial bins, over the course of an extinction session that is preceded by reward devaluation (orange) or is not preceded by reward devaluation (blue).Error bars, SEM.Asterisks indicate activity is significantly different in devalued versus nondevalued subjects (Wilcoxon rank-sum test, p , 0.05).D, E, Population average normalized activity aligned on cue onset among cue-excited cells (D) or cue-inhibited cells (E) during an extinction session.Orange, extinction session preceded by reward devaluation; blue, extinction session not preceded by reward devaluation.Shading indicates SEM.

Table 1 :
Detailed statistics for all analyses in the manuscript