A Transient Dopamine Signal Represents Avoidance Value and Causally Influences the Demand to Avoid

Abstract
While an extensive literature supports the notion that mesocorticolimbic dopamine plays a role in negative reinforcement, recent evidence suggests that dopamine exclusively encodes the value of positive reinforcement. In the present study, we employed a behavioral economics approach to investigate whether dopamine plays a role in the valuation of negative reinforcement. Using rats as subjects, we first applied fast-scan cyclic voltammetry (FSCV) to determine that dopamine concentration decreases with the number of lever presses required to avoid electrical footshock (i.e., the economic price of avoidance). Analysis of the rate of decay of avoidance demand curves, which depict an inverse relationship between avoidance and increasing price, allows for inference of the worth an animal places on avoidance outcomes. Rapidly decaying demand curves indicate increased price sensitivity, or low worth placed on avoidance outcomes, while slow rates of decay indicate reduced price sensitivity, or greater worth placed on avoidance outcomes. We therefore used optogenetics to assess how inducing dopamine release causally modifies the demand to avoid electrical footshock in an economic setting. Increasing release at an avoidance predictive cue made animals more sensitive to price, consistent with a negative reward prediction error (i.e., the animal perceives they received a worse outcome than expected). Increasing release at avoidance made animals less sensitive to price, consistent with a positive reward prediction error (i.e., the animal perceives they received a better outcome than expected). These data demonstrate that transient dopamine release events represent the value of avoidance outcomes and can predictably modify the demand to avoid.


Introduction
Dopamine release events within the mesocorticolimbic pathway actively guide behavior in positive reinforcement, where behavior is strengthened by the occurrence of a rewarding event (Schultz et al., 1997;Glimcher, 2011). A growing literature shows that a transient dopamine signal represents reward value and causally modifies the valuation of reward (Schelp et al., 2017;Schultz et al., 2017). It was first demonstrated that bursts of dopamine neural activity occur when animals are presented with an unexpected reward or a reward predictive cue but are suppressed when reward is withheld. These observations led to the view that dopamine neurons represent a reward prediction signaled by the cue (Schultz et al., 1997). It was then shown that the extent of the dopamine response corresponds to predicted reward magnitude (Tobler et al., 2005;Enomoto et al., 2011;Lak et al., 2014;Stauffer et al., 2014;Schelp et al., 2017), suggesting that dopamine actually represents a value prediction signaled by the cue (Schelp et al., 2017;Schultz et al., 2017).
An extensive body of literature suggests that dopamine also plays a role in avoiding harm (i.e., negative reinforcement). Dopamine terminal lesions (McCullough et al., 1993) and dopamine receptor antagonists (Arnt, 1982;Wadenberg et al., 1990) impair avoidance of electrical footshock while genetic restoration of dopamine within the striatum of otherwise dopamine-deficient mice is necessary to maintain avoidance (Darvas et al., 2011). Microdialysis studies have demonstrated that striatal dopamine concentration is increased during the avoidance of footshock and in postsurgical pain relief (McCullough et al., 1993;Navratilova and Porreca, 2014). More recently, studies using fast-scan cyclic voltammetry (FSCV) showed that transient increases in dopamine concentration occur in response to cues signaling the opportunity to avoid electrical footshock (Oleson et al., 2012;Gentry et al., 2016). While these studies indicate a role for dopamine in avoidance, it remains controversial whether dopamine plays a role in the valuation of avoidance. Indeed, a notable electrophysiology study concluded that dopamine neurons predominantly encode the value of positive reinforcement (Fiorillo, 2013). Similarly, a recent FSCV study found that transient dopamine release events encode ultrasonic vocalizations associated with a positive, but not a negative, affective state, indicating that the behavioral response to negatively reinforcing stimuli is regulated by distinct brain areas (Willuhn et al., 2014).
To investigate the role of dopamine in the valuation of avoidance, we used a behavioral economics task in which rats were presented with an avoidance predictive cue and provided the opportunity to avoid the onset of electrical footshock by responding on a lever. We then increased the price of avoidance by increasing the number of lever presses required to avoid footshock at fixed intervals over the course of a session. To measure changes in price sensitivity, we generated demand curves to model avoidance as a function of price. The rate at which these demand curves decay depicts price sensitivity and allows for inference of the worth an animal places on an outcome.
If, as in reward seeking (Schultz et al., 2015), dopamine represents the value to avoid aversive outcomes, then dopamine concentration at avoidance-predictive cues should decrease as the price to avoid increases. Furthermore, optically increasing release at cue presentation should increase price sensitivity. In theory, artificially augmenting release at cue presentation would lead the animal to expect a better outcome. When the animal's expectation is negatively violated by the recurrence of the same amplitude of footshock, the animal becomes more sensitive to price because the worth of avoidance is diminished. By contrast, optically increasing release at successful avoidance should increase the demand to avoid. In this case, we infer that the worth of avoidance is increased because the animal perceives that they received a better bargain than predicted.
As hypothesized, dopamine scaled inversely with cost; however, both dopamine and avoidance were concurrently attenuated at session onset. Augmenting dopamine release at an avoidance predictive cue rendered animals more sensitive to avoidance costs. We attribute this finding to a negative reward prediction error, whereby the animal perceives they received a worse value than anticipated when they receive the same amplitude of electrical footshock. Increasing release at successful avoidance made animals less sensitive to avoidance costs. We attribute this finding to a positive reward prediction error, whereby the animal perceives they received a better value than anticipated despite still receiving footshock in other trials. From these data, we conclude that dopamine release events represent the value of avoidance and causally modify the demand to avoid.

Subjects and surgeries
Male Long-Evans rats provided by Charles River Laboratories as well as transgenic rats (LE-Tg(TH-Cre)3.1Deis) expressing Cre-recombinase under the tyrosine hydroxylase (TH) promoter (TH::Cre±) supplied by the Rat Resource and Research Center were used as subjects. Rats were singly housed and maintained on a 12/12 h light/dark cycle with the dark cycle beginning at 10 A.M. All experiments were conducted during the dark (active) cycle with food, water, and crinkle paper enrichment provided ad libitum. Surgeries were conducted at 300-350 g using Kopf stereotaxic equipment with rats anesthetized at 5% isoflurane and maintained at 2 ± 1%. For FSCV, rats were implanted with a microdialysis guide cannula (BASi) targeted at the nucleus accumbens (NAcc) core (+1.3 AP, +1.4 ML) of the right hemisphere and a contralateral Ag/AgCl reference electrode. For optogenetic surgeries, rats were unilaterally transfected with 4 µl of a Cre-dependent virus (rAAV2/EF1a-DIO-hChR2(H134R)-EYFP; UNC Vector Core) targeted at the ventral tegmental area (VTA). Viral aliquots of 1 µl each were infused at four areas surrounding the VTA (-5.2 AP and -6.0 AP, -0.5 ML, -7.4 AP and -8.4 AP) at a rate of 50 nl/min. A fiber optic cannula (ThorLabs; 200 µm core) was then implanted unilaterally, directed at the VTA (-5.6 AP, +0.5 ML, -7.9 DV). These methods were used for both the transgenic rats and their wild-type (WT) counterparts, which served as controls in optogenetic experiments. After surgery, rats were given >3 d to recover, during which time each received a daily 3-ml intraperitoneal injection of a 1% carprofen solution to reduce inflammation and postoperative pain. All animal procedures were performed in accordance with the University of Colorado Denver animal care committee's regulations.

Behavioral tasks
Rats were maintained on a daily training schedule (7 d/week) within operant boxes (Med-Associates) outfitted with footshock grid floors. Rats were first trained to escape electrical footshock with each daily escape-only session lasting 15 min. At the onset of each session, both an active and an inactive lever were extended, a cue light placed above the active lever was illuminated, and 0.5-mA electrical current (i.e., footshock) was applied to the grid floor of the operant chamber. A response on the active lever terminated the ongoing footshock and allowed the rat to escape into a 30-s safety period accompanied by a tone, while a response on the inactive lever had no effect. This safety signal played for the entirety of the safety period, the end of which coincided with the onset of the next escape trial. In this initial training task, escape behavior was shaped by a researcher who could simulate a lever response using a wireless keyboard, thereby reinforcing a series of behaviors that would lead to the acquisition of operant escape. The experimenter first shaped the animal to the quadrant of the operant box containing the extended lever. They then reinforced rearing behavior in front of the lever before finally requiring the rat to respond on the lever to self-terminate electrical footshock. This training persisted until the animal demonstrated an association between lever response and shock termination through consistent and researcher-independent escape (>20 sequential escape responses).
Following the acquisition of escape behavior, animals were moved into a daily 1-h avoidance task in which reinforcement was maintained under a fixed ratio 1 (FR1) schedule of reinforcement. This task added the potential for an avoidance outcome. Each session was initiated by the extension of both an active and an inactive lever and the illumination of a cue light placed directly above the active lever. In each trial, rats were given 1 s from cue presentation to respond on the active lever to successfully avoid footshock. If rats failed to respond, recurrent footshock (0.5 s, 0.5 mA every 1 s) was applied until a lever response was made to escape further shock. Responses made on the inactive lever had no effect on behavioral outcome. Rats remained on this task until they consistently (more than or equal to three sessions) performed ≥50% avoidance.
After successful acquisition of 50% avoidance under an FR1 schedule, rats moved into a behavioral economics-based shock avoidance task. Here, the unit-price (response requirement/mA shock avoided) increased throughout each session by increasing the response requirement to either avoid or escape footshock. Within this task, the unit-price epoch duration increased in length to allow 20 avoidance opportunities and provide sufficient time to meet increasing response requirements as well as to account for safety period duration following avoidance or escape responses (Fig. 1A, column 4). Failure to meet the response requirement on the active lever within the allotted time (Fig. 1A-C, column 3) resulted in the onset of a 0.5-s, 0.5-mA footshock and reset any lever responses made before footshock onset to zero. Footshock recurred until the response requirement was fully met or 15 sequential footshocks were received and the session terminated. Similarly, 20 sequential escape responses resulted in session termination. Rats were first trained to perform multiple lever presses to avoid footshock across six unit-prices (Fig. 2B). Once animals demonstrated independent multiple lever press responding, they were moved into an extended, 16 unit-price economic task until they acquired stable response output across daily sessions (Fig. 2C). Acquisition was defined by rats reaching a final unit-price (within a range of three price points) for three consecutive days, without showing any ascending or descending trend (Fig. 3D,E).
Unit-price was additionally manipulated through changes in the mA of shock received. Within this task, the response requirement was held constant at an FR2 and rats were given 2 s to respond before the onset of a 0.5-s footshock across six unit-prices (Fig. 1A, upper). Within this task animals were presented with descending unit-prices (ascending mA shock) to prevent avoidances in higher unit-prices due to elevated mA shock amplitude in lower unit-prices. Safety period duration was additionally modulated to address opportunity cost confounds in the response requirement task and unit-price epoch duration was similarly modulated to allow for 20 avoidance opportunities at each unit-price. Utilizing this task, response output failed to scale in response to changing unit-price (Fig. 1A, lower).
We further attempted to address order effect confounds within the ascending response requirement task by randomizing the presentation of five unit-prices within a single session (Fig. 1, upper). Randomization of unit-price resulted in inconsistent behavioral output and preemptive extinction of avoidance (lower).
Within all tasks, the cue was defined as the presentation of a lever and the illumination of a light above the lever, accompanied by white noise. Either an escape or an avoidance response resulted in the retraction of the lever, the dimming of the light, and termination of the white noise, as well as the beginning of the 30-s safety tone and illumination of a secondary house light for the duration of the safety period.
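The unit-price defined above (response requirement divided by mA of shock avoided) can be illustrated with a short Python sketch. The function name and the specific response requirements and shock amplitudes listed are illustrative assumptions, not the schedule used in the study.

```python
def unit_price(response_requirement, shock_ma):
    """Unit-price in responses per mA of shock avoided."""
    return response_requirement / shock_ma

# Price raised through the response requirement at a fixed 0.5-mA shock
# (these requirements are hypothetical examples).
ascending = [unit_price(fr, 0.5) for fr in (1, 2, 4, 8)]

# Price raised through shock amplitude at a fixed FR2 requirement
# (1.0 and 0.5 mA chosen only for illustration).
by_amplitude = [unit_price(2, ma) for ma in (1.0, 0.5)]
```

Under this definition, a single lever press at 0.5 mA corresponds to 2 resp/mA, and two presses correspond to 4 resp/mA.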

FSCV
FSCV recordings (n = 7) were performed during the economics-based task. For these recordings, glass carbon-fiber electrodes were lowered into the NAcc using micromanipulators (University of Illinois at Chicago; Schmidt) and locked into place at a depth where transient dopamine release events were apparent. Electrodes were first cycled at 60 Hz in vivo for 30-40 min before being reduced to 10 Hz for data collection. An initial waveform (-0.4 to 1.3 V, tarheelCV filtered with cutoff frequency of 2 kHz for a scan rate of 400 V/s) was applied, which allowed for the detection of dopamine via FSCV scans taken every 100 ms. To extract the dopamine component, principal component regression (PCR) was applied to the raw voltammetric data as previously described (Heien et al., 2004). Specifically, dopamine and pH were resolved from the FSCV recordings using recording-specific training sets (n = 7/analyte) to produce pH background-subtracted (10 consecutive scans) dopamine concentration files. To increase the validity of calibration factors for dopamine assessment, we applied a recently developed computational model (Roberts et al., 2013) designed to calculate calibration factors for individual electrodes by applying known constants to background current values from each in vivo recording. By replicating Roberts et al. (2013), using 10 electrodes, we obtained a set of empirical values using multiple linear regression analysis. Our lab-specific coefficients are: α = 4.71e-5, β = 17.185, γ = 8.324, δ = -0.656. Using these coefficients, we can calculate calibration factors for individual electrodes used in vivo by simply entering the observed total background current and the switching potential used for each individual recording. For additional information, see the supplemental information of Schelp et al. (2017).
We used different approaches to analyze dopamine concentration at the avoidance predictive cue versus throughout the safety period. The concentration of dopamine at the presentation of the cue preceding avoidance was defined as the peak concentration ±0.5 s surrounding the event. Avoidance predictive cue-associated dopamine concentrations from individual trials were aggregated, split according to avoidance or escape outcome, and averaged across all animals per unit-price for analysis (Fig. 4A,C). Voltammograms which contained excessive electrical noise were excluded from this analysis.
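The cue-associated measure described above (peak concentration within ±0.5 s of the event) can be sketched in Python as follows. The function name, the list-based trace, and the assumption of one sample every 100 ms are illustrative; this is not the analysis code used in the study.

```python
def cue_peak(trace, event_index, sample_hz=10, half_window_s=0.5):
    """Peak concentration within +/- half_window_s of the cue sample."""
    half = int(round(half_window_s * sample_hz))   # samples on each side of the event
    lo = max(event_index - half, 0)                # clip the window at the trace start
    hi = min(event_index + half + 1, len(trace))   # clip the window at the trace end
    return max(trace[lo:hi])

# Hypothetical concentration trace; the cue occurs at sample 4.
example_trace = [0, 1, 2, 3, 4, 9, 4, 3, 2, 1, 0, 12]
example_peak = cue_peak(example_trace, event_index=4)
```

The large value at the last sample falls outside the window and is therefore ignored.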
As previously described (Oleson et al., 2012), dopamine concentration begins to increase before cue presentation in tasks where the animal can anticipate the timing of its presentation. Because varying the duration of the safety period is known to alter avoidance (Sidman, 1953), and theoretically would alter the valuation of avoidance, we opted to use a fixed, 30-s safety period for the behavioral economics task. As such, it should be noted that the dopamine concentration at the warning signal likely results from both anticipation of the warning signal and its actual presentation.
Figure 1.
Unit-price randomization and manipulation of unit-price through changing mA shock avoided prevents establishment of baseline avoidance performance during a within session design in which unit-price epoch duration was modulated to allow 20 avoidance opportunities at each unit-price. A, The first six unit-prices, achieved through decreases in the mA of shock avoided (upper), were presented in descending order to animals and avoidance behavior was assessed. Representative avoidance with mA shock manipulation demonstrates variability and failure of avoidance to scale with increasing price (lower). B, Ascending unit-price through increase in response requirement (upper) demonstrates a unique, but stable avoidance demand curve with increasing unit-price (lower). C, The first five unit-prices, achieved through increasing response requirement (upper), were randomly presented to animals, and avoidance behavior was assessed; representative avoidance behavior demonstrates variability between sessions and failure to establish a baseline of avoidance (lower).

The pattern of dopamine release during the safety period was distinct as this period is not transient; rather, each safety period lasted 30 s in duration. Thus, to more accurately assess dopamine transient activity during the safety period, rather than analyzing mean dopamine concentration at the onset of the 30-s safety signal, we performed an analysis of the amplitude of individual transient events throughout this period. A custom MATLAB program previously described by Schelp et al. (2018) was used to fit a polynomial line to individual 30-s safety period dopamine concentration traces. For each individual safety period trace, a polynomial line shifted down by one-third of an SD was fit to the dopamine concentration trace as baseline. This SD was then multiplied by 3 and added to the baseline polynomial line to generate a first fit threshold line. Any transients which reached above this threshold line were considered true transients and these concentrations were averaged across each unit-price for both avoidances and escapes (Fig. 5C-E). Notably, this program differed from Schelp et al. (2018) in the calculation of the SD. Within this program, a region of 2-3 s without dopamine was declared as a baseline to determine the SD of the background signal. Dopamine concentrations during the safety-associated cue from individual trials were aggregated, split according to avoidance or escape outcome, and averaged across all animals per unit-price for analysis (Fig. 6A,C). Voltammograms that contained excessive electrical noise were excluded from this analysis.
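The thresholding procedure for safety-period transients can be sketched in Python with NumPy. This is a minimal illustration, not the MATLAB program used in the study: the polynomial degree, the function name, the choice to report the peak concentration of each supra-threshold run, and the synthetic trace are all assumptions.

```python
import numpy as np

def detect_transients(trace, quiet, sample_hz=10, degree=3):
    """Return the peak concentration of each supra-threshold event."""
    trace = np.asarray(trace, dtype=float)
    t = np.arange(trace.size) / sample_hz              # time in seconds
    fit = np.polyval(np.polyfit(t, trace, degree), t)  # polynomial fit to the whole trace
    sd = np.std(trace[quiet], ddof=1)                  # SD of a dopamine-free stretch
    baseline = fit - sd / 3.0                          # polynomial shifted down by 1/3 SD
    threshold = baseline + 3.0 * sd                    # baseline plus 3 SD
    above = trace > threshold
    peaks, start = [], None
    for i, flag in enumerate(above):                   # group contiguous samples into events
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            peaks.append(float(trace[start:i].max()))
            start = None
    if start is not None:
        peaks.append(float(trace[start:].max()))
    return peaks

# Synthetic 30-s trace at 10 Hz: slow drift, small alternating noise, two brief events.
idx = np.arange(300)
example = 10.0 + 0.001 * idx + 0.05 * (-1.0) ** idx
example[100:103] += 1.0
example[200:203] += 0.8
example_peaks = detect_transients(example, quiet=slice(0, 25))
```

Here `quiet` marks the first 2.5 s as the dopamine-free region used to estimate the SD; in practice that region would be chosen per trace.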

Fitting demand curves to avoidance
Demand curves depict the relationship between the consumption of a commodity (in this case avoidance) and changing unit-price. Because demand decreases or decays with increasing price, the desirability of a given commodity may be inferred based on the relative rate of decay of a demand curve. More rapidly decaying demand curves indicate lower demand, or heightened sensitivity to increasing price, while more gradually decaying demand curves indicate greater demand for a given commodity, or reduced sensitivity to increasing price. In order to generate demand curves depicting avoidance as a function of increasing price, the avoidance responding observed within the task was fit to an exponentially decaying model through a custom MATLAB script using nonlinear least squares. Within this analysis, total avoidance per unit-price epoch was graphed on the y-axis against unit-price on the x-axis.
Analysis revealed that exclusion of unit-price 2 resp/mA, at which attenuated avoidance responding was observed, generated lower reduced chi-square (χ²ν) values; this unit-price was therefore not included in these fits. Within this equation, the variable α, which describes the rate of decay of the avoidance demand curve, was used to contrast changes in price sensitivity across optogenetic manipulations (Fig. 7).
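Because the fitted equation is not reproduced in this text, the following Python sketch assumes the simplest exponentially decaying form, Q(P) = Q0 · exp(-α · P), as a stand-in; the model actually used in the study may differ. The function name, search bounds, and synthetic data are likewise illustrative. For a fixed α the best-fitting Q0 has a closed form, so the nonlinear least-squares problem reduces to a one-dimensional search over α (a coarse grid followed by golden-section refinement).

```python
import math

def fit_exponential_demand(prices, avoidances, alpha_lo=1e-4, alpha_hi=1.0, grid=200):
    """Least-squares fit of Q(P) = Q0 * exp(-alpha * P); returns (Q0, alpha)."""
    def best_q0(alpha):
        # For fixed alpha, the least-squares Q0 is a simple ratio of sums.
        e = [math.exp(-alpha * p) for p in prices]
        return sum(q * ei for q, ei in zip(avoidances, e)) / sum(ei * ei for ei in e)

    def sse(alpha):
        q0 = best_q0(alpha)
        return sum((q - q0 * math.exp(-alpha * p)) ** 2
                   for p, q in zip(prices, avoidances))

    # Coarse grid search to bracket the minimum.
    step = (alpha_hi - alpha_lo) / grid
    alphas = [alpha_lo + i * step for i in range(grid + 1)]
    i_best = min(range(grid + 1), key=lambda i: sse(alphas[i]))
    lo, hi = alphas[max(i_best - 1, 0)], alphas[min(i_best + 1, grid)]

    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        a = hi - phi * (hi - lo)
        b = lo + phi * (hi - lo)
        if sse(a) < sse(b):
            hi = b
        else:
            lo = a
    alpha = (lo + hi) / 2
    return best_q0(alpha), alpha

# Synthetic avoidance counts generated from known parameters (hypothetical values);
# the 2 resp/mA price is left out, mirroring the exclusion described above.
prices = [2 * i for i in range(2, 17)]
avoidances = [20.0 * math.exp(-0.05 * p) for p in prices]
q0_hat, alpha_hat = fit_exponential_demand(prices, avoidances)
```

A larger fitted α corresponds to a faster-decaying curve, that is, greater price sensitivity.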

Figure 2.
Shock avoidance paradigm and behavioral economics methodology. A, An avoidance predictive cue, characterized by the presentation of a lever and the illumination of a cue light, is presented to signal the opportunity for an animal to respond on the lever to avoid. If a response is made within a set interval (B, C, column 3), the animal does not receive footshock. This is considered an avoidance (blue). If the animal fails to meet the response requirement within a set interval, the animal receives recurrent footshock until the full response requirement is met and the animal escapes further shock (red). B, To increase the cost of avoidance, unit-price (column 1: defined as response requirement/mA shock) increases through an increase in response requirement (column 2). Animals were initially presented six unit-prices to train multiple lever press responding with additional time provided to meet higher response requirements before shock onset (column 3). The epoch length increases in duration to allow for 20 avoidance opportunities at every unit-price (column 4). C, On acquisition of multiple lever press responding, animals were moved to the economic-based task wherein 16 ascending unit-prices were presented. Here, seconds to respond before shock onset (column 3) was restricted relative to the six unit-price training task. D, Timeline depicting animals' progression through active operant avoidance training. A total of 38 animals were first trained to respond to escape unavoidable footshock. The 37 animals that fully acquired escape behavior moved into an FR1 avoidance task until rats consistently (more than or equal to three sessions) performed avoidance at ≥50% avoidance performance. On acquisition of avoidance behavior, animals were trained to meet increasing response requirements across six unit-price epochs. Of the 37 animals introduced to this task, 26 demonstrated avoidance at more than or equal to three unit-prices, taking an average of 13.9 ± 3.1 sessions to acquire, and proceeded to the next task. Animals were moved into an economic footshock avoidance task with 16 ascending unit-prices. Here, seconds (column 3) to respond before shock onset was restricted relative to the training task. On establishing a stable, baseline rate of avoidance (average, 19.5 ± 2.9 sessions), animals were placed either in the optogenetics or FSCV group.

Optogenetic manipulation
To allow sufficient expression of the channelrhodopsin-2 (ChR2) protein, both the transgenic TH-Cre group (n = 10) as well as the WT control group (n = 10) were given 30 d following surgery before optogenetic stimulation. Both groups received intracranial blue light (473 nm) delivered at 10 pulses, 20 Hz, 0.5-s duration from a laser (optoengine) controlled by a custom Arduino system (Ng-Evans). Laser output was determined through the Stanford brain transmission calculator to produce a 1-mm cone of light of 1 mW/mm² within the brain tissue with a 15-mW output from the ferrule cannula tip and was designed to encompass exclusively the VTA dopamine cell bodies. Stimulation was unilaterally applied to the right hemisphere in all animals. These stimulation parameters have previously been used within the lab and have demonstrated successful augmentation of dopamine within the NAcc (Schelp et al., 2017). Within the economic-based avoidance task, rats first established a baseline rate of avoidance across three sessions. Following baseline performance, in a counterbalanced design, rats received optogenetic stimulation for three sessions either at the presentation of the avoidance predictive cue, or when the animal fully met the response requirement and successfully avoided footshock (referred to as "cue" and "avoidance" stimulation, respectively). Following cue or avoidance stimulation, rats reestablished baseline behavior across three sessions, after which the animals received stimulation under the complementary paradigm (cue or avoidance) across three sessions (Fig. 8A). Behavior was averaged and contrasted according to baseline, cue, or avoidance paradigm in both the TH-Cre and WT group (Fig. 9).
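As a quick consistency check on the stimulation parameters, the following Python sketch lays out the onset time of each pulse in a 10-pulse, 20-Hz train. The function name is hypothetical, integer millisecond arithmetic is assumed (the rate must divide 1000 evenly), and pulse width is not modeled because it is not stated here.

```python
def pulse_onsets_ms(n_pulses=10, rate_hz=20):
    """Onset time (ms) of each pulse in a fixed-rate train."""
    period_ms = 1000 // rate_hz          # 50 ms between pulse onsets at 20 Hz
    return [i * period_ms for i in range(n_pulses)]

onsets = pulse_onsets_ms()
# Ten 50-ms periods span the stated 0.5-s train duration.
train_duration_ms = len(onsets) * (1000 // 20)
```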

Behavioral attenuation
The initial attenuation of behavior (Fig. 10A) and the cue-associated dopamine concentration at 2 resp/mA were additionally analyzed. Cue-associated dopamine concentrations were measured for individual trials at both 2 resp/mA and 4 resp/mA. Averaged concentrations at these unit-prices were contrasted for both avoidance and escape outcomes (Fig. 10B,C). Total avoidance at 2 resp/mA was further analyzed in response to optical stimulation at the cue and upon successful avoidance (Fig. 10D).

Locomotor assessment
To assess for locomotor changes as a result of optogenetic stimulation, transgenic TH-Cre rats (n = 8) were placed in locomotor chambers and their movement was tracked using Med-Associates Activity Software. Rats first acclimated to the chambers over the course of 1 h before immediately moving into either a 1-h "stimulation" paradigm wherein they received optogenetic stimulation every 30 s, paralleling the frequency of stimulation animals receive within the economic-based avoidance task, or a "baseline" paradigm wherein rats attached to a mock fiber optic patch cable received no stimulation. These conditions were counterbalanced against each other and total distance moved every 5 min was analyzed for both conditions (Fig. 11D).

Histology
On completion of optogenetic experiments, animals were anesthetized with a ketamine/xylazine mixture and transcardially perfused with 0.01 M PBS followed by a 4% paraformaldehyde solution. The brains were harvested and rested in a 4% paraformaldehyde solution for 24 h before soaking in a 30% sucrose solution for 48 h, after which the neural tissue was frozen and kept at -80°C. The frozen neural tissue was sectioned coronally into 50-µm slices and floated in a 0.01 M PBS/15% normal donkey serum solution for 24 h at 4°C. Tissue was washed with 0.1% Tween 20/0.01 M PBS for 5 min followed by three 5-min washes of 0.01 M PBS. Tissue was then floated in a 0.1% solution of Immunostar TH primary antibody and 0.01 M PBS for 48 h at 4°C, after which it was washed with Tween 20 and PBS as described above. Following washes, tissue was floated in a 0.1% Alexa Fluor 647 donkey anti-mouse IgG secondary antibody and 0.01 M PBS at 4°C for 12 h, after which the tissue was washed once more with Tween 20 and PBS. Once washed, tissue was mounted on slides with Vector Vectashield hardset mounting media with DAPI and imaged to identify expression of ChR2, TH, and DAPI in both the NAcc and VTA (Fig. 11A,B).
On completion of voltammetry experiments, animals were sacrificed using CO2. With a micromanipulator, a stainless-steel electrode was lowered to the depth of the recording site and a current was applied to electrically lesion the region. Following lesioning, the brains were extracted, frozen, and stored at -80°C. Tissue was sliced at 50 µm and dry mounted on slides. The mounted tissue was submerged in 95% ethanol/deionized (DI) water for 15 min followed by 1-min submersions in 70% ethanol, 50% ethanol, and two washes in DI water. Tissue was then soaked in Cresyl Violet for 1 min, after which it was bathed in two 1-min washes in DI water followed by 15-s washes in 50%, 70%, 95%, and 100% ethanol. This was followed by a ≥8-min wash in Histo-Clear. Once complete, slides were mounted with Paramount and imaged to confirm lesion placement (Fig. 11E).

Statistics
All statistics were performed using SigmaPlot 11. First, the Shapiro-Wilk test was used to assess normality and the Brown-Forsythe test was used to assess equal variance. If these tests passed, ANOVAs were used; if these tests failed, equivalent non-parametric statistics were used.

Relationship between alpha and price elasticity of demand
Demand curves are a common tool used by economists to measure price sensitivity. Demand curves depict the relationship between consumption of a commodity (in this case the avoidance of harm) and price. Demand curves usually show a negative gradient (i.e., the law of demand), where consumption decreases with increasing price. The rate at which the negative slope decays can be used to make inferences regarding the value individuals place on the commodity being consumed. When demand curves decay at a faster rate they are said to be more elastic. Price elasticity of demand is defined as the change in the quantity of the commodity being consumed in response to an increase in price. In the present study, the number of successful avoidance responses at each price represents the quantity of the demanded commodity. We measure the elasticity of demand by computing the variable α, which represents the cost at which the elasticity of demand is exactly -1, meaning consumption drops by 1% in response to a 1% increase in price. A higher α-value indicates the demand for the good is more elastic, suggesting the value of the commodity is diminished. In contrast, a lower α-value indicates the demand for the good is relatively inelastic, suggesting the value of the commodity is enhanced.
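The link between the decay rate and elasticity can be made concrete with a short worked derivation. It assumes, purely for illustration, the simplest exponentially decaying demand function, with Q the number of avoidances, P the unit-price, and Q_0 the demand at zero price; the model fit in the study may take a different form, so the expressions below should be read as a sketch rather than as the study's definition of α.

```latex
% Assumed illustrative demand function
Q(P) = Q_0 \, e^{-\alpha P}

% Point elasticity: percentage change in Q per percentage change in P
\varepsilon(P) = \frac{d \ln Q}{d \ln P}
               = \frac{P}{Q}\,\frac{dQ}{dP}
               = \frac{P}{Q_0 e^{-\alpha P}} \left( -\alpha Q_0 e^{-\alpha P} \right)
               = -\alpha P
```

Under this assumed form, demand at any given price is more elastic when α is larger, and unit elasticity (ε = -1) is reached at P = 1/α, so a larger α means the elastic range begins at a lower price.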

A behavioral economics task was used to investigate the role of dopamine in the valuation of avoidance
To assess the role of mesocorticolimbic dopamine release events in the valuation of avoidance, we used a behavioral economics-based shock avoidance task. Our approach both builds on previous within session designs that characterized the demand for sucrose and cocaine (Oleson and Roberts, 2009;Schelp et al., 2017) and expands on a recent between sessions approach that was the first to apply behavioral economics to negative reinforcement (Fragale et al., 2017). As a within session design is more conducive to neural monitoring, we began by exploring the effects of increasing the unit-price (response requirement/mA shock avoided) of avoidance within individual sessions. We first sought to increase unit-price by manipulating the mA shock avoided (1.0-0.13 mA) across fixed epochs. To accomplish this, a range of six unit-prices was presented in descending order (Fig. 1A) and avoidance was analyzed as a function of unit-price. Avoidance failed to consistently scale with price when price was manipulated by changing shock amplitude across within session epochs. Next, we sought to manipulate unit-price by increasing the response requirement to avoid across within session epochs. As would be predicted by the law of demand, this task engendered a stable pattern of behavior hallmarked by a negative price elasticity, meaning that the demand to avoid decreased with increasing price (Fig. 1B). However, we also observed an initial attenuation of behavior at session onset, which is distinct from previous reports investigating the demand to obtain sucrose (Schelp et al., 2017) or cocaine (Oleson and Roberts, 2009). To address order effects, we attempted to randomize the order in which the response requirements were presented (Fig. 1C, upper). As was previously reported for sucrose demand (Schelp et al., 2017), animals showed aberrant response patterns with randomization of unit-price (lower).
Thus, we proceeded with a task in which the unit-price was manipulated by increasing response requirement and presented prices in a consistent ascending order. Finally, to ensure that all animals reached a price at which they failed to sustain demand, the total number of unit-prices was increased from 6 to 16.
Within the selected ascending response requirement task, a compound cue signaled the opportunity to respond on a lever to avoid an unchanging electrical footshock (0.5 s, 0.5 mA). Failure to meet the response requirement within a set timeframe resulted in recurrent footshock until the full response requirement was met. This outcome is defined as escape. Successfully meeting the response requirement within a set timeframe resulted in a safety period signaled by a tone without the occurrence of footshock. This outcome is defined as avoidance. Following either avoidance or escape, a tone accompanied by house light illumination signaled a 30-s safety period (Fig. 2A). The end of each safety period coincided with the onset of the next trial, starting with presentation of the avoidance predictive cue. On acquisition of avoidance under an FR1 reinforcement schedule, animals were introduced to unit-price manipulation through an increase in response requirement across discrete, time-based epochs in each daily session. Rats were initially trained to perform multiple lever press responding across six unit-prices (Fig. 2B). Unit-price epoch duration was modulated to allow for a maximum of 20 avoidance opportunities within each unit-price (column 4), and additional time was provided to rats to meet the increasing response requirement (column 3). On acquisition of multiple lever press responding, rats were placed in an economic-based task with 16 ascending unit-prices (Fig. 2C). Once animals were successfully trained to escape, then avoid at ≥50% avoidance on an FR1 schedule, animals took an average of 13.9 ± 3.1 sessions to acquire multiple lever press responding and 19.5 ± 2.9 sessions within the economic task to establish stable baseline behavior (Fig. 2D). On establishment of baseline behavior, animals entered optogenetic and FSCV experimentation.
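The distinction between avoidance and escape outcomes described above can be sketched in a few lines. This is a minimal illustration only: the function names, the press times, and the 2-s window are hypothetical, since the actual time limits are given in Fig. 2B,C and not in the text.

```python
def classify_trial(press_times_s, response_requirement, window_s):
    """Label a trial 'avoidance' if the response requirement is met
    within the time window after cue onset, otherwise 'escape'.

    press_times_s: lever-press times in seconds relative to cue onset.
    window_s: hypothetical time limit for meeting the requirement.
    """
    presses_in_window = [t for t in press_times_s if 0.0 <= t <= window_s]
    if len(presses_in_window) >= response_requirement:
        return "avoidance"
    return "escape"


def avoidance_fraction(outcomes):
    """Fraction of trials in a list of outcomes that were avoidances."""
    return sum(1 for o in outcomes if o == "avoidance") / len(outcomes)


# Hypothetical example: two presses required within a 2-s window.
outcomes = [
    classify_trial([0.4, 1.1], response_requirement=2, window_s=2.0),
    classify_trial([0.9, 3.2], response_requirement=2, window_s=2.0),
]
print(outcomes, avoidance_fraction(outcomes))
```

A fraction of at least 0.5 on an FR1 schedule would correspond to the training criterion mentioned above.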
While data were primarily analyzed in terms of successful avoidance outcomes at each unit-price, note that the additional delay required to meet increasing response requirements adds a conceptually important opportunity cost (cf. Fig. 2B,C, columns 2 and 3). Thus, total cost to the animal results from both effort and opportunity costs.
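As a worked example of the unit-price definition (response requirement/mA shock avoided), the sketch below divides a response requirement by the fixed 0.5-mA shock amplitude. The response requirements listed are an assumption, chosen only because they reproduce the 2, 4, 6, and 10 resp/mA prices referred to in the text.

```python
SHOCK_MA = 0.5  # fixed footshock amplitude stated in the text


def unit_price(response_requirement, shock_ma=SHOCK_MA):
    """Unit-price in responses per mA of shock avoided."""
    return response_requirement / shock_ma


# Assumed response requirements (illustrative, not taken from Fig. 2).
requirements = [1, 2, 3, 5]
prices = [unit_price(r) for r in requirements]
print(prices)  # [2.0, 4.0, 6.0, 10.0]
```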

Unique avoidance demand curves were generated to measure price sensitivity
This task allowed us to generate within-session demand curves by plotting total avoidance responses per epoch against unit-price. Unique demand curves emerged over the course of acquisition of the economic task, with an attenuation of avoidance at 2 resp/mA becoming apparent in experienced animals (Fig. 3A). The ratio of total avoidance at 4 versus 2 resp/mA significantly increased over the course of training (one-way RM ANOVA: F(2,44) = 9.87, p < 0.001; Tukey post hoc: start of training vs end of training p < 0.001; Fig. 3B). Similar economic-based food-seeking tasks do not demonstrate this attenuation at session onset (Schelp et al., 2017; Fig. 3C), suggesting that additional inhibitory neural circuitry might be recruited during economic assessments of avoidance. The ratio of avoidance at 4 resp/mA versus attenuated avoidance at 2 resp/mA develops and stabilizes over the course of training in both a transgenic TH-Cre and a WT group (Fig. 3D). A measure of demand, α, similarly stabilizes over the course of training in both the TH-Cre and WT groups (Fig. 3E), suggesting that this attenuation might be a learned suppression that develops across repeated sessions, each of which terminates when the animal receives 15-20 consecutive footshocks.

Dopamine release events correspond to value associative cues in avoidance
To evaluate the role of NAcc dopamine during the valuation of avoidance, we employed FSCV to measure transient changes in dopamine concentration time locked (±0.5 s) to associative cues in the behavioral economics-based task. We exclusively analyzed dopamine concentration within the second through fourth unit-price because 100% of animals maintained responding in this range and a concurrent attenuation in behavior and dopamine concentration was observed at the first price point. We independently address these attenuated responses later in the manuscript. The nA changes in current were converted to nM concentration by using PCR and lab-specific computational factors (see Materials and Methods). Dopamine concentration files were arranged around one of two events: (1) avoidance predictive cue and (2) safety associated cue. Dopamine at the avoidance predictive cue was quantified and analyzed as a group mean because the mean data were representative of individual trials (cf. Fig. 4A, inset, vs B,D). Dopamine during the safety period was quantified and analyzed at the level of the individual transient because mean data were not representative of individual trials (cf. Fig. 5C vs 5A,B).

Dopamine at the avoidance predictive cue
In successful avoidance trials we found that dopamine concentration at the avoidance predictive cue scaled inversely to unit-price [one-way repeated measures (RM) ANOVA: F(2,20) = 29.507, p < 0.001; Tukey post hoc: unit-price 4 vs 6 p < 0.001, 4 vs 10 p < 0.001; Fig. 4A]. A representative avoidance trial (Fig. 4A, inset) demonstrates individual trial similarity to mean dopamine concentration plots for all animals. Average color plots demonstrate these avoidance cue trends at each unit-price (Fig. 4B, bottom) with corresponding concentration traces (Fig. 4B, top). Contrary to our predictions (Oleson et al., 2012), in escape trials we found that dopamine concentration also showed an inverse relationship to unit-price (one-way RM ANOVA: F(2,20) = 3.916, p = 0.049; Tukey post hoc: unit-price 4 vs 10 p = 0.040; Fig. 4C). Within escape trials, dopamine concentration did not, however, demonstrate sensitivity to the number of shocks subsequently received (one-way RM ANOVA: F(3,23) = 2.14; p = 0.14; Fig. 4C, inset). As would be predicted (Oleson et al., 2012), an increase in dopamine concentration was observed before cue presentation. Given that the safety period duration preceding cue presentation is consistent (30 s), it is possible that this preemptive increase in dopamine reflects anticipation of the cue. As such, it should be noted that the dopamine concentration at the warning signal likely results from both anticipation of the warning signal and its actual presentation. However, varying the duration of the safety period is known to alter avoidance (Sidman, 1953) and theoretically would alter the valuation of avoidance; thus, it is necessary to use a fixed, 30-s safety period for the behavioral economics task. Together, these data demonstrate that, as in reward seeking, dopamine concentration at an avoidance predictive cue generally scales with price.

Figure 3. Attenuation of avoidance at 2 resp/mA develops during acquisition of the economic shock avoidance task. A, As animals acquired this economic shock avoidance task (compare Fig. 1C), an attenuation of avoidance at the lowest unit-price (2 resp/mA) developed relative to responding at the second unit-price (4 resp/mA). B, The ratio between avoidance at 4 resp/mA and avoidance at 2 resp/mA significantly increased as animals established baseline rates of performance. C, Demand curves depicting economic food seeking do not demonstrate attenuation of consumption at session onset. D, The ratio of avoidance at 4:2 resp/mA increases and stabilizes over the course of training in both the TH-Cre and WT group. E, α, a measure of avoidance demand, similarly stabilizes over the course of training. Error bars are mean ± SEM.

Dopamine at the safety-associated cue
We further analyzed the amplitude of dopamine transient events during the 30-s safety period following both avoidance and escape outcomes. Unlike dopamine concentration transients time locked to cue presentation, averaged safety period color plots (bottom) and corresponding concentration traces (top) following avoidance (Fig. 5A) and escape (Fig. 5B) are not representative of individual safety period trials (Fig. 5C). Therefore, to analyze safety period transients, dopamine concentration traces from individual trials were fit with a baseline polynomial line (Fig. 5D, red) and a SD was added to the baseline fit to generate a cutoff (Fig. 5D, blue). The baseline was set to zero and concentrations above the first fit line were analyzed as transients (Fig. 5E). Analysis of individual safety period trials revealed an inverse relationship between average transient amplitude and unit-price following successful avoidance (one-way RM ANOVA: F(3,27) = 3.50, p = 0.037; Tukey post hoc: unit-price 2 vs 10 p = 0.029; Fig. 6A). This trend is demonstrated with representative color plots (Fig. 6B, bottom) and corresponding concentration traces (Fig. 6B, top). Conversely, concentration directly following escape demonstrated no relationship to unit-price (one-way RM ANOVA: F(3,27) = 0.902; p = 0.460; Fig. 6C,D). These data suggest that dopamine only represents the valuation of avoidance during the safety period following successful avoidance.
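The transient-detection steps described above (baseline fit, SD-based cutoff, zeroed baseline) might be sketched as follows. Several details are assumptions made for illustration: the baseline is fit as a straight line because the polynomial order is not stated, the SD is taken over the baseline-subtracted trace, samples above the cutoff are grouped into events, and the trace itself is synthetic.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def detect_transients(times, conc, n_sd=1.0):
    """Return the peak amplitude of each supra-cutoff event.

    Assumed procedure: subtract the fitted baseline (so baseline = 0),
    set the cutoff at n_sd standard deviations above it, and treat each
    run of consecutive samples above the cutoff as one transient.
    """
    slope, intercept = fit_line(times, conc)
    corrected = [c - (slope * t + intercept) for t, c in zip(times, conc)]
    cutoff = n_sd * stdev(corrected)
    peaks, current = [], []
    for value in corrected:
        if value > cutoff:
            current.append(value)
        elif current:
            peaks.append(max(current))
            current = []
    if current:
        peaks.append(max(current))
    return peaks


# Synthetic trace (arbitrary units): flat signal with one brief event.
times = list(range(10))
trace = [0, 0, 0, 0, 5, 0, 0, 0, 0, 0]
print(detect_transients(times, trace))
```

Averaging the returned peak amplitudes per unit-price would give a quantity comparable to the average transient amplitude analyzed above.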

Modeling the elasticity of demand in avoidance
To measure changes in price sensitivity, total avoidance versus unit-price was fit with an exponentially decaying model, with the variable α describing the rate of decay, Q_max depicting avoidance at zero price, Q_min depicting minimum avoidance, and C representing unit-price. Notably, α is inversely related to avoidance performance, with lower α values signifying higher avoidance while larger α values reflect lower avoidance (Fig. 7A). As previously noted (Fig. 3A), an initial attenuation in demand occurs at 2 resp/mA. Similarly to the loading phase of cocaine self-administration (Oleson et al., 2011; Bentzley et al., 2014), this initial attenuation may be influenced by additional variables other than price. Thus, we investigated whether removing the first data point resulted in better-fitted demand profiles. To account for differences in the degrees of freedom between fitting 16 and 15 unit-prices, the reduced χ² (χ²_ν) was calculated with both the inclusion and exclusion of the lowest unit-price. The exclusion of the lowest unit-price yielded significantly lower χ²_ν (included: 0.178 ± 0.0059; excluded: 0.104 ± 0.0053; Man- χ²_ν, we opted to analyze demand profiles generated in the avoidance task after removing the first data point, as is done when analyzing demand profiles of cocaine self-administration (Oleson et al., 2011). This approach, combined with optogenetics, allowed us to investigate how dopamine neurons causally modify economic demand in avoidance.

Figure 6. Dopamine during the 30-s safety period following avoidance, but not escape, scales as a function of unit-price. A, The average concentration of accumbal dopamine transients following successful avoidance decreases with increasing unit-price, while the concentration of dopamine following escape did not change as a function of unit-price (C). Representative color plots (bottom) and dopamine concentration traces (top) depict these trends at the first four unit-prices following avoidance (B) and escape (D). Error bars are mean ± SEM.

Figure 7. Modeling demand for avoidance. A, An exponentially decaying model fit successful avoidances out of 20 potential cue/response pairings as a function of unit-price. Here, the variable α is inversely related to the rate of decay or elasticity of avoidance demand, allowing for a direct assessment of motivation to avoid. B, The attenuation of avoidance at session onset (compare Fig. 3A) necessitated the exclusion of the lowest unit-price from economic modeling. The reduced χ² (χ²_ν) generated from the exclusion versus inclusion of the lowest unit-price rationalized this exclusion. C, The fits garnered from this equation revealed relatively high R² values (0.92 ± 0.005). Error bars are mean ± SEM.
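To illustrate how a decay rate α and a reduced χ² could be compared with and without the lowest unit-price, the sketch below fits a simple exponential by grid search. The fitting equation is not reproduced in the text, so the functional form Q(C) = Q_min + (Q_max − Q_min)·exp(−αC), the fixed Q_max = 20 and Q_min = 0, the unit measurement error, and the synthetic data are all assumptions for illustration and not the authors' model or data.

```python
import math


def demand(c, alpha, q_max=20.0, q_min=0.0):
    """Assumed exponential demand: avoidance as a function of unit-price c."""
    return q_min + (q_max - q_min) * math.exp(-alpha * c)


def sse(prices, avoided, alpha):
    """Sum of squared residuals between observed and modeled avoidance."""
    return sum((q - demand(c, alpha)) ** 2 for c, q in zip(prices, avoided))


def fit_alpha(prices, avoided, lo=1e-4, hi=1.0, steps=10000):
    """Grid search for the alpha that minimizes the squared error."""
    grid = [lo + i * (hi - lo) / steps for i in range(steps + 1)]
    return min(grid, key=lambda a: sse(prices, avoided, a))


def reduced_chi_square(prices, avoided, alpha, sigma=1.0, n_params=1):
    """Chi-square per degree of freedom, assuming equal error sigma."""
    dof = len(prices) - n_params
    return sse(prices, avoided, alpha) / (sigma ** 2 * dof)


# Synthetic session: demand follows alpha = 0.05 except for an
# attenuated first point at 2 resp/mA (hypothetical values).
prices = [2, 4, 6, 10, 16, 24, 40]
avoided = [10.0] + [demand(c, 0.05) for c in prices[1:]]

alpha_incl = fit_alpha(prices, avoided)
alpha_excl = fit_alpha(prices[1:], avoided[1:])
chi_incl = reduced_chi_square(prices, avoided, alpha_incl)
chi_excl = reduced_chi_square(prices[1:], avoided[1:], alpha_excl)
print(alpha_incl, chi_incl)
print(alpha_excl, chi_excl)
```

In this toy case the fit excluding the attenuated first point has the lower reduced χ², which mirrors the reasoning used above to justify dropping the lowest unit-price.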

Dopamine release events causally modify demand for avoidance
Optogenetics can be used to assess the causal relationship between patterns of neural activity and behavior (Steinberg et al., 2013). Thus, we next sought to investigate how optically increasing dopamine release through selective optogenetic activation (10 pulses, 20 Hz, 0.5-s duration) of ChR2-expressing dopamine neurons of the VTA alters demand for avoidance. While future studies are necessary to parse the role of dopamine release at its various terminal projections, we opted to begin by augmenting mesocorticolimbic dopamine release at its origin. Initially, animals from both experimental TH-Cre and WT control groups were trained to respond on a lever to terminate and escape footshock. On acquisition of escape behavior, animals were moved into an FR1 avoidance task until >50% avoidance was reached and maintained. Subsequent to the establishment of >50% avoidance behavior, animals were trained to respond multiple times on the lever to avoid footshock across six unit-prices. On acquisition of multiple lever presses, animals were trained on the economic-based shock avoidance task until a baseline rate of avoidance was established across three sessions.
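The stimulation parameters given above (10 pulses, 20 Hz, 0.5-s duration) are mutually consistent, as the short arithmetic sketch below shows. The function names are hypothetical and the pulse width is omitted because it is not stated in this section.

```python
def pulse_onsets(n_pulses=10, frequency_hz=20.0):
    """Onset time of each light pulse in seconds from train start."""
    period_s = 1.0 / frequency_hz
    return [i * period_s for i in range(n_pulses)]


def train_duration(n_pulses=10, frequency_hz=20.0):
    """Train duration counted as n_pulses full periods."""
    return n_pulses / frequency_hz


onsets = pulse_onsets()
print(len(onsets), onsets[1], train_duration())
```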
After establishing stable baseline behavior, we provided unilateral optogenetic stimulation of the VTA cell bodies of the right hemisphere at either the presentation of the avoidance predictive cue or on successful avoidance. All animals were tested under both stimulation conditions. We investigated the effects of optical stimulation at the first event for three sessions, followed by reestablishment of baseline behavior and then stimulation under the complementary event. The order of stimulation (avoidance predictive cue vs successful avoidance) was counterbalanced within both the TH-Cre group and the WT group (Fig. 8A). The number of successful avoidances out of 20 avoidance opportunities within each unit-price was fit with an exponentially decaying model (Fig. 7A), and optogenetic-induced changes in demand were assessed by comparing α values to baseline values. Representative cumulative response records, response-price curves and corresponding demand curves from a single TH-Cre animal depict the resulting patterns of behavior across all stimulation conditions (Fig. 8B-E).
Analysis of the TH-Cre group revealed that stimulation at the avoidance predictive cue significantly increased α (decreased demand to avoid) while stimulation at successful avoidance significantly decreased α (increased demand to avoid). Cue associated baseline and avoidance associated baseline were not significantly different (Friedman RM ANOVA on ranks: χ²(3) = 17.88, p < 0.001;

Figure 8. Representative behavior from a single TH-Cre animal. A, Cue and optogenetic augmentation of VTA dopamine cell bodies was counterbalanced across both the TH-Cre and WT groups. Following the establishment of baseline across three behavioral sessions, both groups initially received optical stimulation at either the presentation of the cue (left) or on successful avoidance (right) for three sessions. On completion of the first stimulation paradigm, animals reestablished behavior over three sessions followed by stimulation at the complementary paradigm for three sessions. B, C, Successful avoidances out of 20 potential avoidance opportunities from a representative TH-Cre animal were fit with the exponentially decaying model during cue, avoidance and baseline conditions. D, Avoidance lever responses as a function of unit-price. E, Cumulative avoidance record.

Dopamine is concurrently attenuated with avoidance at session onset
We additionally sought to analyze the attenuation of avoidance at 2 resp/mA (Fig. 10A). Dopamine time locked to cues (±0.5 s) preceding avoidance at 2 resp/mA is significantly lower than cue-evoked dopamine at 4 resp/mA (Fig. 10B). Averaged color plots (bottom) and dopamine concentration traces (top) depict these trends. Conversely, dopamine time locked (±0.5 s) to cues preceding escape at 2 resp/mA was greater than at 4 resp/mA (Fig. 10C). Averaged color plots (bottom) and dopamine concentration traces (top) demonstrate this trend. Optical stimulation of dopamine neurons similarly influences the attenuation in demand observed at session onset. In the TH-Cre group, optical stimulation at successful avoidance, but not at the avoidance predictive cue, reduced attenuation relative to baseline performance (one-way RM ANOVA: F(3,39) = 3.76, p = 0.022; Tukey post hoc: baseline-1 vs cue p = 0.28, baseline-2 vs avoid p = 0.042). Conversely, the WT group demonstrated no change in attenuated avoidance in response to optical stimulation (one-way RM ANOVA: F(3,39) = 1.00, p = 0.41; Fig. 10D). Our electrochemistry data demonstrate that dopamine at the avoidance predictive cue is concurrently attenuated with behavior at session onset in this task. We believe this attenuation at session onset is a learned suppression that develops following repeated training sessions in which sessions terminate with repeated electrical footshock. Our optogenetics data reveal that dopamine release is sufficient to rectify attenuation of avoidance demand at session onset. Together, these observations support the view that heightened dopamine release can overcome the

Additional experimental controls and considerations
On completion of optogenetic experiments, neural tissue was harvested and expression of ChR2 in the VTA and NAcc was confirmed in all TH-Cre animals (Fig. 11A,B). Optical ferrule cannulae, which were assessed before and after surgical implantation, demonstrated no significant decrease in retention (paired two-tailed t test: t(7) = 1.59, p = 0.15; Fig. 11C). To address potential locomotor confounds produced by optical stimulation, recurrent optogenetic stimulation of VTA dopamine neurons was conducted within the TH-Cre group during an open field test. This augmentation failed to increase horizontal activity relative to baseline (paired two-tailed t test: t(7) = −0.363, p = 0.727), suggesting changes in overall activity played a minimal role in our demand profiles. On completion of the voltammetric experiments, animals were sacrificed and electric lesions at the depth of the recording site confirmed electrode placement within the NAcc (Fig. 11E).

Discussion
We investigated the role of mesocorticolimbic dopamine release events in the valuation of avoidance using a combination of behavioral economics, modeling, electrochemistry, and optogenetics. After an initial attenuation of dopamine release and behavior at session onset, accumbal dopamine concentration scaled inversely with price during the active avoidance of signaled footshock. Dopamine at the warning signal decreased with price irrespective of the outcome (avoidance vs escape); however, dopamine only decreased with price at the safety associated cue following successful avoidance. We interpret these distinct outcome-specific responses to suggest that dopamine concentration before the behavioral action was predictive of value, whereas dopamine concentration following action was reflective of the outcome's value.
Figure 10. Dopamine scales with avoidance attenuation at session onset and optical augmentation of dopamine attenuation at 2 resp/mA. A, Avoidance at the 2 resp/mA is attenuated relative to avoidance at 4 resp/mA. B, Dopamine concentrations at 2 resp/mA avoidance cues are significantly less than dopamine at 4 resp/mA avoidance cues. Averaged color plots (bottom) with corresponding dopamine concentration traces (top) depict these trends. C, Dopamine concentrations at 2 resp/mA escape cues are significantly less than dopamine at 4 resp/mA escape cues. Averaged color plots (bottom) with corresponding dopamine concentration traces (top) depict these trends. D, Avoidance stimulation, but not cue stimulation, altered avoidance at 2 resp/mA within the TH-Cre group (left), but not within the WT group (right). Error bars are mean ± SEM.

We next sought to assess the causal role of dopamine in the valuation of avoidance within an economic framework. A central principle of economics is that as the price of a commodity increases, its demand decreases. This principle is commonly demonstrated using demand curves, in which consumption of a commodity is plotted against price. According to the law of demand, consumption typically decreases with increasing price. The resulting negative gradient represents the elasticity of demand, or how sensitive consumption of the commodity is to increasing price. In the present study, consumption is defined as the number of successful avoidance responses at each price (responses/mA). If avoidance became more sensitive to price, the resulting demand curve would decay at a faster rate; if avoidance became less sensitive to price, the resulting demand curve would decay at a slower rate. We can then make inferences regarding the value of avoidance by measuring the rate of decay. We compute the rate at which demand curves decay by solving for the variable α, which represents the cost at which the elasticity of demand is exactly −1. At this point, consumption drops by 1% in response to a 1% increase in price. A higher α-value indicates the demand for the good is more elastic, suggesting the value of the commodity is diminished. In contrast, a lower α-value indicates the demand for the good is relatively inelastic, suggesting the value of the commodity is enhanced. In accordance with our previous work assessing the role of dopamine in the valuation of a sugar reward (Schelp et al., 2017), we predicted that increasing dopamine release at an avoidance predictive cue would decrease α, whereas increasing release at successful avoidance would increase α.
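As a hedged illustration of what an elasticity near −1 means, the sketch below computes the midpoint (arc) elasticity between two points on a demand curve. The midpoint formula is a standard economics convention and not necessarily the calculation used in this study, and all prices and avoidance counts are hypothetical.

```python
def arc_elasticity(price_1, quantity_1, price_2, quantity_2):
    """Midpoint elasticity: percent change in consumption divided by
    percent change in price between two points on a demand curve."""
    pct_quantity = (quantity_2 - quantity_1) / ((quantity_1 + quantity_2) / 2)
    pct_price = (price_2 - price_1) / ((price_1 + price_2) / 2)
    return pct_quantity / pct_price


# A 1% price increase met by a 1% drop in consumption: elasticity near -1.
unit_elastic = arc_elasticity(100, 100, 101, 99)

# Hypothetical avoidance counts as price rises from 4 to 6 resp/mA.
inelastic = arc_elasticity(4, 18, 6, 17)  # avoidance barely changes
elastic = arc_elasticity(4, 18, 6, 9)     # avoidance collapses

print(unit_elastic, inelastic, elastic)
```

Values between 0 and −1 correspond to relatively inelastic demand (low price sensitivity), and values below −1 correspond to elastic demand (high price sensitivity).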
Optically stimulating dopamine cell bodies in the VTA at an avoidance predictive cue rendered animals more sensitive to price (i.e., α increased), consistent with a negative reward prediction error. We infer that heightened release at cue presentation signaled a beneficial outcome, a prediction that was then violated by the occurrence of footshock outcomes of the same amplitude. Optically increasing release at successful avoidance made animals less sensitive to price (i.e., α decreased), consistent with a positive reward prediction error. We infer that heightened release at successful avoidance signaled the outcome was better than predicted, indicating a good value worth seeking. Our data build on the notion that transient dopamine release events can represent subjective value (Schelp et al., 2017; Schultz et al., 2017) and further clarify that these value signals not only represent the value of pursuing reward, but also the value of avoiding harm.
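The prediction error interpretation above can be made concrete with a toy calculation in which the error is the received value minus the predicted value. This is only a schematic of the reasoning, not a model fit to the data: the values are arbitrary, and treating optical stimulation as an additive boost to either the prediction or the outcome is an assumption made for illustration.

```python
def prediction_error(received_value, predicted_value):
    """Prediction error: positive when the outcome beats the prediction."""
    return received_value - predicted_value


OUTCOME_VALUE = 1.0  # arbitrary value of the avoidance outcome
STIM_BOOST = 0.5     # arbitrary value added by optical stimulation

# No stimulation: the outcome matches the prediction.
baseline_error = prediction_error(OUTCOME_VALUE, OUTCOME_VALUE)

# Stimulation at the cue inflates the prediction; the outcome is unchanged.
cue_stim_error = prediction_error(OUTCOME_VALUE, OUTCOME_VALUE + STIM_BOOST)

# Stimulation at avoidance inflates the outcome; the prediction is unchanged.
avoid_stim_error = prediction_error(OUTCOME_VALUE + STIM_BOOST, OUTCOME_VALUE)

print(baseline_error, cue_stim_error, avoid_stim_error)
```

Under these assumptions the cue condition yields a negative error and the avoidance condition a positive one, matching the direction of the α changes reported above.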
The mesocorticolimbic pathway originates from dopamine neurons in the VTA and projects to motivational circuitry throughout the brain, most prominently the NAcc of the basal ganglia. The NAcc is thought to integrate transient dopamine signals with an array of converging neural input to generate goal directed actions via the basal ganglia (Haber, 2014; Jin et al., 2014; Floresco, 2015; Graybiel and Grafton, 2015). Both the pursuit of reward (positive reinforcement) and the active avoidance of harm (negative reinforcement) require an increase in action to ultimately promote behavioral fitness and survival. While it is likely that individual dopamine neurons heterogeneously represent rewarding and aversive stimuli (Lammel et al., 2014; Pignatelli and Bonci, 2015), our data suggest that the summation of dopamine neural output is increased in the NAcc during active avoidance. Overall, we believe that these transient release events act on the basal ganglia to strengthen action sequences directed toward optimal outcomes. However, it is important to note that optical stimulation of the VTA dopamine neuron cell bodies leads to increases in concentration in various neural substrates implicated in avoidance (Darvas et al., 2011; Pignatelli and Bonci, 2015). Future studies are needed to investigate the specific role that dopamine plays in the valuation of negative reinforcement in various regions including the frontal cortex, amygdala, and striatum.

Figure 11. Histology, locomotor assessment, and optical ferrule retention. At the end of optogenetic experimentation, rats were deeply anesthetized with a 50:50 ketamine:xylazine solution and transcardially perfused. Neural tissue was harvested, sliced coronally into 50-µm slices and stained for TH and DAPI. Expression of ChR2 in VTA (A) and NAcc (B) was verified in all TH-Cre animals. C, Light retention of the optical ferrule cannula was assessed both before and on completion of the experiment. A t test revealed no significant difference in ferrule retention. D, To assess for potential locomotor confounds associated with optogenetic stimulation of VTA dopamine neuron cell bodies, rats were first placed in open field chambers and allowed to acclimate for 1 h before moving into either a 1-h stimulation paradigm wherein they received optical stimulation every 30 s, mirroring the max frequency of stimulation animals may receive in the economic task. Baseline and stimulation conditions were counterbalanced across eight animals and did not significantly alter locomotion. E, On completion of the voltammetric experiments, rats were euthanized with CO2, and using micromanipulators identical to those used for the voltammetry recordings, stainless steel electrodes were lowered to the same depth as the working electrode during the voltammetry recording. A current to both the stainless-steel electrode and the reference electrode was applied for ~40 s to lesion the neural tissue at the recording site. Tissue was coronally sectioned into 50-µm slices and placement of the electrodes within the NAcc was assessed. Lesion locations are indicated in red.
It is important to note that there are distinct forms of avoidance and dopamine may play unique roles in each. For example, if transient dopamine release events are significant for action generation, then dopamine release should be suppressed in passive avoidance, a situation where animals must inhibit action to avoid a negative outcome (Ögren and Stiedl, 2010). Unsignaled "Sidman" avoidance is another interesting consideration. In Sidman avoidance animals show active avoidance despite the absence of an exteroceptive warning signal (Sidman, 1953). It is possible that during unsignaled avoidance the lever itself, in addition to proprioceptive associations that develop during the lever response, function as conditioned stimuli. Another possibility is that dopamine may guide unsignaled avoidance through representation of interoceptive timing cues (Oleson et al., 2014;van Rijn et al., 2014). Indeed, as previously noted (Fig. 4B,D), we observed an increase in dopamine concentration preceding the cue, implying an anticipation of cue presentation as animals perform within this task. While additional studies are required to fully understand how dopamine is related to the various forms of avoidance, it is important to note that distinct patterns of dopamine release likely distinguish them.
In contrast to active avoidance, transient dopamine release is suppressed by unavoidable aversive events and their associative stimuli. Using FSCV, the Roitman group has demonstrated that a variety of aversive stimuli transiently decrease accumbal dopamine concentration (Roitman et al., 2008; Ebner et al., 2010; McCutcheon et al., 2012; Fortin et al., 2016). Similarly, stimuli associated with inescapable electrical footshock suppress dopamine concentration within the NAcc (Badrinarayan et al., 2012; Oleson et al., 2012). While these dopamine release events likely represent the summation of VTA dopamine neural activity, there is growing evidence to support heterogeneous representation of rewarding and aversive stimuli at the level of the single unit (Lammel et al., 2012; Pignatelli and Bonci, 2015). While the majority of electrophysiology studies report that dopamine neurons are inhibited by aversive stimuli (Mileykovskiy and Morales, 2011; Tan et al., 2012), numerous reports of excitations exist as well (Brischoux et al., 2009; Matsumoto and Hikosaka, 2009). While some of these reports were likely confounded by the inclusion of non-dopamine neurons (Ungless and Grace, 2012), it is likely that many factors influence whether individual neurons are excited or inhibited by an aversive event, including the subpopulation of neurons surrounding the recording site (Lammel et al., 2012) and the behavioral context in which aversion is introduced (Matsumoto et al., 2016).
Dopamine responses in active avoidance further depend on the behavioral history of the subject and the behavioral context in which avoidance is assessed. Notably, our observations here are discrepant from a previous FSCV characterization of dopamine in signaled active avoidance (Oleson et al., 2012), which observed distinct responses at the avoidance predictive cue during the escape outcome. In the present study, dopamine scaled with price at avoidance predictive cues, even in escape. Conversely, Oleson et al. (2012) reported that dopamine was inhibited at the avoidance predictive cues preceding escape. We believe that this discrepancy is due to a combination of behavioral history and behavioral context. While Oleson et al. (2012) observed a suppression in dopamine concentration and behavior when the avoidance predictive cue resulted in escape rather than avoidance, animals were simply trained to avoid footshock under an FR1 schedule in up to 50% of trials and sessions terminated after a fixed amount of time. In the current task, animals were extensively trained to avoid in several different paradigms (see Materials and Methods) before being tested in the much more strenuous behavioral economics task. Furthermore, in the current task, sessions only terminate after 15 shocks or 20 escapes occur consecutively. Thus, we believe that this initial attenuation of both behavior and dopamine release is induced by session onset, which comes to predict a negative session end. This powerful negative prediction develops and stabilizes over the course of economic avoidance training (Fig. 3D,E), supporting the notion that it is a learned association and distinct from changes in performance with optical stimulation. Once the animals had established stable baseline performance in the behavioral economics task, the negative prediction represented by session onset was sufficient to attenuate dopamine release and active avoidance. Optically augmenting dopamine release at avoidance rectified this behavioral attenuation, further supporting recent evidence that heightened dopamine release can overcome the effects of negative affect on behavior (Chaudhury et al., 2013; Tye et al., 2013).
Dopamine responses at the avoidance predictive cue depend on the behavioral outcome (avoidance vs escape). We observed distinct responses to avoidance associative cues depending on whether animals avoided or escaped footshock. During successful avoidance, cue-evoked dopamine concentration scaled with changing price, resulting from both an increase in effort and opportunity costs, and corresponded to the avoidance outcome. During escape outcomes, cue-evoked dopamine concentration scaled with price irrespective of behavior. We interpret the distinct responses at the avoidance predictive cue to suggest that dopamine is continually representing avoidance value but is being simultaneously attenuated with behavior at session onset. We surmise this concurrent attenuation at session onset results from afferent inhibitory input, possibly from the lateral habenula via the rostromedial tegmentum.
Dopamine responses during the safety period also depend on the behavioral outcome. We observed that dopamine concentration scaled with price during the signaled safety period when animals successfully avoided footshock, but not following escape. The distinct dopamine responses observed during the safety period may be relevant in the context of avoidance learning. Successfully avoiding an aversive event was the optimal outcome under our experimental conditions. Thus, dopamine might exclusively convey value during the safety period after optimal outcomes, thereby promoting their recurrence. While entering safety after escape is a beneficial outcome as well, dopamine may fail to represent the value of this less optimal outcome. Rather, we reason that dopamine generally represents the safety signal following escape as a positive relief from pain (Navratilova and Porreca, 2014).
Avoidance outcomes produce unique behavioral and neurochemical responses when compared to rewarding outcomes (Schelp et al., 2017), suggesting that avoidance is more than simply reward redux. While the avoidance of an aversive event is rewarding, the avoidance outcome still carries aversive qualities. When animals run toward a goal box containing a footshock-associated rewarding outcome they show an approach-avoidance conflict, wherein retreat from the goal box accompanies the pursuit of reward (Geist and Ettenberg, 1997). In addition, unique demand profiles and dopamine responses are observed during the valuation of avoidance and reward (Roitman et al., 2008). The aversive attributes of footshock avoidance likely recruit additional neural circuits that interact with the mesocorticolimbic pathway during negative reinforcement. Thus, while mesocorticolimbic dopamine release events similarly represent the value of both rewarding and avoidance outcomes, we surmise that the overall neural circuitry underlying positive and negative reinforcement is distinct.
In conclusion, we demonstrate that accumbal transient dopamine release events scale proportionally to the value of avoidance outcomes and capably modify the valuation of active avoidance. Our data refute the notion that mesocorticolimbic dopamine is exclusively involved in positive reinforcement (Fiorillo, 2013) and, rather, indicate that dopamine represents the value of all advantageous outcomes, including the avoidance of harm.