Phasic dopamine release induced by positive feedback predicts individual differences in reversal learning
Introduction
Throughout the day, we perform numerous behavioral actions to pursue things we desire or to prevent adverse events. In a constantly changing environment, this behavior has to be adaptive and flexible. One form of adaptive behavior is reversal learning, which requires the ability to use negative feedback to inhibit a learned response that was previously rewarded and at the same time to use positive feedback to switch to a response that was previously unrewarded. The ability to adapt goal-directed behavior to changes in the environment depends on frontal and striatal regions, as demonstrated in rodents (Castane et al., 2010, McAlonan and Brown, 2003), non-human primates (Clarke et al., 2008, Dias et al., 1996) and humans (Bellebaum, Koch, Schwarz, & Daum, 2008). Indeed, compromised frontostriatal integrity results in deficient adaptive behavior and cognitive dysfunctions in various neurological and psychiatric disorders, such as obsessive–compulsive disorder (OCD), drug addiction and Parkinson’s disease (Ceaser et al., 2008, Chamberlain et al., 2006, Cools et al., 2001, Millan et al., 2012, Verdejo-Garcia et al., 2006, Yerys et al., 2009).
Dopamine (DA) is an important neuromodulator in frontostriatal networks. Striatal DA release facilitates reward learning and mediates approach behavior towards rewards. Specifically, DA neurons increase firing in response to unexpected rewards and reward-predicting stimuli and decrease firing when an expected reward is omitted (Pan et al., 2005, Schultz et al., 1997). These and other findings suggest that DA neurons encode a ‘reward-prediction error’ that serves as a teaching signal to guide behavior (Montague et al., 1996, Schultz et al., 1997, Steinberg et al., 2013, Waelti et al., 2001). It is known that burst firing of DA neurons facilitates initial learning of response-reward associations (Zweifel et al., 2009). Consistently, selective optogenetic activation of DA neurons affects operant responding in a manner similar to positive feedback (Kim et al., 2012, Witten et al., 2011), and pharmacological and genetic manipulations demonstrate DA-mediated regulation of adaptive behavior (Clarke et al., 2011, Haluk and Floresco, 2009, Klanker et al., 2013, Laughlin et al., 2011). DA may play such a role not only during initial learning, but also in the adaptation of established operant responding. Indeed, optogenetic activation of DA neurons (simulating positive feedback) can also mediate reversal of reward-seeking behavior (Adamantidis et al., 2011). However, theories of reinforcement learning and models of DA-mediated prediction errors suggest that fluctuations in striatal DA concentration mediate the effects of both positive and negative feedback on learning (Frank and Claus, 2006, Hong and Hikosaka, 2011). It remains to be shown whether DA mediates the effects of negative feedback during behavioral adaptation following an unexpected switch in reward contingencies and whether striatal DA reflects the receipt of feedback on a trial-by-trial basis.
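The reward-prediction error idea described above can be illustrated with a minimal sketch. It assumes a simple Rescorla-Wagner style update, which is a textbook formalization and not the analysis used in this study. The function names, the learning rate `alpha` and the trial sequence are hypothetical choices made only for illustration.

```python
# Illustrative sketch only: a Rescorla-Wagner style value update.
# This is NOT the authors' method; names and parameters are hypothetical.

def prediction_error(reward, expected):
    """Difference between the received and the expected reward."""
    return reward - expected


def update_value(expected, reward, alpha=0.2):
    """Move the expectation a fraction alpha towards the received reward."""
    delta = prediction_error(reward, expected)
    return expected + alpha * delta, delta


def simulate(rewards, alpha=0.2, initial_value=0.0):
    """Run the update over a sequence of rewards; return final value and errors."""
    value = initial_value
    errors = []
    for reward in rewards:
        value, delta = update_value(value, reward, alpha)
        errors.append(delta)
    return value, errors


# Assumed trial sequence: 50 rewarded trials followed by one omitted reward.
final_value, errors = simulate([1.0] * 50 + [0.0])

print("first rewarded trial:", errors[0])    # large positive error (unexpected reward)
print("last rewarded trial:", errors[49])    # error near zero (reward fully expected)
print("omission trial:", errors[50])         # negative error (expected reward omitted)
```

In this toy model the error is positive for an unexpected reward, shrinks towards zero as the reward becomes predicted, and turns negative when an expected reward is omitted, mirroring the firing pattern described for DA neurons.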
Therefore, we used fast-scan cyclic voltammetry (FSCV) in behaving animals (Millar, Stamford, Kruk, & Wightman, 1985) to monitor rapid changes in DA release in the ventromedial striatum during an operant spatial reversal learning task, in which rats had to adapt goal-directed behavior following a reversal of response-reward contingencies. Our results show that phasic cue-evoked DA signals were promptly updated following positive, but not negative, feedback. Furthermore, we observed individual differences in the extent to which the receipt of positive feedback updated the reward-predicting DA signal on subsequent trials, and this DA signaling predicted successful reversal learning. Thus, our findings suggest that phasic DA in the ventromedial striatum provides a positive feedback signal to facilitate adaptation of previously learned behavior following a behavioral switch.
Animals
Male Wistar rats (Charles River) were housed in a controlled environment under a reversed day-night schedule (white lights: 7 p.m.–7 a.m.). Rats were food-restricted (16 g per animal per day), with unlimited access to water during behavioral training. All experiments were approved by the Animal Experimentation Committee of the Royal Netherlands Academy of Arts and Sciences and were carried out in agreement with Dutch laws (Wet op de Dierproeven, 1996) and European regulations (Guideline 86/609/EEC).
Surgery
Rats
Histology
Fig. 1 illustrates the recording sites in the ventromedial striatum (dorsomedial accumbens core and ventromedial regions of the caudate putamen in the right hemisphere). Final group sizes for animals included in the FSCV data were: first discrimination session, n = 11; second discrimination session, n = 8; reversal session, n = 21. A statistical comparison of DA responses at the different locations used in the reversal session revealed no significant differences.
Phasic DA changes in the ventromedial striatum during spatial discrimination learning
After learning to lever-press for reward in several shaping sessions,
Discussion
Successful adaptation of behavior following reversal requires the ability to use a change in reinforcing feedback. To investigate whether phasic DA release in the ventromedial striatum contributes to such an adaptation, we recorded DA release in rats during a spatial discrimination and reversal task in which a non-discriminative cue signaled trial onset. During successful responding in the discrimination phase (prior to reversal), DA was evoked by the cue, but not the reward. However, during
Acknowledgments
We wish to thank Ralph Hamelink and the NIN mechatronics department for technical assistance. We greatly appreciate the support offered by Drs Mark Wightman, Michael Heien, Garret Stuber, Paul Phillips and Scott Ng-Evans with the introduction of fast-scan cyclic voltammetry in our institute. This introduction was supported by two grants from the graduate school Neurowetenschappen Amsterdam.
References (60)
- et al. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron (2005)
- et al. Dissecting components of reward: ‘Liking’, ‘wanting’, and learning. Current Opinion in Pharmacology (2009)
- et al. Selective lesions of the dorsomedial striatum impair serial spatial reversal learning in rats. Behavioural Brain Research (2010)
- et al. Set-shifting ability and schizophrenia: A marker of clinical illness or an intermediate phenotype? Biological Psychiatry (2008)
- et al. Contributions of striatal dopamine signaling to the modulation of cognitive flexibility. Biological Psychiatry (2011)
- et al. Multivariate concentration determination using principal component regression with residual analysis. Trends in Analytical Chemistry (2009)
- et al. Genetic dissection of behavioral flexibility: Reversal learning in mice. Biological Psychiatry (2011)
- et al. Orbital prefrontal cortex mediates reversal learning and not attentional set shifting in the rat. Behavioural Brain Research (2003)
- et al. Electrochemical, pharmacological and electrophysiological evidence of rapid dopamine release and removal in the rat caudate nucleus following electrical stimulation of the median forebrain bundle. European Journal of Pharmacology (1985)
- et al. The effect of striatal dopamine depletion and the adenosine A2A antagonist KW-6002 on reversal learning in rats. Neurobiology of Learning and Memory (2007)