Elsevier

Neurobiology of Learning and Memory

Volume 125, November 2015, Pages 135-145
Phasic dopamine release induced by positive feedback predicts individual differences in reversal learning

https://doi.org/10.1016/j.nlm.2015.08.011

Highlights

  • We monitored rapid changes in dopamine during spatial reversal learning.

  • After reversal, cue-evoked dopamine was updated following positive feedback.

  • This updating was only observed in animals that learned the reversal.

  • The positive feedback signal might facilitate behavioral adaptation after reversal.

Abstract

Striatal dopamine (DA) is central to reward-based learning. Less is known about the contribution of DA to the ability to adapt previously learned behavior in response to changes in the environment, such as a reversal of response-reward contingencies. We hypothesized that DA is involved in the rapid updating of response-reward information essential for successful reversal learning. We trained rats to discriminate between two levers, where lever availability was signaled by a non-discriminative cue. Pressing one lever was always rewarded, whereas the other lever was never rewarded. After reaching stable discrimination performance, a reversal was presented, so that the previously non-rewarded lever was now rewarded and vice versa. We used fast-scan cyclic voltammetry to monitor DA release in the ventromedial striatum. During discrimination performance (pre-reversal), cue presentation induced phasic DA release, whereas reward delivery did not. The opposite pattern was observed post-reversal: Striatal DA release emerged after reward delivery, while cue-induced release diminished. Trial-by-trial analysis showed rapid reinstatement of cue-induced DA release on trials immediately following initial correct responses. This effect of positive feedback was observed in animals that learned the reversal, but not in ‘non-learners’. In contrast, neither pre-reversal responding and DA signaling, nor post-reversal DA signaling in response to negative feedback differed between learners and non-learners. Together, we show that phasic DA dynamics in the ventromedial striatum encoding reward-predicting cues are associated with positive feedback during reversal learning. Furthermore, these signals predict individual differences in learning that are not present prior to reversal, suggesting a distinct role for dopamine in the adaptation of previously learned behavior.

Introduction

Throughout the day, we perform numerous behavioral actions to pursue things we desire or to prevent adverse events. In a constantly changing environment, this behavior has to be adaptive and flexible. One form of adaptive behavior is reversal learning, which requires the ability to use negative feedback to inhibit a learned response that was previously rewarded and at the same time to use positive feedback to switch to a response that was previously unrewarded. The ability to adapt goal-directed behavior to changes in the environment depends on frontal and striatal regions, as demonstrated in rodents (Castane et al., 2010, McAlonan and Brown, 2003), non-human primates (Clarke et al., 2008, Dias et al., 1996) and humans (Bellebaum, Koch, Schwarz, & Daum, 2008). Indeed, compromised frontostriatal integrity results in deficient adaptive behavior and cognitive dysfunctions in various neurological and psychiatric disorders, such as obsessive–compulsive disorder (OCD), drug addiction and Parkinson’s disease (Ceaser et al., 2008, Chamberlain et al., 2006, Cools et al., 2001, Millan et al., 2012, Verdejo-Garcia et al., 2006, Yerys et al., 2009).

Dopamine (DA) is an important neuromodulator in frontostriatal networks. Striatal DA release facilitates reward learning and mediates approach behavior towards rewards. Specifically, DA neurons increase firing in response to unexpected rewards and reward-predicting stimuli and decrease firing when an expected reward is omitted (Pan et al., 2005, Schultz et al., 1997). These and other findings suggest that DA neurons encode a ‘reward-prediction error’ that serves as a teaching signal to guide behavior (Montague et al., 1996, Schultz et al., 1997, Steinberg et al., 2013, Waelti et al., 2001). It is known that burst firing of DA neurons facilitates initial learning of response-reward associations (Zweifel et al., 2009). Consistently, selective optogenetic activation of DA neurons affects operant responding in a manner similar to positive feedback (Kim et al., 2012, Witten et al., 2011), and pharmacological and genetic manipulations demonstrate DA-mediated regulation of adaptive behavior (Clarke et al., 2011, Haluk and Floresco, 2009, Klanker et al., 2013, Laughlin et al., 2011). DA may play such a role not only during initial learning, but also in the adaptation of established operant responding. Indeed, optogenetic activation of DA neurons (simulating positive feedback) can also mediate reversal of reward-seeking behavior (Adamantidis et al., 2011). However, theories of reinforcement learning and models of DA-mediated prediction errors suggest that fluctuations in striatal DA concentration mediate the effects of both positive and negative feedback on learning (Frank and Claus, 2006, Hong and Hikosaka, 2011). It remains to be shown whether DA mediates the effects of negative feedback during behavioral adaptation following an unexpected switch in reward contingencies and whether striatal DA reflects the receipt of feedback on a trial-by-trial basis.

Therefore, we used fast-scan cyclic voltammetry (FSCV) in behaving animals (Millar, Stamford, Kruk, & Wightman, 1985) to monitor rapid changes in DA release in the ventromedial striatum during an operant spatial reversal learning task, in which rats had to adapt goal-directed behavior following a reversal of response-reward contingencies. Our results show that phasic cue-evoked DA signals were promptly updated following positive, but not negative feedback. Furthermore, we observed individual differences in the extent to which the receipt of positive feedback updated the reward-predicting DA signal on subsequent trials, where DA signaling predicted successful reversal learning. Thus, our findings suggest that phasic DA in the ventromedial striatum provides a positive feedback signal to facilitate adaptation of previously learned behavior following a behavioral switch.
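
The ‘reward-prediction error’ account referred to in this introduction can be illustrated with a minimal Rescorla-Wagner style sketch. This is a generic textbook update rule and not the analysis or model used in this study; the function names (`rw_update`, `simulate_lever`), the learning rate of 0.3, and the trial counts are all assumptions chosen for illustration. Each press of a lever yields a prediction error (outcome minus expectation), and the expectation is then nudged toward the outcome.

```python
def rw_update(value, reward, alpha=0.3):
    """One Rescorla-Wagner step (illustrative; alpha is an assumed learning rate).

    Returns the prediction error and the updated expected value.
    """
    delta = reward - value              # positive if better than expected
    return delta, value + alpha * delta


def simulate_lever(rewards, alpha=0.3, value=0.0):
    """Apply rw_update over a sequence of outcomes for a single lever.

    Returns a list of (expected_value_before_outcome, prediction_error) pairs,
    one pair per press.
    """
    trace = []
    for reward in rewards:
        delta, new_value = rw_update(value, reward, alpha)
        trace.append((value, delta))
        value = new_value
    return trace


# Hypothetical schedule: a lever rewarded on 30 presses, then unrewarded
# on 5 presses after the reversal (negative feedback).
old_lever = simulate_lever([1.0] * 30 + [0.0] * 5)

# Hypothetical schedule: the previously unrewarded lever, now rewarded
# on its first 5 presses after the reversal (positive feedback).
new_lever = simulate_lever([1.0] * 5)

if __name__ == "__main__":
    for press, (expected, delta) in enumerate(new_lever, start=1):
        print(f"press {press}: expected={expected:.3f} error={delta:.3f}")
```

In this toy model, the first rewarded press on the newly correct lever produces a large positive error, and the expectation carried into the next press rises accordingly. Whether recorded DA signals follow such a rule is the empirical question the study addresses, so the sketch should be read only as a statement of the hypothesis.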

Animals

Male Wistar rats (Charles River) were housed in a controlled environment under a reversed day-night schedule (white lights: 7 p.m.–7 a.m.). Rats were food-restricted (16 g/animal/day), with unlimited access to water during behavioral training. All experiments were approved by the Animal Experimentation Committee of the Royal Netherlands Academy of Arts and Sciences and were carried out in agreement with Dutch laws (Wet op de Dierproeven, 1996) and European regulations (Guideline 86/609/EEC).

Surgery

Rats

Histology

Fig. 1 illustrates the recording sites in the ventromedial striatum (dorsomedial accumbens core and ventromedial regions of the caudate putamen in the right hemisphere). Final group sizes for animals included in the FSCV data: first discrimination session n = 11, second discrimination session n = 8, reversal session n = 21. A statistical comparison of DA responses at the different recording locations used in the reversal session revealed no significant differences.

Phasic DA changes in the ventromedial striatum during spatial discrimination learning

After learning to lever-press for reward in several shaping sessions,

Discussion

Successful adaptation of behavior following reversal requires the ability to use a change in reinforcing feedback. To investigate whether phasic DA release in the ventromedial striatum contributes to such an adaptation, we recorded DA release in rats during a spatial discrimination and reversal task in which a non-discriminative cue signaled trial onset. During successful responding in the discrimination phase (prior to reversal), DA was evoked by the cue, but not the reward. However, during

Acknowledgments

We wish to thank Ralph Hamelink and the NIN mechatronics department for technical assistance. We greatly appreciate the support offered by Drs Mark Wightman, Michael Heien, Garret Stuber, Paul Phillips and Scott Ng-Evans with the introduction of fast-scan cyclic voltammetry in our institute. This introduction was supported by two grants from the graduate school Neurowetenschappen Amsterdam.

References (60)

  • G.D. Stuber et al.

    Extinction of cocaine self-administration reveals functionally and temporally distinct dopaminergic signals in the nucleus accumbens

    Neuron

    (2005)
  • J.A. Sugam et al.

    Phasic nucleus accumbens dopamine encodes risk-based decision-making behavior

    Biological Psychiatry

    (2012)
  • P. Voorn et al.

    Putting a spin on the dorsal-ventral divide of the striatum

    Trends in Neurosciences

    (2004)
  • K.M. Wassum et al.

    Phasic mesolimbic dopamine signaling precedes and predicts performance of a self-initiated action sequence task

    Biological Psychiatry

    (2012)
  • I.B. Witten et al.

    Recombinase-driver rat lines: Tools, techniques, and optogenetic application to dopamine-mediated reinforcement

    Neuron

    (2011)
  • D.S. Zahm et al.

    On the significance of subterritories in the “accumbens” part of the rat ventral striatum

    Neuroscience

    (1992)
  • A.R. Adamantidis et al.

    Optogenetic interrogation of dopaminergic modulation of the multiple phases of reward-seeking behavior

    Journal of Neuroscience

    (2011)
  • C. Bellebaum et al.

    Focal basal ganglia lesions are associated with impairments in reward-based reversal learning

    Brain

    (2008)
  • S.R. Chamberlain et al.

    Motor inhibition and cognitive flexibility in obsessive-compulsive disorder and trichotillomania

    American Journal of Psychiatry

    (2006)
  • H.F. Clarke et al.

    Dopamine, but not serotonin, regulates reversal learning in the marmoset caudate nucleus

    Journal of Neuroscience

    (2011)
  • H.F. Clarke et al.

    Lesions of the medial striatum in monkeys produce perseverative impairments during reversal learning similar to those produced by lesions of the orbitofrontal cortex

    Journal of Neuroscience

    (2008)
  • R. Cools et al.

    Mechanisms of cognitive set flexibility in Parkinson’s disease

    Brain

    (2001)
  • R. Cools et al.

    Striatal dopamine predicts outcome-specific reversal learning and its sensitivity to dopaminergic drug administration

    Journal of Neuroscience

    (2009)
  • M. Darvas et al.

    Restricting dopaminergic signaling to either dorsolateral or medial striatum facilitates cognition

    Journal of Neuroscience

    (2010)
  • J.J. Day et al.

    Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens

    Nature Neuroscience

    (2007)
  • R. Dias et al.

    Dissociation in prefrontal cortex of affective and attentional shifts

    Nature

    (1996)
  • S.B. Flagel et al.

    A selective role for dopamine in stimulus-reward learning

    Nature

    (2011)
  • M.J. Frank et al.

    Anatomy of a decision: Striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal

    Psychological Review

    (2006)
  • M.J. Frank et al.

    By carrot or by stick: Cognitive reinforcement learning in parkinsonism

    Science

    (2004)
  • C.R. Gallistel et al.

    The learning curve: Implications of a quantitative analysis

    Proceedings of the National Academy of Sciences of the United States of America

    (2004)