  • Letter

Prolonged dopamine signalling in striatum signals proximity and value of distant rewards

Abstract

Predictions about future rewarding events have a powerful influence on behaviour. The phasic spike activity of dopamine-containing neurons, and corresponding dopamine transients in the striatum, are thought to underlie these predictions, encoding positive and negative reward prediction errors (refs 1–5). However, many behaviours are directed towards distant goals, for which transient signals may fail to provide sustained drive. Here we report an extended mode of reward-predictive dopamine signalling in the striatum that emerged as rats moved towards distant goals. These dopamine signals, which were detected with fast-scan cyclic voltammetry (FSCV), gradually increased or, in rare instances, decreased as the animals navigated mazes to reach remote rewards, rather than having phasic or steady tonic profiles. These dopamine increases (ramps) scaled flexibly with both the distance and size of the rewards. During learning, these dopamine signals showed spatial preferences for goals in different locations and readily changed in magnitude to reflect changing values of the distant rewards. Such prolonged dopamine signalling could provide sustained motivational drive, a control mechanism that may be important for normal behaviour and that can be impaired in a range of neurologic and neuropsychiatric disorders.

Figure 1: Ramping striatal dopamine signals occur during maze runs.
Figure 2: Ramping dopamine signals proximity to distant rewards.
Figure 3: Dopamine ramping is sensitive to reward magnitude.
Figure 4: Ramping dopamine selectivity can emerge with training without experimentally imposed reward discrepancies.

References

  1. Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B. & Uchida, N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88 (2012)

  2. Matsumoto, M. & Hikosaka, O. Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459, 837–841 (2009)

  3. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997)

  4. Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001)

  5. Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005)

  6. Schultz, W. Getting formal with dopamine and reward. Neuron 36, 241–263 (2002)

  7. Redgrave, P. & Gurney, K. The short-latency dopamine signal: a role in discovering novel actions? Nature Rev. Neurosci. 7, 967–975 (2006)

  8. Zweifel, L. S. et al. Disruption of NMDAR-dependent burst firing by dopamine neurons provides selective assessment of phasic dopamine-dependent behavior. Proc. Natl Acad. Sci. USA 106, 7281–7288 (2009)

  9. Wang, L. P. et al. NMDA receptors in dopaminergic neurons are crucial for habit learning. Neuron 72, 1055–1066 (2011)

  10. Day, J. J., Jones, J. L., Wightman, R. M. & Carelli, R. M. Phasic nucleus accumbens dopamine release encodes effort- and delay-related costs. Biol. Psychiatry 68, 306–309 (2010)

  11. Day, J. J., Roitman, M. F., Wightman, R. M. & Carelli, R. M. Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens. Nature Neurosci. 10, 1020–1028 (2007)

  12. Gan, J. O., Walton, M. E. & Phillips, P. E. Dissociable cost and benefit encoding of future rewards by mesolimbic dopamine. Nature Neurosci. 13, 25–27 (2010)

  13. Phillips, P. E., Stuber, G. D., Heien, M. L., Wightman, R. M. & Carelli, R. M. Subsecond dopamine release promotes cocaine seeking. Nature 422, 614–618 (2003)

  14. Braun, A. A., Graham, D. L., Schaefer, T. L., Vorhees, C. V. & Williams, M. T. Dorsal striatal dopamine depletion impairs both allocentric and egocentric navigation in rats. Neurobiol. Learn. Mem. 97, 402–408 (2012)

  15. Salamone, J. D., Correa, M., Farrar, A. & Mingote, S. M. Effort-related functions of nucleus accumbens dopamine and associated forebrain circuits. Psychopharmacology (Berl.) 191, 461–482 (2007)

  16. Whishaw, I. Q. & Dunnett, S. B. Dopamine depletion, stimulation or blockade in the rat disrupts spatial navigation and locomotion dependent upon beacon or distal cues. Behav. Brain Res. 18, 11–29 (1985)

  17. Salamone, J. D. & Correa, M. The mysterious motivational functions of mesolimbic dopamine. Neuron 76, 470–485 (2012)

  18. Niv, Y., Daw, N. D., Joel, D. & Dayan, P. Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl.) 191, 507–520 (2007)

  19. Barnes, T. D., Kubota, Y., Hu, D., Jin, D. Z. & Graybiel, A. M. Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories. Nature 437, 1158–1161 (2005)

  20. Niv, Y., Duff, M. O. & Dayan, P. Dopamine, uncertainty and TD learning. Behav. Brain Funct. 1, 6 (2005)

  21. Tobler, P. N., Fiorillo, C. D. & Schultz, W. Adaptive coding of reward value by dopamine neurons. Science 307, 1642–1645 (2005)

  22. Fiorillo, C. D., Tobler, P. N. & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902 (2003)

  23. Hikosaka, O., Sakamoto, M. & Usui, S. Functional properties of monkey caudate neurons. III. Activities related to expectation of target and reward. J. Neurophysiol. 61, 814–832 (1989)

  24. Derdikman, D. & Moser, E. I. A manifold of spatial maps in the brain. Trends Cogn. Sci. 14, 561–569 (2010)

  25. Flagel, S. B. et al. A selective role for dopamine in stimulus-reward learning. Nature 469, 53–57 (2011)

  26. Berridge, K. C. The debate over dopamine’s role in reward: the case for incentive salience. Psychopharmacology (Berl.) 191, 391–431 (2007)

  27. Clark, J. J. et al. Chronic microsensors for longitudinal, subsecond dopamine detection in behaving animals. Nature Methods 7, 126–129 (2010)

  28. Keithley, R. B., Heien, M. L. & Wightman, R. M. Multivariate concentration determination using principal component regression with residual analysis. Trends Analyt. Chem. 28, 1127–1136 (2009)

Acknowledgements

We thank G. Telian, L. Li, T. Lechina, S. Ng-Evans and N. Hollon for help, and M. J. Kim, K. S. Smith, T. J. Sejnowski, L. G. Gibb and Y. Kubota for their comments. This work was supported by US National Institutes of Health (NIH) grant R01 MH060379 (A.M.G.), National Parkinson Foundation (A.M.G.), CHDI Foundation grant A-5552, the Stanley H. and Sheila G. Sydney Fund (A.M.G.), a Mark Gorenberg fellowship (M.W.H.), and NIH grants R01 DA027858 and R01 MH079292 (P.E.M.P.).

Author information

Contributions

M.W.H. and A.M.G. designed the experiments, analysed the data, and wrote the manuscript. M.W.H. conducted the experiments. P.L.T. helped to develop the FSCV chronic recording setup. S.G.S. and P.E.M.P. provided technical training on FSCV and advice on data analysis. All authors contributed to discussion and interpretation of the findings.

Corresponding author

Correspondence to Ann M. Graybiel.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 Schematic drawing of recording locations.

Coloured dots indicate positions of probe tips as determined by post-experiment probe length measurements and histology (see Methods). Each colour represents probes from one rat.

Extended Data Figure 2 Comparisons of dopamine release in the VMS measured in behaving rats and evoked by electrical MFB stimulation in anaesthetized rats.

a, Current measured at each potential during successive scans with electrical MFB stimulation (red line) in an anaesthetized rat. b, Current–voltage plot from the 5 scans during the 0.5-s period after MFB stimulation. The large increase in current around 0.6 V (dotted line, peak) corresponds to the dopamine redox potential measured in vitro. c, Time course of current at the dopamine redox potential around electrical MFB stimulation (red line). d–f, Plots, as in a–c, illustrating dopamine response in the same rat, now awake, to room lights being turned on. g, Current changes averaged over all trials (n = 4,418 trials) in which ramping occurred (see Methods) during T-maze running. h, Average current–voltage plot for all identified ramping trials for the time period (−3 to −2 s) indicated by the brackets in the colour plot in g. i, Average current changes induced by tonic MFB stimulation in anaesthetized rats (n = 3). j, Average current–voltage plot from the bracketed time range (10 to 11 s) in i following the onset of stimulation.

Extended Data Figure 3 Patterns of dopamine signals in the VMS and DLS.

a, b, Dopamine concentration recorded in the VMS (a, n = 9 rats) and in the DLS (b, n = 8 rats). Data were first averaged across trials to yield session average traces for each probe in each session. These traces were averaged within rat to obtain one average trace per rat, and the per-rat traces were then averaged across rats. These plots differ from those in Fig. 1e, f, which considered session averages for each probe to be an independent measure. Shading represents s.e.m. across rats. c, d, Distribution of average peak dopamine values for all recordings in VMS (c) and DLS (d). Each colour corresponds to an average peak dopamine concentration measured by a single probe in different sessions. e, f, Proportion of trial-averaged dopamine recordings in the VMS (e, out of 300) and DLS (f, out of 262) that displayed a positive (blue) or negative (grey) ramping response during maze running, and an unclassified dopamine profile (red). g, h, Average dopamine concentration in the VMS (g) and DLS (h) for the positive ramping traces (top), the negative ramping traces (middle), and the unclassified traces (bottom). Shading represents s.e.m.
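
The three-stage averaging described in a, b (trials to session, sessions to rat, rats to grand mean, with s.e.m. taken across rats) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the nested-dictionary layout, and the treatment of every probe-session as one "session" entry are assumptions made here.

```python
from math import sqrt
from statistics import mean, stdev


def average_traces(traces):
    """Element-wise mean of equal-length traces (lists of floats)."""
    return [mean(samples) for samples in zip(*traces)]


def across_rat_average(data):
    """Hypothetical layout: data maps rat id -> list of sessions,
    and each session is a list of single-trial traces.

    Returns (grand_mean, sem), both computed across rats.
    Needs at least two rats, because stdev is undefined for one value.
    """
    per_rat = []
    for sessions in data.values():
        # Stage 1: average trials within each session.
        session_means = [average_traces(trials) for trials in sessions]
        # Stage 2: average session traces within the rat.
        per_rat.append(average_traces(session_means))
    # Stage 3: average the per-rat traces across rats.
    grand_mean = average_traces(per_rat)
    n_rats = len(per_rat)
    sem = [stdev(samples) / sqrt(n_rats) for samples in zip(*per_rat)]
    return grand_mean, sem
```

With this scheme each rat contributes one trace regardless of how many sessions it ran, which is why the resulting s.e.m. differs from one computed over session averages treated as independent measures.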

Extended Data Figure 4 Transient responses to cues and rewards occur during the T-maze task performance and are distinct from the sustained ramping response.

a, Dopamine concentration in a representative trial that included both a phasic response to the warning click and a sustained ramping response to goal-reaching. b, c, Average normalized dopamine traces from VMS (b) and DLS (c) probes from all trials (n = 890 and 640, respectively) that showed identified transients after the warning click (see Methods). In the VMS, note the sharp increase in dopamine around the warning click, superimposed on the ramping response that followed the phasic click response. Shading represents s.e.m. calculated across trials. d, Model for dopamine release profiles in the T-maze task. Sharper transient responses are present at the start of maze running (red) and after goal-reaching (cyan). These responses can be superimposed on and modulated independently of the slower ramping signal related to goal proximity (dark blue).

Extended Data Figure 5 Dopamine signals in response to free reward outside of the task.

a, Average peak dopamine to unexpected chocolate milk delivery outside the task is positively correlated with peak ramping dopamine measured from the same probes during preceding behavioural training in the maze (n = 146 sessions; Pearson’s R = 0.45, P < 0.0001). b, Average peak dopamine concentration induced by unexpected free reward outside the task (blue) and peak amplitude of dopamine ramping during maze performance just before reward (red; paired t-test, P < 0.001).
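
For readers unfamiliar with the statistic reported in a, Pearson's R between two paired samples (here, per-session peak free-reward dopamine and peak ramping dopamine) can be computed as in the sketch below. The function name is chosen for illustration, and no values from the paper are reproduced.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```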

Extended Data Figure 6 Peak ramping dopamine values do not vary as a function of trial time.

a, Dopamine release modelled as a summation of four weighted transients in response to fixed maze events on short trials (purple) and long trials (orange). The overlap of the transients is reduced on the long trials, resulting in a lower peak dopamine level at the end of the maze run. Thick lines indicate overall average dopamine, and thin lines indicate the averages of each of the 4 transients across 100 simulation runs. b, Relative predicted peak dopamine levels on short and long trials calculated as a linear decay function of trial duration for the simulated model shown in a (black), for the spatial proximity model (light blue), and for the actual experimental data (dark blue). c, An alternative multi-transient model in which the transients (3, inset) are heavily weighted towards the goal location, are highly variable in their time of occurrence, and display a long decay time-course. In this model, the difference between short and long trials is within the noise range of the data. The average of individual transients (inset) across multiple simulations is a smeared version of the single transients that is weighted towards the goal location. d, Data plotted as in b, for the alternative multi-transient model. e, f, Average run speed (e) and acceleration (f) during short (purple) and long (orange) trials, as shown in Fig. 2, for all animals and sessions.
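
The reasoning behind the multi-transient model in a (transients tied to fixed maze events overlap less when the run takes longer, so the summed signal peaks lower) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the exponential-decay kernel, the equal weights, the event positions as fractions of the run, and the time constant are all hypothetical choices, not the parameters used for the figure.

```python
from math import exp


def transient(t, onset, weight, tau):
    """Assumed kernel: instantaneous rise at onset, then exponential decay (tau in s)."""
    if t < onset:
        return 0.0
    return weight * exp(-(t - onset) / tau)


def summed_trace(duration, fractions, weights, tau=1.0, dt=0.01):
    """Sum of transients triggered at fixed fractions of a run lasting `duration` s."""
    onsets = [f * duration for f in fractions]
    n_samples = int(round(duration / dt)) + 1
    times = [i * dt for i in range(n_samples)]
    return [
        sum(transient(t, onset, w, tau) for onset, w in zip(onsets, weights))
        for t in times
    ]


# Four maze events at the same relative positions on every run (assumed values).
fractions = (0.0, 0.33, 0.66, 1.0)
weights = (1.0, 1.0, 1.0, 1.0)

short_peak = max(summed_trace(2.0, fractions, weights))  # short trial, 2 s
long_peak = max(summed_trace(6.0, fractions, weights))   # long trial, 6 s
```

Under these assumptions the short trial reaches a higher peak than the long trial, because earlier transients have decayed less by the time later ones arrive. A signal that tracks spatial proximity to the goal would not show this dependence on run duration.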

Extended Data Figure 7 Examples of trials with pausing or exploration near the choice point of the T-maze.

a–c, Video tracker traces (a), relative proximity to reward (b), and dopamine concentration (c) measured during a single trial in which a rat paused near the choice point. d–f, Video tracker traces (d), relative proximity to reward (e) and dopamine concentration (f) measured during another trial from a different rat.

Extended Data Figure 8 Value manipulation experiments in the T-maze, M-maze, and S-maze for all rats.

a–c, Average peak dopamine levels for M-maze sessions for three individual rats (M31, M36, and M47) in the left (blue) and right (red) end-arms. Blue and red shading indicates sessions in which the left and right arm contained the larger reward, respectively. Error bars indicate s.e.m. d, e, Average peak dopamine levels, as in a–c, for value bias T-maze sessions for two rats (M36 and M47). f, g, Average peak dopamine levels, as in a–c, for value bias S-maze sessions for two rats (M47 and M35). h–j, Average normalized dopamine levels measured in the high reward arm (light green) and low reward arm (dark green) as rats performed the M-maze (h), T-maze (i) and S-maze (j) tasks. k, l, Dopamine concentration relative to left (k) and right (l) goal-reaching during the first session following a reversal of reward values (session 17 of M47, indicated by an asterisk in c).

Extended Data Figure 9 Dopamine signals recorded in two rats trained on the S-maze task.

a, Average normalized dopamine concentrations measured from VMS probes in one rat (n = 5 sessions) performing the S-maze task as in Fig. 3j. The light green line indicates runs to the higher reward goal, and the dark green line indicates runs to the lower reward goal. Shading indicates s.e.m. Red vertical lines indicate turns. b, Traces, as in a, for the second rat trained on the S-maze task (n = 4 sessions).

Extended Data Figure 10 Dopamine end-arm preferences become positively correlated with arm choices with training, but ramping dopamine signals do not change significantly with performance improvement.

a, b, Distribution of selectivity indices, as in Fig. 4d, for all probes implanted in the left (a, n = 5 rats) and right (b, n = 3 rats) hemispheres. Note the bias in both groups of the selectivity preference towards negative selectivity indices (right bias, red) relative to the shuffled data (blue). c–e, Biases in average run time (c), percentage of correct responses (d), and arm choices (e) across training blocks. Negative values indicate biases towards the right end-arm. f, Raw average dopamine selectivity indices across training blocks. Note the emergence of a right bias with training. g, Correlation coefficients (Pearson’s R) computed for each training block between arm choice selectivity indices and dopamine selectivity indices. Error bars indicate confidence limits of the correlations. h, Normalized peak magnitudes of dopamine signals averaged in a 0.5-s window before goal-reaching in sessions with significant (Mann–Whitney U-test, P < 0.05) pre-goal increases. Data are averaged across rats for each training block (for which, left to right, n = 48, 101, 113 and 179 trial-averaged recordings, respectively). i, Average (± s.e.m.) dopamine concentration from ramping dopamine sessions in which the percentage of correct trials fell above (red, n = 179) or below (blue, n = 92) the learning criterion for T-maze task acquisition (72.5% correct, chi-square test, P < 0.05). j, Average (± s.e.m.) peak dopamine levels from the sessions plotted in i, showing no significant difference between pre- and post-learning periods (t-test, P = 0.44).

Supplementary information

Supplementary Information

This file contains a Supplementary Discussion and additional references. (PDF 265 kb)

About this article

Cite this article

Howe, M., Tierney, P., Sandberg, S. et al. Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500, 575–579 (2013). https://doi.org/10.1038/nature12475

  • DOI: https://doi.org/10.1038/nature12475
