Prolonged dopamine signalling in striatum signals proximity and value of distant rewards

Howe, Mark W.; Tierney, Patrick L.; Sandberg, Stefan G.; Phillips, Paul E. M.; Graybiel, Ann M.

doi:10.1038/nature12475

Letter
Published: 04 August 2013

Nature volume 500, pages 575–579 (2013)

Abstract

Predictions about future rewarding events have a powerful influence on behaviour. The phasic spike activity of dopamine-containing neurons, and corresponding dopamine transients in the striatum, are thought to underlie these predictions, encoding positive and negative reward prediction errors^1,2,3,4,5. However, many behaviours are directed towards distant goals, for which transient signals may fail to provide sustained drive. Here we report an extended mode of reward-predictive dopamine signalling in the striatum that emerged as rats moved towards distant goals. These dopamine signals, which were detected with fast-scan cyclic voltammetry (FSCV), gradually increased or—in rare instances—decreased as the animals navigated mazes to reach remote rewards, rather than having phasic or steady tonic profiles. These dopamine increases (ramps) scaled flexibly with both the distance and size of the rewards. During learning, these dopamine signals showed spatial preferences for goals in different locations and readily changed in magnitude to reflect changing values of the distant rewards. Such prolonged dopamine signalling could provide sustained motivational drive, a control mechanism that may be important for normal behaviour and that can be impaired in a range of neurologic and neuropsychiatric disorders.

**Figure 1: Ramping striatal dopamine signals occur during maze runs.**

**Figure 2: Ramping dopamine signals proximity to distant rewards.**

**Figure 3: Dopamine ramping is sensitive to reward magnitude.**

**Figure 4: Ramping dopamine selectivity can emerge with training without experimentally imposed reward discrepancies.**

References

1. Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B. & Uchida, N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88 (2012)
2. Matsumoto, M. & Hikosaka, O. Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459, 837–841 (2009)
3. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997)
4. Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001)
5. Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005)
6. Schultz, W. Getting formal with dopamine and reward. Neuron 36, 241–263 (2002)
7. Redgrave, P. & Gurney, K. The short-latency dopamine signal: a role in discovering novel actions? Nature Rev. Neurosci. 7, 967–975 (2006)
8. Zweifel, L. S. et al. Disruption of NMDAR-dependent burst firing by dopamine neurons provides selective assessment of phasic dopamine-dependent behavior. Proc. Natl Acad. Sci. USA 106, 7281–7288 (2009)
9. Wang, L. P. et al. NMDA receptors in dopaminergic neurons are crucial for habit learning. Neuron 72, 1055–1066 (2011)
10. Day, J. J., Jones, J. L., Wightman, R. M. & Carelli, R. M. Phasic nucleus accumbens dopamine release encodes effort- and delay-related costs. Biol. Psychiatry 68, 306–309 (2010)
11. Day, J. J., Roitman, M. F., Wightman, R. M. & Carelli, R. M. Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens. Nature Neurosci. 10, 1020–1028 (2007)
12. Gan, J. O., Walton, M. E. & Phillips, P. E. Dissociable cost and benefit encoding of future rewards by mesolimbic dopamine. Nature Neurosci. 13, 25–27 (2010)
13. Phillips, P. E., Stuber, G. D., Heien, M. L., Wightman, R. M. & Carelli, R. M. Subsecond dopamine release promotes cocaine seeking. Nature 422, 614–618 (2003)
14. Braun, A. A., Graham, D. L., Schaefer, T. L., Vorhees, C. V. & Williams, M. T. Dorsal striatal dopamine depletion impairs both allocentric and egocentric navigation in rats. Neurobiol. Learn. Mem. 97, 402–408 (2012)
15. Salamone, J. D., Correa, M., Farrar, A. & Mingote, S. M. Effort-related functions of nucleus accumbens dopamine and associated forebrain circuits. Psychopharmacology (Berl.) 191, 461–482 (2007)
16. Whishaw, I. Q. & Dunnett, S. B. Dopamine depletion, stimulation or blockade in the rat disrupts spatial navigation and locomotion dependent upon beacon or distal cues. Behav. Brain Res. 18, 11–29 (1985)
17. Salamone, J. D. & Correa, M. The mysterious motivational functions of mesolimbic dopamine. Neuron 76, 470–485 (2012)
18. Niv, Y., Daw, N. D., Joel, D. & Dayan, P. Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl.) 191, 507–520 (2007)
19. Barnes, T. D., Kubota, Y., Hu, D., Jin, D. Z. & Graybiel, A. M. Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories. Nature 437, 1158–1161 (2005)
20. Niv, Y., Duff, M. O. & Dayan, P. Dopamine, uncertainty and TD learning. Behav. Brain Funct. 1, 6 (2005)
21. Tobler, P. N., Fiorillo, C. D. & Schultz, W. Adaptive coding of reward value by dopamine neurons. Science 307, 1642–1645 (2005)
22. Fiorillo, C. D., Tobler, P. N. & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902 (2003)
23. Hikosaka, O., Sakamoto, M. & Usui, S. Functional properties of monkey caudate neurons. III. Activities related to expectation of target and reward. J. Neurophysiol. 61, 814–832 (1989)
24. Derdikman, D. & Moser, E. I. A manifold of spatial maps in the brain. Trends Cogn. Sci. 14, 561–569 (2010)
25. Flagel, S. B. et al. A selective role for dopamine in stimulus-reward learning. Nature 469, 53–57 (2011)
26. Berridge, K. C. The debate over dopamine’s role in reward: the case for incentive salience. Psychopharmacology (Berl.) 191, 391–431 (2007)
27. Clark, J. J. et al. Chronic microsensors for longitudinal, subsecond dopamine detection in behaving animals. Nature Methods 7, 126–129 (2010)
28. Keithley, R. B., Heien, M. L. & Wightman, R. M. Multivariate concentration determination using principal component regression with residual analysis. Trends Analyt. Chem. 28, 1127–1136 (2009)

Acknowledgements

We thank G. Telian, L. Li, T. Lechina, S. Ng-Evans and N. Hollon for help, and M. J. Kim, K. S. Smith, T. J. Sejnowski, L. G. Gibb and Y. Kubota for their comments. This work was supported by US National Institutes of Health (NIH) grant R01 MH060379 (A.M.G.), National Parkinson Foundation (A.M.G.), CHDI Foundation grant A-5552, the Stanley H. and Sheila G. Sydney Fund (A.M.G.), a Mark Gorenberg fellowship (M.W.H.), and NIH grants R01 DA027858 and R01 MH079292 (P.E.M.P.).

Author information

Authors and Affiliations

McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, 02139, Massachusetts, USA
Mark W. Howe, Patrick L. Tierney & Ann M. Graybiel
Department of Psychiatry and Behavioral Sciences, and Department of Pharmacology, University of Washington, Seattle, 98195, Washington, USA
Stefan G. Sandberg & Paul E. M. Phillips

Contributions

M.W.H. and A.M.G. designed the experiments, analysed the data, and wrote the manuscript. M.W.H. conducted the experiments. P.L.T. helped to develop the FSCV chronic recording setup. S.G.S. and P.E.M.P. provided technical training on FSCV and advice on data analysis. All authors contributed to discussion and interpretation of the findings.

Corresponding author

Correspondence to Ann M. Graybiel.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 Schematic drawing of recording locations.

Coloured dots indicate positions of probe tips as determined by post-experiment probe length measurements and histology (see Methods). Each colour represents probes from one rat.

Extended Data Figure 2 Comparisons of dopamine release in the VMS measured in behaving rats and evoked by electrical MFB stimulation in anaesthetized rats.

a, Current measured at each potential during successive scans with electrical MFB stimulation (red line) in an anaesthetized rat. b, Current–voltage plot from the 5 scans during the 0.5-s period after MFB stimulation. The large increase in current around 0.6 V (dotted line, peak) corresponds to the dopamine redox potential measured in vitro. c, Time course of current at the dopamine redox potential around electrical MFB stimulation (red line). d–f, Plots, as in a–c, illustrating dopamine response in the same rat, now awake, to room lights being turned on. g, Current changes averaged over all trials (n = 4,418 trials) in which ramping occurred (see Methods) during T-maze running. h, Average current–voltage plot for all identified ramping trials for the time period (−3 to −2 s) indicated by the brackets in the colour plot in g. i, Average current changes induced by tonic MFB stimulation in anaesthetized rats (n = 3). j, Average current–voltage plot from the bracketed time range (10 to 11 s) in i following the onset of stimulation.

Extended Data Figure 3 Patterns of dopamine signals in the VMS and DLS.

a, b, Dopamine concentration recorded in the VMS (a, n = 9 rats) and in the DLS (b, n = 8 rats). Data were first averaged across trials to yield session average traces for each probe in each session. These traces were averaged within rat to obtain one average trace per rat, and the per-rat traces were then averaged across rats. These plots differ from those in Fig. 1e, f, which considered session averages for each probe to be an independent measure. Shading represents s.e.m. across rats. c, d, Distribution of average peak dopamine values for all recordings in VMS (c) and DLS (d). Each colour corresponds to an average peak dopamine concentration measured by a single probe in different sessions. e, f, Proportion of trial-averaged dopamine recordings in the VMS (e, out of 300) and DLS (f, out of 262) that displayed a positive (blue) or negative (grey) ramping response during maze running, and an unclassified dopamine profile (red). g, h, Average dopamine concentration in the VMS (g) and DLS (h) for the positive ramping traces (top), the negative ramping traces (middle), and the unclassified traces (bottom). Shading represents s.e.m.
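
The nested averaging described for panels a and b (session averages, then one trace per rat, then an average across rats) can be illustrated with a minimal Python sketch. The function names, the `(rat_id, trace)` input layout and the toy values are assumptions made for illustration; they are not taken from the authors' analysis code.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def average_traces(traces):
    """Pointwise mean of equal-length traces."""
    return [mean(samples) for samples in zip(*traces)]


def across_rat_average(session_traces):
    """Average session traces within rat, then across rats.

    session_traces: list of (rat_id, trace) pairs, one pair per probe per
    session (hypothetical layout). Needs at least two rats for the s.e.m.
    Returns (grand_mean, sem), each with one value per time point.
    """
    by_rat = defaultdict(list)
    for rat_id, trace in session_traces:
        by_rat[rat_id].append(trace)
    # One average trace per rat, so every rat contributes equally.
    rat_means = [average_traces(traces) for traces in by_rat.values()]
    grand_mean = average_traces(rat_means)
    n_rats = len(rat_means)
    # s.e.m. across rats: sample standard deviation / sqrt(number of rats).
    sem = [stdev(samples) / sqrt(n_rats) for samples in zip(*rat_means)]
    return grand_mean, sem


# Toy data: rat "r1" has two session traces, rat "r2" has one.
toy = [("r1", [0.0, 2.0]), ("r1", [2.0, 4.0]), ("r2", [3.0, 5.0])]
grand_mean, sem = across_rat_average(toy)
```

With this toy input, pooling the three session traces as independent measures would weight rat "r1" twice, whereas the per-rat scheme gives each animal one vote; that is the distinction the legend draws between these panels and Fig. 1e, f.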

Extended Data Figure 4 Transient responses to cues and rewards occur during the T-maze task performance and are distinct from the sustained ramping response.

a, Dopamine concentration in a representative trial that included both a phasic response to warning click and a sustained ramping response to goal-reaching. b, c, Average normalized dopamine traces from VMS (b) and DLS (c) probes from all trials (n = 890 and 640, respectively) that showed identified transients after warning click (see Methods). In the VMS, note the sharp increase in dopamine around warning click superimposed on the ramping response that followed the phasic click response. Shading represents s.e.m. calculated across trials. d, Model for dopamine release profiles in the T-maze task. Sharper transient responses are present at the start of maze running (red) and after goal-reaching (cyan). These responses can be superimposed on and modulated independently of the slower ramping signal related to goal proximity (dark blue).

Extended Data Figure 5 Dopamine signals in response to free reward outside of the task.

a, Average peak dopamine to unexpected chocolate milk delivery outside the task is positively correlated with peak ramping dopamine measured from the same probes during preceding behavioural training in the maze (n = 146 sessions; Pearson’s R = 0.45, P < 0.0001). b, Average peak dopamine concentration induced by unexpected free reward outside the task (blue) and peak amplitude of dopamine ramping during maze performance just before reward (red; paired t-test, P < 0.001).
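
For readers unfamiliar with the two statistics named in this legend, the following sketch computes a Pearson correlation coefficient and a paired t statistic in plain Python. It shows only the textbook formulas; the function names are illustrative, the numbers in the usage lines are made up, and nothing here reproduces the reported values.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def paired_t(x, y):
    """Paired t statistic: mean difference over its standard error."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical per-session peaks (arbitrary units), for illustration only.
free_reward_peak = [4.0, 6.0, 9.0, 7.0]
ramp_peak = [2.0, 3.0, 5.0, 3.5]
r = pearson_r(free_reward_peak, ramp_peak)
t = paired_t(free_reward_peak, ramp_peak)
```

Converting `t` to a P value requires the t distribution with n − 1 degrees of freedom, which is left to a statistics package.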

Extended Data Figure 6 Peak ramping dopamine values do not vary as a function of trial time.

a, Dopamine release modelled as a summation of four weighted transients in response to fixed maze events on short trials (purple) and long trials (orange). The overlap of the transients is reduced on the long trials, resulting in a lower peak dopamine level at the end of the maze run. Thick lines indicate overall average dopamine, and thin lines indicate the averages of each of the 4 transients across 100 simulation runs. b, Relative predicted peak dopamine levels on short and long trials calculated as a linear decay function of trial duration for the simulated model shown in a (black), for the spatial proximity model (light blue), and for the actual experimental data (dark blue). c, An alternative multi-transient model in which the transients (3, inset) are heavily weighted towards the goal location, are highly variable in their time of occurrence, and display a long decay time-course. In this model, the difference between short and long trials is within the noise range of the data. The average of individual transients (inset) across multiple simulations is a smeared version of the single transients that is weighted towards the goal location. d, Data plotted as in b, for the alternative multi-transient model. e, f, Average run speed (e) and acceleration (f) during short (purple) and long (orange) trials, as shown in Fig. 2, for all animals and sessions.
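
The reasoning behind panel a, that transients tied to fixed maze events overlap less when a run takes longer, can be sketched in a few lines of Python. Everything numerical below is an assumption chosen for illustration: the exponential kernel, the decay constant `tau`, the event positions and the weights are not the parameters used in the paper, and the sketch is deterministic rather than averaged over 100 noisy simulation runs.

```python
import math


def summed_transients(duration, fractions=(0.25, 0.5, 0.75, 1.0),
                      weights=(0.25, 0.5, 0.75, 1.0), tau=1.0, dt=0.01):
    """Sum of exponentially decaying transients at fixed maze positions.

    Event onsets are fixed fractions of the run, so a longer run spreads
    the same four events further apart in time. All values are illustrative.
    """
    n_steps = int(round(duration / dt))
    onsets = [int(round(frac * n_steps)) for frac in fractions]
    trace = []
    for i in range(n_steps + 1):
        total = 0.0
        for onset, w in zip(onsets, weights):
            if i >= onset:
                # Transient jumps to w at onset, then decays with constant tau.
                total += w * math.exp(-(i - onset) * dt / tau)
        trace.append(total)
    return trace


short_trial = summed_transients(duration=2.0)
long_trial = summed_transients(duration=6.0)
peak_short, peak_long = max(short_trial), max(long_trial)
```

Under these assumed parameters the summed signal peaks at the end of the run, and the peak is lower on the long trial because earlier transients have decayed further by then. That dependence on trial duration is the prediction that panel b compares against the spatial proximity model and the experimental data.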

Extended Data Figure 7 Examples of trials with pausing or exploration near the choice point of the T-maze.

a–c, Video tracker traces (a), relative proximity to reward (b), and dopamine concentration (c) measured during a single trial in which a rat paused near the choice point. d–f, Video tracker traces (d), relative proximity to reward (e) and dopamine concentration (f) measured during another trial from a different rat.

Extended Data Figure 8 Value manipulation experiments in the T-maze, M-maze, and S-maze for all rats.

a–c, Average peak dopamine levels for M-maze sessions for three individual rats (M31, M36, and M47) in the left (blue) and right (red) end-arms. Blue and red shading indicates sessions in which left and right arm contained the larger reward, respectively. Error bars indicate s.e.m. d, e, Average peak dopamine levels, as in a–c, for value bias T-maze sessions for two rats (M36 and M47). f, g, Average peak dopamine levels, as in a–c, for value bias S-maze sessions for 2 rats (M47 and M35). h–j, Average normalized dopamine levels measured in the high reward arm (light green) and low reward arm (dark green) as rats performed the M-maze (h), T-maze (i) and S-maze (j) tasks. k, l, Dopamine concentration relative to left (k) and right (l) goal-reaching during the first session following a reversal of reward values (session 17 of M47, indicated by an asterisk in c).

Extended Data Figure 9 Dopamine signals recorded in two rats trained on the S-maze task.

a, Average normalized dopamine concentrations measured from VMS probes in one rat (n = 5 sessions) performing the S-maze task as in Fig. 3j. Light green line indicates runs to the higher reward goal, and dark green lines to the lower reward goal. Shading indicates s.e.m. Red vertical lines indicate turns. b, Traces, as in a, for the second rat trained on the S-maze task (n = 4 sessions).

Extended Data Figure 10 Dopamine end-arm preferences become positively correlated with arm choices with training, but ramping dopamine signals do not change significantly with performance improvement.

a, b, Distribution of selectivity indices, as in Fig. 4d, for all probes implanted in the left (a, n = 5 rats) and right (b, n = 3 rats) hemispheres. Note the bias in both groups of the selectivity preference towards negative selectivity indices (right bias, red) relative to the shuffled data (blue). c–e, Biases in average run time (c), percentage of correct responses (d), and arm choices (e) across training blocks. Negative values indicate biases towards the right end-arm. f, Raw average dopamine selectivity indices across training blocks. Note emergence of right bias with training. g, Correlation coefficients (Pearson’s R) computed for each training block between arm choice selectivity indices and dopamine selectivity indices. Error bars indicate confidence limits of the correlations. h, Normalized peak magnitudes of dopamine signals averaged in a 0.5-s window before goal-reaching in sessions with significant (Mann–Whitney U-test, P < 0.05) pre-goal increases. Data are averaged across rats for each training block (for which, left to right, n = 48, 101, 113 and 179 trial averaged recordings, respectively). i, Average (± s.e.m.) dopamine concentration from ramping dopamine sessions in which percentage of correct trials fell above (red, n = 179) or below (blue, n = 92) the learning criterion for T-maze task acquisition (72.5% correct, chi-square test, P < 0.05). j, Average (± s.e.m.) peak dopamine levels from the sessions plotted in i, showing no significant difference between pre- and post-learning periods (t-test, P = 0.44).
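
The comparison of selectivity indices against shuffled data in panels a and b can be illustrated as follows. The index definition is not given in this legend, so the form (L − R)/(L + R) is an assumption, chosen only because it makes negative values correspond to a right-arm bias as the legend states. The label-shuffling scheme, function names and toy values are likewise hypothetical, and the sketch assumes positive peak values.

```python
import random


def selectivity_index(left_peaks, right_peaks):
    """Assumed index (L - R) / (L + R) on mean peaks; negative = right bias."""
    left = sum(left_peaks) / len(left_peaks)
    right = sum(right_peaks) / len(right_peaks)
    return (left - right) / (left + right)


def shuffled_indices(left_peaks, right_peaks, n_shuffles=1000, seed=0):
    """Null distribution built by randomly reassigning arm labels."""
    rng = random.Random(seed)
    pooled = list(left_peaks) + list(right_peaks)
    n_left = len(left_peaks)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        null.append(selectivity_index(pooled[:n_left], pooled[n_left:]))
    return null


# Toy peaks (arbitrary units): larger signals on right-arm runs.
observed = selectivity_index([1.0, 1.0], [3.0, 3.0])
null = shuffled_indices([1.0, 1.0], [3.0, 3.0], n_shuffles=50)
```

An observed index that falls in the tail of the shuffled distribution indicates an arm preference stronger than expected from arbitrary label assignment.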

Supplementary information

Supplementary Information

This file contains a Supplementary Discussion and additional references. (PDF 265 kb)

About this article

Cite this article

Howe, M., Tierney, P., Sandberg, S. et al. Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500, 575–579 (2013). https://doi.org/10.1038/nature12475

Received: 25 January 2013
Accepted: 17 July 2013
Published: 04 August 2013
Issue Date: 29 August 2013
DOI: https://doi.org/10.1038/nature12475
