Continuous theta-burst stimulation (cTBS) over the lateral prefrontal cortex alters reinforcement learning bias
Research highlights
► Lateral prefrontal cTBS changes reward and punishment sensitivity.
► Opposite effects of left vs. right prefrontal cTBS.
► Enhanced striatal (reward) prediction error coding after left-hemispheric stimulation.
Introduction
Learning from trial and error is one of the most fundamental mechanisms for acquiring both explicit and implicit knowledge of the world and has been studied in a broad range of species. In mammals, the basal ganglia seem to play an essential role in reinforcing actions that have proved likely to result in desirable outcomes (Graybiel, 2005). Dopaminergic neurons in the ventral tegmental area (VTA) and the pars compacta of the substantia nigra (SNc) project to the basal ganglia and respond to rewarding and aversive stimuli with increases and decreases in their firing rate, respectively (Ungless et al., 2004). Striatal dopamine (DA) plays a crucial role in synaptic plasticity (Shen et al., 2008); it is believed to improve behavioral outcome both by strengthening circuits implicated in successful actions and by suppressing interfering and inefficient signaling. However, the firing rate of dopaminergic neurons is not simply sensitive to the overall reward (or punishment). Rather, it codes the outcome relative to the expectation generated on the basis of previous experience. The firing rate of these neurons increases only if the result is better than expected (i.e., in the case of a positive reward prediction error). In contrast, outcomes that do not meet expectations (a negative reward prediction error) decrease their activity (Schultz et al., 1997). Healthy individuals normally learn both to repeat beneficial actions and to avoid adverse actions in a balanced manner. However, a study of patients with Parkinson's disease (PD) showed that, under DA depletion (off medication), learning was mainly driven by avoiding detrimental choices. Conversely, after L-DOPA medication, which leads to unphysiologically high DA concentrations within the ventral striatum, learning was dominated by a preference for the rewarded options (Frank et al., 2004).
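The notion of a signed reward prediction error described above can be illustrated with a minimal sketch. It assumes a simple delta-rule (Rescorla-Wagner style) update, which is a common formalization but is not stated in the text; the function names and the learning rates `alpha_gain` and `alpha_loss` are hypothetical. Setting the two rates to different values would be one way to express a bias toward learning from better-than-expected or worse-than-expected outcomes.

```python
def prediction_error(reward, expected):
    """Signed reward prediction error: outcome minus expectation."""
    return reward - expected


def update_value(expected, reward, alpha_gain=0.3, alpha_loss=0.3):
    """Move the expectation toward the outcome by a fraction of the error.

    alpha_gain applies to better-than-expected outcomes, alpha_loss to
    worse-than-expected ones (both are assumed, illustrative values).
    """
    delta = prediction_error(reward, expected)
    alpha = alpha_gain if delta >= 0 else alpha_loss
    return expected + alpha * delta


# A cue that is rewarded on every trial: the error stays positive but
# shrinks as the expectation catches up with the outcome.
value = 0.0
errors = []
for _ in range(5):
    errors.append(prediction_error(1.0, value))
    value = update_value(value, 1.0)

# Omitting a reward that is now expected yields a negative prediction error.
omission_error = prediction_error(0.0, value)
```

In this sketch a fully predicted reward produces an error of zero, which mirrors the point that the signal reflects the outcome relative to the expectation rather than the reward itself.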
In two recent studies, we found that genetically determined differences in DA-D2-receptor density predicted the degree of approach vs. avoidance learning in a healthy population (Klein et al., 2007b), as well as the ability to maintain new action-reward associations in a reversal learning task (Jocham et al., 2009). Furthermore, we found that low-dose D2 receptor antagonism (presumed to increase striatal DA release) enhanced both learning from rewarding outcomes and striatal prediction error coding (Jocham et al., 2011). Therefore, it seems likely that (over-) efficient dopaminergic signaling biases organisms toward learning from rewards rather than from punishments.
Recently, several studies have shown that repetitive transcranial magnetic stimulation (rTMS) applied to the prefrontal cortex can trigger the release of dopamine in both the striatum (Ko et al., 2008, Pogarell et al., 2007, Pogarell et al., 2006, Strafella et al., 2001) and in distant prefrontal areas (Cho and Strafella, 2009).
A surprising finding from studies using positron emission tomography (PET) and single photon emission computed tomography (SPECT) has been that stimulation of the left, but not the right, dorsolateral prefrontal cortex (DLPFC) leads to increased striatal DA (Strafella et al., 2001, Pogarell et al., 2007, Pogarell et al., 2006). The magnitude of DA release was comparable to that of a chemical challenge with amphetamine (Pogarell et al., 2007) and was paralleled by impaired performance in the Montréal Card Sorting Test (MCST), a task requiring cognitive set-shifting (Ko et al., 2008). So far, the exact mechanism by which a supposedly inhibitory stimulation of the left prefrontal cortex would lead to a release of DA in the striatum is unknown.
Here, we investigate how functional changes in the DLPFC induced by continuous theta-burst stimulation (cTBS), a form of transcranial magnetic stimulation, interact with subjects' performance in a probabilistic learning paradigm. Healthy volunteers received cTBS to the left or right DLPFC or to a control region. Immediately after stimulation, they performed an adapted version of the probabilistic learning task originally published by Frank et al. (2004) in a functional magnetic resonance imaging (fMRI) environment. To avoid across-session learning effects, a between-subject design was chosen.
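To make the idea of a probabilistic learning paradigm concrete, the following is a rough simulation sketch of a single symbol pair. Everything specific in it is an assumption for illustration: the reward probabilities (0.8 for A, 0.2 for B), the softmax choice rule, the parameters `alpha` and `beta`, and all function names are hypothetical and are not taken from the adapted task used in the study.

```python
import math
import random

# Assumed reward probabilities for one symbol pair (hypothetical values).
P_REWARD = {"A": 0.8, "B": 0.2}


def feedback(symbol, u):
    """Return 1 (positive feedback) if the uniform draw u falls below p."""
    return 1 if u < P_REWARD[symbol] else 0


def p_choose_a(q, beta=5.0):
    """Softmax (logistic) probability of choosing A over B."""
    return 1.0 / (1.0 + math.exp(-beta * (q["A"] - q["B"])))


def simulate(n_trials=200, alpha=0.2, seed=0):
    """Run a simple value learner on AB trials; return the list of choices."""
    rng = random.Random(seed)
    q = {"A": 0.5, "B": 0.5}  # initial expectations, assumed neutral
    choices = []
    for _ in range(n_trials):
        choice = "A" if rng.random() < p_choose_a(q) else "B"
        outcome = feedback(choice, rng.random())
        # Delta-rule update of the chosen symbol only.
        q[choice] += alpha * (outcome - q[choice])
        choices.append(choice)
    return choices


choices = simulate()
fraction_a = choices.count("A") / len(choices)
print(f"chose A on {fraction_a:.0%} of trials")
```

Because feedback is probabilistic, even the better symbol is sometimes followed by negative feedback, so a learner has to integrate outcomes over many trials rather than rely on any single one.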
Assuming that stimulation of the left DLPFC would lead to increased dopamine release in the basal ganglia, we hypothesized a shift of learning strategy toward more reward-guided behavior as compared to a control group. However, since our target region is also implicated in working memory (Amiez and Petrides, 2007, Wager and Smith, 2003), an overall detrimental effect on learning would also be conceivable if subjects were impaired in recalling and updating reward history. A second group received stimulation of the right DLPFC to control for the discomfort of cTBS. According to the literature, this should have no effect on dopaminergic signaling and therefore cause no shift in learning strategy. Yet, some authors do attribute an important role in avoidance behavior to the right DLPFC by establishing negative stimulus-response contingencies, especially under circumstances of ambiguity (Christakou et al., 2009), so that an effect of right-hemispheric stimulation on learning behavior could not be excluded. Therefore, a third group received stimulation at the vertex, which was expected to have no effect at all.
Participants
A total of 47 subjects between 20 and 30 years of age participated in the study. All were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971) and free of neurological or psychiatric illnesses. Subjects gave informed written consent, and the experimental protocol was approved by the local ethics committee. One subject had to be excluded for medical reasons, three did not perform the task properly (performance at chance level) and, finally, one subject had to be
Training phase
All subjects understood the probabilistic nature of the paradigm and learned it sufficiently well, as can be seen from performance on the AB trials in the training phase (mean percent choose A: 92% ± 1.5 (SEM); no group difference: ANOVA, F(2, 39) = .82, p = .45). Performance at choosing the better symbol developed over time. Repeated-measures ANOVA yielded a main effect of bin (F(14, 1638) = 19.7, p < .001, Fig. 2A). Neither an influence of cTBS nor a cTBS × bin interaction could be observed (all p > .32). There
Discussion
In the present study, we applied cTBS to the prefrontal cortex, which had been shown to increase dopamine release in the ventral striatum (Ko et al., 2008, Strafella et al., 2001). While having no effect on overall learning efficiency, stimulation induced changes in the strategy for achieving optimal performance: after left-dorsolateral prefrontal cTBS, volunteers exhibited a significant bias toward approach learning, whereas right-hemispherical stimulation resulted in a trend toward avoidance
Acknowledgments
We thank Berit Streubel, Ramona Menger, Anke Kummer, Annett Wiedemann, and Simone Wipper for their indispensable help in data collection. JN's work was supported by the Federal Ministry of Education and Research (BMBF), Germany, FKZ: 01EO1001.
GJ's work was supported by the German Research Foundation (DFG), Germany, JO 787/1-1.
References (33)
- et al. Continuous theta burst stimulation of right dorsolateral prefrontal cortex induces changes in impulsivity level. Brain Stimul. (2010)
- et al. Analysis of fMRI time-series revisited. Neuroimage (1995)
- et al. Event-related fMRI: characterizing differential responses. Neuroimage (1998)
- The basal ganglia: learning new tricks and loving it. Curr. Opin. Neurobiol. (2005)
- et al. Theta burst stimulation of the human motor cortex. Neuron (2005)
- et al. The after-effect of human theta burst stimulation is NMDA receptor dependent. Clin. Neurophysiol. (2007)
- et al. Neural correlates of error awareness. Neuroimage (2007)
- et al. LIPSIA—a new software system for the evaluation of functional magnetic resonance images of the human brain. Comput. Med. Imaging Graph. (2001)
- et al. Repetitive TMS over the human oculomotor cortex: comparison of 1-Hz and theta burst stimulation. Neurosci. Lett. (2006)
- The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia (1971)