Elsevier

NeuroImage

Volume 57, Issue 2, 15 July 2011, Pages 617-623
Continuous theta-burst stimulation (cTBS) over the lateral prefrontal cortex alters reinforcement learning bias

https://doi.org/10.1016/j.neuroimage.2011.04.038

Abstract

The prefrontal cortex is known to play a key role in higher-order cognitive functions. Recently, we showed that this brain region is active in reinforcement learning, during which subjects constantly have to integrate trial outcomes in order to optimize performance. To further elucidate the role of the dorsolateral prefrontal cortex (DLPFC) in reinforcement learning, we applied continuous theta-burst stimulation (cTBS) to the left DLPFC, the right DLPFC, or the vertex (as a control region) prior to the performance of a probabilistic learning task in an fMRI environment. While there was no influence of cTBS on learning performance per se, we observed a stimulation-dependent modulation of reward vs. punishment sensitivity: left-hemispherical DLPFC stimulation led to more reward-guided performance, while right-hemispherical cTBS induced more avoidance-guided behavior. fMRI results showed enhanced prediction error coding in the ventral striatum in subjects stimulated over the left as compared to the right DLPFC. Both behavioral and imaging results are in line with recent findings that left-, but not right-hemispherical, stimulation can trigger a release of dopamine in the ventral striatum, which has been suggested to increase the relative impact of rewards rather than punishment on behavior.

Research highlights

► Lateral prefrontal cTBS changes reward and punishment sensitivity.
► Opposite effects of left vs. right prefrontal cTBS.
► Enhanced striatal (reward) prediction error coding after left-hemispheric stimulation.

Introduction

Learning from trial and error is one of the most fundamental mechanisms for acquiring both explicit and implicit knowledge of the world and has been studied in a broad range of species. In mammals, the basal ganglia seem to play an essential role in enhancing actions that proved likely to result in desirable outcomes (Graybiel, 2005). Dopaminergic neurons in the ventral tegmental area (VTA) and the pars compacta of the substantia nigra (SNc) project to the basal ganglia and respond with increases and decreases in their firing rate as a consequence of rewarding and aversive stimuli, respectively (Ungless et al., 2004). Striatal dopamine (DA) plays a crucial role in synaptic plasticity (Shen et al., 2008); it is believed to improve behavioral outcome by both strengthening circuits implicated in successful actions and suppressing interfering and inefficient signaling. However, the firing rate of dopaminergic neurons is not simply sensitive to the overall reward (or punishment). Rather, it codes for the relative outcome in light of the anticipation that is generated on the basis of previous experience. The firing rate of these neurons increases only if the result is better than expected (i.e., in the case of a positive reward prediction error). In contrast, outcomes that do not meet expectations (a negative reward prediction error) decrease their activity (Schultz et al., 1997). Healthy individuals will normally learn to both repeat beneficial actions and avoid adverse actions in a balanced manner. However, a study with patients suffering from Parkinson's disease (PD) showed that, under DA depletion (off-medication), learning was mainly driven by avoiding detrimental choices. Conversely, after L-DOPA medication, leading to unphysiologically high DA concentration within the ventral striatum, learning was dominated by a preference for the rewarded options (Frank et al., 2004).

In two recent studies, we found that genetically determined differences in DA-D2-receptor density predicted the degree of approach vs. avoidance learning in a healthy population (Klein et al., 2007b), as well as the ability to maintain new action-reward associations in a reversal learning task (Jocham et al., 2009). Furthermore, we found that low-dose D2 receptor antagonism (presumed to increase striatal DA release) enhanced both learning from rewarding outcomes and striatal prediction error coding (Jocham et al., 2011). Therefore, it seems likely that (over-)efficient dopaminergic signaling biases organisms toward learning from rewards, rather than punishments.
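
The notion of a signed reward prediction error, and of a bias toward learning from rewards rather than punishments, can be illustrated with a simple delta-rule learner that uses separate learning rates for positive and negative prediction errors. The following is a minimal sketch only: the function names, the learning rates, the starting value of 0.5 and the 50% reward probability are illustrative assumptions and are not taken from the study or from its computational model.

```python
import random


def update_value(value, outcome, alpha_gain, alpha_loss):
    """Apply one delta-rule update and return (new_value, prediction_error).

    outcome is 1.0 for a rewarded trial and 0.0 for an unrewarded trial.
    alpha_gain and alpha_loss are hypothetical learning rates applied to
    positive and negative prediction errors, respectively.
    """
    delta = outcome - value  # reward prediction error: outcome minus expectation
    alpha = alpha_gain if delta > 0 else alpha_loss
    return value + alpha * delta, delta


def simulate(p_reward, n_trials, alpha_gain, alpha_loss, seed=0):
    """Return the learned value of one option after n_trials of feedback."""
    rng = random.Random(seed)  # fixed seed so both learners see the same outcomes
    value = 0.5                # assumed neutral starting expectation
    for _ in range(n_trials):
        outcome = 1.0 if rng.random() < p_reward else 0.0
        value, _ = update_value(value, outcome, alpha_gain, alpha_loss)
    return value


if __name__ == "__main__":
    # Same outcome sequence, opposite learning-rate asymmetry (assumed values).
    reward_biased = simulate(0.5, 200, alpha_gain=0.3, alpha_loss=0.1)
    punishment_biased = simulate(0.5, 200, alpha_gain=0.1, alpha_loss=0.3)
    print(f"reward-biased learner:     {reward_biased:.2f}")
    print(f"punishment-biased learner: {punishment_biased:.2f}")
```

Under these assumptions, a learner that weights positive prediction errors more heavily ends up valuing the same option more highly than a learner that weights negative prediction errors more heavily, which is one simple way to picture an approach vs. avoidance learning bias.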

Recently, several studies have shown that repetitive transcranial magnetic stimulation (rTMS) applied to the prefrontal cortex can trigger the release of dopamine in both the striatum (Ko et al., 2008, Pogarell et al., 2007, Pogarell et al., 2006, Strafella et al., 2001) and in distant prefrontal areas (Cho and Strafella, 2009).

A surprising finding has been that stimulation of the left but not the right dorsolateral prefrontal cortex (DLPFC) leads to increased striatal DA (Strafella et al., 2001, Pogarell et al., 2007, Pogarell et al., 2006) in studies using both positron emission tomography (PET) and single photon emission computed tomography (SPECT). The magnitude of DA release was comparable to chemical challenge with amphetamine (Pogarell et al., 2007) and paralleled by impaired performance in the Montréal Card Sorting Test (MCST), a task requiring cognitive set-shifting (Ko et al., 2008). So far, the exact mechanism by which a supposedly inhibitory stimulation of the left prefrontal cortex would lead to a release of DA in the striatum is unknown.

Here, we investigate the interaction of functional changes in the DLPFC induced by continuous theta-burst (cTBS) transcranial magnetic stimulation with subjects' performance in a probabilistic learning paradigm. Healthy volunteers received cTBS to the left or right DLPFC or to a control region. Immediately after stimulation, they performed an adapted version of the probabilistic learning task originally published by Frank et al. (2004) in a functional magnetic resonance imaging (fMRI) environment. To avoid across-session learning effects, a between-subjects design was chosen.

Assuming that stimulation to the left DLPFC would lead to increased dopamine release in the basal ganglia, we hypothesized a shift of learning strategy toward a more reward-guided behavior as compared to a control group. However, since our target region is also implicated in working memory (Amiez and Petrides, 2007, Wager and Smith, 2003), an overall detrimental effect on learning would also be conceivable, supposing that subjects were impaired in recalling and updating reward history. A second group received stimulation to the right DLPFC to control for discomfort of cTBS. According to the literature, there should be no effect on dopaminergic signaling and therefore no shift in learning strategy. Yet, some authors do attribute an important role in avoidance behavior to the right DLPFC by establishing negative stimulus-response contingencies, especially under circumstances of ambiguity (Christakou et al., 2009), so that an effect of right-hemispherical stimulation on learning behavior could not be excluded. Therefore, a third group received stimulation to the vertex, which was expected to have no effect at all.

Participants

A total of 47 subjects between 20 and 30 years of age participated in the study. All were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971) and free of neurological or psychiatric illnesses. Subjects gave informed written consent and the experimental protocol was approved by the local ethics committee. One subject had to be excluded for medical reasons, 3 did not perform the task properly (performance at chance level) and, finally, one subject had to be

Training phase

All subjects understood the probabilistic nature of the paradigm and learned it sufficiently well, as can be seen from performance on the AB-trials in the training phase (mean percent choose A: 92% ± 1.5 (SEM); no group difference: ANOVA, F(2, 39) = .82, p = .45). Performance at choosing the better symbol developed over time. Repeated-measures ANOVA yielded a main effect of bin (F(14, 1638) = 19.7, p < .001, Fig. 2A). No influence of cTBS or a cTBS × bin interaction could be observed (all p > .32). There

Discussion

In the present study, we applied cTBS to the prefrontal cortex, which had been shown to increase dopamine release in the ventral striatum (Ko et al., 2008, Strafella et al., 2001). While having no effect on overall learning efficiency, stimulation induced changes in the strategy for achieving optimal performance: after left-dorsolateral prefrontal cTBS, volunteers exhibited a significant bias toward approach learning, whereas right-hemispherical stimulation resulted in a trend toward avoidance

Acknowledgments

We thank Berit Streubel, Ramona Menger, Anke Kummer, Annett Wiedemann, and Simone Wipper for their indispensable help in data collection. JN's work was supported by the Federal Ministry of Education and Research (BMBF), Germany, FKZ: 01EO1001.

GJ's work was supported by the German Research Foundation (DFG), Germany, JO 787/1-1.
