Dopamine D2/D3 Receptors Play a Specific Role in the Reversal of a Learned Visual Discrimination in Monkeys

Lee, Buyean; Groman, Stephanie; London, Edythe D; Jentsch, James David

doi:10.1038/sj.npp.1301337

Original Article
Published: 14 February 2007

Buyean Lee¹,
Stephanie Groman¹,
Edythe D London¹,²,³ &
…
James David Jentsch³,⁴

Neuropsychopharmacology volume 32, pages 2125–2134 (2007)

Abstract

Converging evidence supports a role for mesocorticolimbic dopaminergic systems in a subject's ability to shift behavior in response to changing stimulus-reward contingencies. To characterize the dopaminergic mechanisms involved in this function, we quantified the effects of subtype-specific dopamine (DA) receptor antagonists on acquisition, retention, and reversal of a visual discrimination task in non-human primates (Chlorocebus aethiops sabaeus). We used a modified Wisconsin General Test Apparatus that was equipped with three food boxes, each fitted with a lid bearing a unique visual cue; one of the cues concealed a food reward, whereas the other two concealed an empty box. The monkeys were trained first to acquire a novel discrimination (eg A⁺, B⁻, C⁻) in a single session, before experiencing either a reversal of the discrimination (eg A⁻, B⁺, C⁻) or the acquisition of a completely new discrimination (eg D⁺, E⁻, F⁻), on the following day. Systemic administration of the D₂/D₃ receptor antagonist raclopride (0.001–0.03 mg/kg) failed to significantly affect the performance of reversal learning when reversal sessions were run without a retention session. However, raclopride (0.03 mg/kg) significantly impaired performance under the reversal condition when reversal sessions were run immediately after a retention session, whereas it did not affect acquisition of a novel visual discrimination. Specifically, raclopride significantly increased the number of reversal errors made before reaching the performance criterion in the reversal, but not in new learning sessions. In contrast, the D₁/D₅ receptor antagonist SCH 23390 did not significantly modulate acquisition of a novel discrimination or reversal learning at doses (0.001–0.03 mg/kg, i.m.) that did not suppress behavior generally. In addition, none of the drug treatments affected retention of a previously learned discrimination.
The results strongly suggest that D₂/D₃ receptors, but not D₁/D₅ receptors, selectively mediate reversal learning, without affecting the capacity to learn a new stimulus-reward association. These data support the hypothesis that phasic DA release, acting through D₂-like receptors, mediates behavioral flexibility.

INTRODUCTION

A reduced capacity to withhold, modify, or sustain adaptive behavior, when demanded by environmental circumstances, is a critical sequela of a number of psychiatric disorders, most notably schizophrenia and substance abuse. With respect to the latter, the transition from substance use to compulsive substance abuse is marked by a reduction of the rewarding properties of the drug and an increase in drug-associated risks and negative consequences (Robinson and Berridge, 2003; Everitt and Robbins, 2005). Despite this progressive change in the contingent consequences of drug-taking behavior, many individuals are unable to inhibit or change their behavior (in this case, their drug-taking). It is therefore of great interest to understand the mechanisms by which organisms can flexibly and dynamically inhibit and/or change behavior in response to shifts in feedback; these capacities are frequently included within the rubric of ‘executive functions.’

In laboratory animals and human subjects, the ability to modify behavior adaptively in response to dynamically changing stimulus-reward contingencies can be studied using reversal learning tasks. In these procedures, animals learn a novel discrimination (eg cue A is contingently associated with reward, whereas cues B and C are not associated or neutral) before experiencing a reversal (in which reinforcement is now associated with either cue B or C, but not A). Efficient reversal learning requires the ability to recognize the shift in contingency, to inhibit a learned response to the previously rewarded cue, to overcome learned irrelevance for the previously negative cues, and to acquire a new associative structure to guide response. By comparing performance of a subject during reversal to performance under conditions in which it acquires a completely new discrimination (rather than a reversal), it is possible to isolate the aptitude to overcome the prepotent behavior (response inhibition) and the learned irrelevance.

Dopamine (DA) is a catecholamine neurotransmitter that has been substantially implicated in cognitive and executive functions. Prefrontal regions of the cerebral cortex receive a particularly dense dopaminergic input from the ventral mesencephalic DA cell nuclei (Williams and Goldman-Rakic, 1993, 1998), although this input is minor in comparison with the dopaminergic innervation of the basal ganglia. Using a reversal learning task, Ridley et al (1981) originally reported that systemic administration of haloperidol, a potent DA receptor antagonist, impaired the ability to change behavior in response to contingency shifts, suggesting that reductions in DA transmission impair one form of behavioral flexibility. Experimentally induced dysregulation of central DA has also been shown to impair response inhibition, as assessed using an object retrieval/detour task (Jentsch et al, 1997; Lyons et al, 2000). Taken together, this indirect evidence suggests that mesocortical DA systems regulate adaptive responding in tasks such as reversal learning procedures.

Although a general involvement of DA systems in reversal learning is known, the specific receptor mechanisms and anatomical target locations underlying these effects are unclear. DA receptors can be categorized as belonging to either the D₁-like or the D₂-like families. The D₁-like DA receptors include the D₁ and D₅ subtypes, both of which are linked to activation of adenylyl cyclases. The D₂-like DA receptors comprise the D₂, D₃, and D₄ receptors; they all inhibit adenylyl cyclases (Kebabian and Calne, 1979). The results of Ridley et al (1981) support a strong involvement of the D₂-like receptors, as haloperidol has higher affinity for this subtype than the others. In addition, Floresco et al (2006) demonstrated that administration of the D₂/D₃ subtype selective antagonist, eticlopride, potently impaired the ability of rats to change their behavior in response to a conditional change of rule in a set-shifting task. More direct evidence in support of these results derives from a recent study of D₂ receptor gene knockout mice. Selective deletion of the D₂ receptor gene produced a dramatic deficit of performance in reversal learning of odor discrimination, although basic associative mechanisms underlying this behavior were also impaired (Kruzich and Grandy, 2004). In contrast, the role of the D₁-like receptors in behavioral flexibility is less clear. Although blockade of D₁-like receptors in the prefrontal cortex (PFC) impaired performance of a set-shifting task, stimulation of the D₁-like receptors in the PFC did not (Ragozzino, 2002; Floresco et al, 2006). However, it is unknown whether the D₁-like receptors play a role in reversal learning, although the D₁-like receptors are critically involved in other cognitive functions, such as working memory (Sawaguchi and Goldman-Rakic, 1991; Williams and Goldman-Rakic, 1995) and visual attention (Granon et al, 2000).
Therefore, DA receptor subtypes appear to contribute in very different ways to the spectrum of cognitive and executive functions.

Based upon these pieces of evidence, we set out to determine the relative involvement of the D₁-like and D₂-like receptors in reversal learning performance in Old World monkeys. We further sought to examine whether any observed drug effects were specific for reversal learning or whether they might also be seen in a new learning situation. To accomplish these goals, we administered the D₂/D₃ specific receptor antagonist raclopride to monkeys performing a visual discrimination and reversal learning task. Cognitive and pharmacological specificity of the effects of raclopride on reversal learning was investigated in two ways. First, we tested whether blockade of the D₂/D₃ receptors produces an impairment that depends upon the need to overcome the learned response or upon a general set of associative learning mechanisms. We hypothesized that D₂/D₃ receptor antagonism specifically impairs response inhibition, and therefore, that drug effects would be evident under reversal, but not new learning, conditions. Second, we tested whether SCH 23390, the D₁-like specific receptor antagonist, produces impairment similar to raclopride. We hypothesized that SCH 23390 would generally impair performance, affecting both reversal and new learning (Parkinson et al, 2002; Eyny and Horvitz, 2003).

MATERIALS AND METHODS

Subjects

Four adult male vervet monkeys (Chlorocebus aethiops sabaeus from the UCLA Vervet Research Colony), weighing 6.0–7.4 kg, were used in these studies. The subjects were 7–10 years of age at the time of testing. All animals were housed individually in a climate-controlled vivarium, where they had unlimited access to water and received a nutritionally balanced diet of monkey chow (Teklad®, Madison, WI) supplemented with fresh fruit. The monkeys received their full daily allotment of food (which was never reduced in order to support behavioral performance) immediately after morning testing (1/2 portion) and again at 1600–1700 hours (1/2 portion).

All animals were maintained in accordance with the ‘Guide for the Care and Use of Laboratory Animals’ of the Institute of Laboratory Animal Resources, National Research Council, Department of Health, Education and Welfare Publication No. (NIH) 85–23, revised 1996. Research protocols were approved by the UCLA Chancellor's Animal Research Committee.

General Methods

The modified Wisconsin General Test Apparatus (WGTA) consisted of an opaque screen (that could be raised and lowered) that separated the monkey from a tray fitted with three opaque square food boxes with hinged opaque lids. The tops and insides of the food boxes were fitted with distinctive, colored pictures (clip art from the Microsoft Office® library), and the monkeys were able to open the lids of food boxes easily to retrieve food rewards (bits of apple, grape, or peanuts) hidden within.

The monkeys, which had previously been trained to move voluntarily from their cages to a transport cart, were moved to a sound-attenuated testing room. In the testing room, the transport cart was aligned to the WGTA, so that monkeys could easily access the food boxes on the tray when the opaque screen was raised. The monkeys were presented with trials on which a screen was raised to reveal three food boxes with visually distinctive cues (pictures). They were allowed to open only one box lid per trial. Each trial lasted either until the animal opened a lid or 2 min passed, whichever came first. The inter-trial interval was approximately 15 s. For each set of three visual cues, the picture that was chosen to be the ‘positive’ stimulus was varied across the four subjects. The position of the food boxes varied pseudo-randomly across all trials. Each subject was tested between 0800 and 1030 hours on each day.

Raclopride (D₂/D₃) Experiments

Reversal learning without retention session

In an initial experiment, all monkeys were trained to acquire and reverse novel visual discriminations in a between-session design. For example, on day 1 (Mondays and Thursdays), the monkeys were presented with three visual cues that they had never seen before and were trained to learn that one cue was associated with reward, whereas the other two were not (discrimination acquisition session). The animals were required to select one cue per trial and to determine, solely by trial and error, which cue was associated with reward. The acquisition session continued until monkeys reached a predetermined performance criterion (at least seven correct responses in 10 consecutive trials) or had completed 80 trials. The following day (Tuesdays and Fridays), monkeys were tested in a condition in which the reward was associated with one of the two previously non-rewarded cues and the previously positive stimulus was not associated with reward (reversal session). The reversal session continued until monkeys reached criterion (seven selections of new positive stimulus in 10 consecutive trials) or up to 80 trials. The number of trials to reach criterion, the number of selections of negative stimuli (reversal errors), and the number of selections of the previously rewarded stimulus (perseverative errors) committed during the reversal session were the dependent measures.
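The stopping rule described above can be illustrated with a minimal Python sketch. It is not the authors' scoring procedure: the function name `trials_to_criterion` and its parameters are hypothetical, and the sketch assumes that the criterion is evaluated over a sliding window of the most recent trials and may be met before 10 trials have elapsed (for example, seven correct responses in the first seven trials).

```python
from typing import Optional, Sequence


def trials_to_criterion(
    outcomes: Sequence[bool],
    needed: int = 7,       # correct responses required (from the text)
    window: int = 10,      # consecutive trials considered (from the text)
    max_trials: int = 80,  # session cap (from the text)
) -> Optional[int]:
    """Return the 1-based trial on which criterion is first met, else None.

    Assumption (not stated in the source): the window slides one trial at a
    time and may be shorter than `window` early in the session.
    """
    capped = list(outcomes)[:max_trials]
    for end in range(1, len(capped) + 1):
        start = max(0, end - window)
        # Count correct responses in the most recent `window` trials.
        if sum(capped[start:end]) >= needed:
            return end
    return None


# Hypothetical session: True marks a correct (rewarded) choice.
session = [False, False, True, True, True, False, True, True, True, True]
print(trials_to_criterion(session))
```

The same function covers the retention criterion used later in the Methods (four correct responses in five consecutive trials) by passing `needed=4, window=5`.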

All subjects were treated via intra-muscular injection with raclopride-L-tartrate (Sigma, St Louis, MO; 0.001–0.03 mg/kg) or its vehicle 20 min before testing. The dose range for raclopride was chosen because it spans doses that have been shown to (1) impair working memory and paired associates learning in rhesus monkeys (Von Huben et al, 2006); (2) block the cognitive impairments caused by the D₂ agonist quinpirole in rhesus monkeys (Arnsten et al, 1995); and (3) attenuate cocaine self-administration at 0.03 mg/kg in monkeys (Campbell et al, 1999). The vehicle was sterile isotonic saline. Drug weights were calculated as the salt. The order of the drug treatment conditions was balanced according to a Latin Squares design across the subjects.
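As an illustration of how a Latin square can balance treatment order across four subjects, the following Python sketch builds a simple cyclic square. The source does not state which square was used, so the cyclic construction, the function name `cyclic_latin_square`, and the condition labels are all assumptions made for illustration; the intermediate doses are not given in the text and are represented here by placeholder labels only.

```python
from typing import List, Sequence


def cyclic_latin_square(conditions: Sequence[str]) -> List[List[str]]:
    """Build an n x n cyclic Latin square.

    Row i gives the treatment order for subject i; each condition appears
    exactly once in every row (subject) and every column (test position).
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


# Placeholder labels (hypothetical): vehicle plus three drug doses.
conditions = ["vehicle", "dose_low", "dose_mid", "dose_high"]
for subject, order in enumerate(cyclic_latin_square(conditions), start=1):
    print(f"subject {subject}: {order}")
```

A cyclic square equalizes how often each condition occupies each test position, but it does not balance first-order carry-over (which condition precedes which); a design balanced for carry-over would be needed if that were a concern.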

Reversal learning with a retention session

In follow-up experiments, we included a short retention session before reversal on the appropriate testing days (Tuesdays and Fridays). This modification accomplished two goals. First, because the difficulty in reversal may be related to the strength of the ‘memory’ of the previously learned discrimination, we wanted to ensure that any drug effects on reversal performance were not affected by changes in retention of the learned discrimination. Second, the retention session probably serves as a reminder of the previous discrimination and therefore increases prepotency. On Mondays and Thursdays, monkeys were trained on a novel discrimination as described above (acquisition session). On the following days, the monkeys were tested under the same stimulus-reward contingencies as encountered in the acquisition session (the previously positive stimulus was still rewarded). The retention session included a sufficient number of trials until the subjects achieved four correct responses out of five consecutive trials. Once monkeys reached criterion, the reversal was initiated, as described above. Testing was continued until the animals made seven correct choices in 10 consecutive trials. One week before raclopride tests, all monkeys participated in two reversal sessions with retention to expose them to this new condition.

The monkeys were treated via intra-muscular injections with raclopride (0.03 mg/kg) or its vehicle 20 min before retention/reversal sessions. The order of the treatment conditions was balanced across all subjects. For this follow-up study, only the highest dose of raclopride was tested.

Reversal vs new learning

Because reversal learning requires both inhibition of a prepotent response and the learning of a new stimulus-reward contingency, drug-induced impairments of reversal could reflect a basic defect in learning. We, therefore, sought to test the selectivity of the behavioral effects of raclopride by determining whether the drug affects the ability to learn a completely new stimulus-reward association. Monkeys were trained on a visual discrimination as described above (acquisition session) on Mondays and Thursdays. On the following days, they were tested for retention (precisely as described above); then, the monkeys were tested for either reversal of the previously learned discrimination or for acquisition of a completely new discrimination involving three pictures that they had never seen before.

Monkeys were treated i.m. with either raclopride (0.03 mg/kg) or its vehicle 20 min before the new learning or reversal sessions. With two drug conditions (saline vs raclopride) and two behavioral conditions (new learning vs reversal), we delivered four experimental sessions, the order of which was balanced across all subjects. Because raclopride impaired reversal learning performance with a retention session at the highest dose only (0.03 mg/kg), only this dose of raclopride was tested in the reversal vs new learning conditions.

SCH 23390 (D₁/D₅) Experiments

After finishing all tests with raclopride, the same subjects were tested to determine whether a D₁/D₅ receptor antagonist, with some affinity for 5-HT₂ receptors, SCH 23390 hydrochloride (St Louis, MO; 0.001–0.03 mg/kg) affects performance of reversal learning. The dose range of SCH 23390 was chosen because it spans doses that attenuate FG-7142-induced cognitive deficits (Murphy et al, 1996) and impair delayed response performance in monkeys (Arnsten et al, 1994). To accomplish this goal, we utilized the testing conditions wherein retention was followed by reversal. Monkeys were treated via intra-muscular injection with SCH 23390 20 min before the start of retention session. The order of the drug treatment conditions was balanced across all the subjects, using a Latin Square design. We also tested whether SCH 23390 (0.03 mg/kg) affected the ability to learn a completely new stimulus-reward association using the same procedure described in the Reversal vs new learning section of raclopride experiments. Because SCH 23390 did not impair reversal learning performance at any dose, the highest dose of SCH 23390 was selected to be tested in the reversal vs new learning conditions.

Data Analyses

The effects of drug treatment on performance of reversal learning were analyzed with paired t-test or one-way analysis of variance (ANOVA) followed by Dunnett's test. The effects of raclopride (0.03 mg/kg) and SCH 23390 (0.03 mg/kg) on reversal learning and new learning were initially analyzed with two-way ANOVA followed by a priori paired one-tailed t-tests to probe any significant main effects or interactions.
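For readers unfamiliar with the paired design, the following Python sketch computes a paired t statistic for four subjects, giving the t(3) form reported in the Results. The data values are invented for illustration and are not taken from the study; the function name `paired_t` is hypothetical, and the vehicle-minus-drug sign convention is an assumption. The critical values are the standard tabulated ones for 3 degrees of freedom at α=0.05.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple

# Standard critical values of Student's t for df = 3 at alpha = 0.05.
T_CRIT_DF3_TWO_TAILED = 3.182
T_CRIT_DF3_ONE_TAILED = 2.353


def paired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, df) for a paired t-test on within-subject differences a - b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Invented error counts for four subjects (NOT data from the study).
vehicle_errors = [10, 12, 9, 11]
drug_errors = [15, 18, 13, 19]

t, df = paired_t(vehicle_errors, drug_errors)
print(f"t({df}) = {t:.2f}; two-tailed significant: {abs(t) > T_CRIT_DF3_TWO_TAILED}")
```

With only four subjects the two-tailed threshold is high (|t| > 3.182), which is one reason small within-subject effects can fail to reach significance in a design of this size.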

RESULTS

In reversal learning without a retention session, the monkeys acquired the association between stimulus and reward within 15.6±2.6 trials (mean±SE). On the subsequent day of testing for reversal, the D₂/D₃ antagonist raclopride (0.001–0.01 mg/kg, i.m.) did not significantly affect the number of trials required to reach the preset performance criterion (F(3,9)=1.84, P=0.22), the number of reversal errors committed (F(3,9)=1.47, P=0.29), or the number of perseverative errors (F(3,9)=1.77, P=0.22) (Figure 1). In a follow-up experiment, the effects of saline were compared with those of a higher dose of raclopride, 0.03 mg/kg; again, raclopride did not significantly affect the number of trials to criterion (t(3)=−0.20, P=0.85), the total number of reversal errors (t(3)=−0.78, P=0.49), or the number of perseverative errors (t(3)=0.09, P=0.94) (Figure 1). Over the range of raclopride doses studied, the monkeys did not omit any trials, indicating that raclopride did not cause nonspecific behavioral disruption or suppression.

In reversal learning with a retention session, the monkeys were exposed to the same association between stimulus and reward used during the acquisition session before experiencing the reversal. Raclopride (0.03 mg/kg) had no measured effects on retention of the previously learned discrimination, as indicated by a lack of effect on the number of trials to reach criterion (t(3)=−0.29, P=0.79) or the number of retention errors (t(3)=−0.26, P=0.84) (data not shown). When the monkeys were treated with raclopride (0.03 mg/kg), they did exhibit a significant impairment of performance in reversal learning. Specifically, raclopride significantly increased the number of reversal errors (t(3)=−3.21, P<0.05) during reversal sessions (Figure 2). Although the number of trials required to reach the preset performance criterion and the number of perseverative errors showed a tendency to be increased by raclopride treatment, these effects did not reach statistical significance (Figure 2).

Because of the possibility that raclopride might impair the associative mechanisms required during reversal (instead of only affecting behavioral flexibility), we tested whether raclopride would impair performance when the monkeys simply encountered a new set of cues about which they had to learn (ie new learning). Two-by-two ANOVA revealed main effects of drug (F(1,3)=10.14, P<0.05) and type of learning (new vs reversal, F(1,3)=13.96, P<0.05) for trials to criterion. For the number of errors, there were main effects of both drug (F(1,3)=12.79, P<0.05) and type of learning (F(1,3)=18.36, P<0.05). Separate one-tailed paired t-tests confirmed that raclopride (0.03 mg/kg) impaired the performance of learning during reversal sessions; it significantly increased the number of trials to reach criterion, the number of reversal errors, and the number of perseverative errors (P<0.05; Figure 3). In contrast, raclopride had no measured effects on the ability of monkeys to learn a new stimulus-reward association, as indexed by the number of trials to criterion (t(3)=0.27, P=0.80) or the number of committed errors (t(3)=0.27, P=0.80).

The D₁/D₅ antagonist SCH 23390 (0.001–0.03 mg/kg, i.m.) did not significantly modulate performance of reversal learning with a retention session at any of the doses tested; there were no main effects of drug treatment in the number of trials to reach criterion (F(3,9)=0.82, P=0.51), the total number of reversal errors (F(3,9)=0.66, P=0.60), or the number of perseverative errors (F(3,9)=0.63, P=0.61) (Figure 4). We had planned to use a higher (0.1 mg/kg) dose of SCH 23390, but several of the monkeys were unable to perform at this dose because of bradykinesia and sedation.

The 0.03 mg/kg dose of SCH 23390 was also evaluated for its ability to impair the acquisition of a new stimulus-reward association—when a novel set of pictures was used. Two-way repeated measures ANOVA indicated significant main effects of type of learning in the number of trials to reach criterion (F(1,3)=28.17, P=0.01) and the number of errors (F(1,3)=26.44, P=0.01). However, there were no main effects of drug on the number of trials to reach criterion (F(1,3)=3.12, P=0.18) or on the number of errors committed (F(1,3)=2.61, P=0.21) (Figure 5).

DISCUSSION

In this study, blockade of the D₂/D₃ receptors, but not the D₁/D₅ receptors, increased the number of errors and the number of trials required to reach criterion under reversal conditions. The effect of raclopride on the acquisition of reversal was not due to a generalized impairment of the ability to learn associative relationships between stimuli and rewards, because raclopride did not impair the ability of the monkeys to learn a discrimination using novel stimuli. Our findings indicate that the D₂/D₃ receptors play a critical role in the ability of monkeys to shift behavior according to changing relationships between stimuli and reward. Notably, although SCH 23390, a D₁/D₅ receptor antagonist, produced impairment of spatial delayed response performance (a measure of working memory) in a prior study (Arnsten et al, 1994), this drug was without clear effect on reversal learning. These data suggest that there is a double-dissociation between D₁-like and D₂-like mechanisms in the regulation of working memory and executive-like functions of the frontostriatal system.

Behavioral Considerations

Visual discrimination and reversal learning have been used to measure the capacity to inhibit prepotent responses and to overcome previously learned irrelevance in non-human primates and humans. It is important to note that there are many factors that can directly modify the effects of pharmacological and lesion treatments on reversal learning. For example, differences in the apparatus (touch screen vs WGTA) and the discriminanda (objects vs pictures) have the potential to affect the sensitivity of reversal learning performance to lesions of frontal regions in monkeys (Crofts et al, 2001). Many studies incorporate two stimuli in each discrimination and have subjects perform serial reversals (A⁺/B⁻ → A⁻/B⁺ → A⁺/B⁻, etc.); some studies have reversals occurring within an experimental session, whereas others involve shifting across experimental sessions (with or without an intervening retention session). Owing to these differences in experimental procedures, it is frequently difficult to compare the results of the studies that investigated the effects of pharmacological treatments and lesions on the performance of visual discrimination and reversal.

We used three stimuli in our discriminations so that we could dissect perseverative from random errors in reversal. In a classic two-choice discrimination, an error is by definition perseverative; however, in a three-choice discrimination, we can separate the strictly perseverative errors from the type of errors that would occur during a random search strategy (equal numbers of perseverative and neutral errors). The usefulness of this approach in detecting stimulant drug-induced increases in perseverative responding in a discrimination reversal task has been demonstrated previously (Jentsch et al, 2002).
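The error taxonomy described above can be sketched in Python as follows. The function name `classify_reversal_choices` and the cue labels are hypothetical; following the Methods, every selection of a non-rewarded cue counts as a reversal error, and the subset directed at the previously rewarded cue counts as perseverative. The remaining errors are labeled neutral here.

```python
from collections import Counter
from typing import Dict, Sequence


def classify_reversal_choices(
    choices: Sequence[str], new_positive: str, old_positive: str
) -> Dict[str, int]:
    """Tally choices from a three-choice reversal session.

    - correct: selections of the newly rewarded cue
    - perseverative: selections of the previously rewarded cue
    - neutral: selections of the cue that was never rewarded
    - reversal_errors: perseverative + neutral (all unrewarded selections)
    """
    counts: Counter = Counter()
    for choice in choices:
        if choice == new_positive:
            counts["correct"] += 1
        elif choice == old_positive:
            counts["perseverative"] += 1
        else:
            counts["neutral"] += 1
    return {
        "correct": counts["correct"],
        "perseverative": counts["perseverative"],
        "neutral": counts["neutral"],
        "reversal_errors": counts["perseverative"] + counts["neutral"],
    }


# Hypothetical reversal from A+ to B+ (cue C was never rewarded).
tally = classify_reversal_choices(
    ["A", "A", "C", "B", "A", "B", "B"], new_positive="B", old_positive="A"
)
print(tally)
```

Under a purely random search among the two unrewarded cues, perseverative and neutral errors would be expected in roughly equal numbers, so an excess of perseverative over neutral errors is what signals a failure to inhibit the previously learned response.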

Raclopride impaired reversal but did not significantly increase strictly perseverative errors. Although this lack of a significant effect is almost certainly due to the limited statistical power of our study, we cannot assert beyond a reasonable doubt that raclopride specifically impairs the ability to inhibit perseverative tendencies. Nevertheless, both learning a discrimination for novel stimuli and reversal learning require the ability to hold ‘in mind’ information about previously visited stimuli and to inhibit the return to unrewarded cues; and both require the ability to focus attention on the visual features of individual stimuli. The difference in drug effects in reversal vs new discrimination learning, therefore, is unlikely to be explained by alterations in working memory or attention. The chief differences between these two learning conditions include the need to inhibit the prepotent response to the previously rewarded stimulus and the ability to overcome the learned irrelevance of previously unrewarded cues.

For the purposes of acute behavioral pharmacological studies, reversal within a single experimental session is problematic because the drug is generally given before the initial acquisition phase. Using a between-sessions approach allows for the pharmacological dissection of the acquisition and reversal components. However, the between-sessions approach depends upon the ability to demonstrate that the drug does not affect the retention of the previously learned discrimination (in principle, drugs could facilitate retention, leading to poorer reversal learning or, alternatively, impair retention, leading to ‘better’ reversal performance). In addition, testing for retention, before reversal, probably ‘primes’ the prepotent response in a manner that increases the difficulty associated with the shift.

In the current study, although raclopride dramatically impaired performance of reversal learning when there was a short retention session right before the start of the reversal session, it did not consistently impair performance of reversal learning when tested without a retention session. Similarly, Ridley et al (1981) showed that haloperidol produced a deficit of reversal learning in monkeys when they used a within-day visual discrimination and reversal task, which has a retention session right before the reversal session. In contrast, Smith et al (1999) reported no significant impairment of reversal learning in marmosets treated with either (−)sulpiride (5–10 mg/kg, i.m.) or raclopride (0.006–0.05 mg/kg, i.m.); as this study used an experimental design somewhat similar to our own, the discrepancy between these two studies is difficult to reconcile (species differences notwithstanding).

The difference in raclopride's effects on reversal performance, which appears to depend upon whether the switch immediately follows a retention session, is intriguing and certainly worthy of further study. There are several potential reasons for this. First, during retention, animals tend to make a large proportion of correct responses and obtain a fair number of rewards; therefore, it is possible that synaptic DA levels are different when reversal follows successful choices than when it occurs at the beginning of the session. Second, the retention session may engage certain striatal ensembles that code for particular approach responses, and the ‘priming’ of those ensembles that occurs during retention testing increases prepotency and exaggerates the ability to observe an impairment of reversal when response inhibition mechanisms are compromised by raclopride. Finally, there may be a difference in the memory system that holds the previously learned rule in the two conditions. When reversal occurs from the first trial of the reversal session, the previously learned rule is likely to be stored in and retrieved from a long-term memory mechanism, whereas because of within-session rewarded responding, it is more likely to be held in some sort of short-term or working memory system if reversal happens after the retention test. It is possible that D₂ antagonism does not modulate the former but does the latter.

Neuroanatomy of Reversal Learning

Studies in non-human primates have implicated a network of structures, including the orbitofrontal cortex, its ventral and medial striatal target zones, and the ascending monoaminergic systems, in the regulation of the ability to modify responses flexibly in reversal learning tasks (Butter, 1969; Iversen and Mishkin, 1970; Dias et al, 1996; Rolls et al, 1996; Clarke et al, 2004, 2007; Izquierdo et al, 2004). Butter (1969) first showed that damage to large sections of prefrontal regions could elicit perseveration in monkeys performing a discrimination reversal task. Later studies assigned difficulty with modifying behavior in response to changing reinforcement contingencies to the inferior regions of frontal lobe, namely the orbitofrontal cortex (Iversen and Mishkin, 1970; Rosenkilde, 1979; Dias et al, 1996; Rolls et al, 1996; Izquierdo et al, 2004). The ventrolateral prefrontal regions of the human brain, particularly on the right, are also strongly implicated in the ability to change responses successfully (Cools et al, 2002; Aron et al, 2004).

In the PFC, the modulation of circuitry by the indoleamine neurotransmitter serotonin appears to be critical, as serotonin depletion strongly impairs reversal learning (Clarke et al, 2004). At least in the marmoset performing two-choice serial reversals on a touch screen (see below), dopaminergic mechanisms within PFC do not appear to play a major role in the ability to shift response (Clarke et al, 2007). Nevertheless, human, monkey, and rodent behavioral pharmacological studies have clearly implicated dopaminergic mechanisms in reversal learning (Ridley et al, 1981; Mehta et al, 2001; Kruzich and Grandy, 2004; Cools et al, 2006). One possible resolution of these findings is that dopaminergic mechanisms regulate reversal learning by actions in the basal ganglia (Frank and Claus, 2006). Few studies have used selective manipulations of striatal circuitry, and dopaminergic mechanisms, in particular, to evaluate local mechanisms in control of reversal learning. Collins et al (2000) found that partial insults to striatal DA did not specifically impair reversal learning. Nevertheless, lesion studies in rats demonstrated important contributions of striatal circuitry, and its dopaminergic innervation, to response reversal (Taghzouti et al, 1985; Ferry et al, 2000). More direct investigations of the contribution of local dopaminergic mechanisms within the striatum to reversal learning are clearly required.

D₂-Like Receptor Mechanisms in Executive Function

Although the current study and others demonstrated that genetic deletion or blockade of the D₂/D₃ receptors (Ridley et al, 1981; Kruzich and Grandy, 2004) impairs the performance of reversal learning tasks, overstimulation of the D₂-like receptors can produce qualitatively similar results. The D₃-preferring agonist 7-OH-DPAT can degrade reversal learning in marmosets (Smith et al, 1999), and activation of the D₂-like receptors with the agonist bromocriptine specifically impairs reversal learning in healthy human volunteers (Mehta et al, 2001).

Why might activation or blockade of the D₂-like receptors produce qualitatively similar effects on reversal learning? It has recently been hypothesized that the D₂-like receptors, which exhibit a greater intra-synaptic localization, are affected by dramatic and rapid variations in DA efflux related to individual neuronal events (eg burst firing of DA cells; Durstewitz and Seamans, 2002; Bilder et al, 2004). Indeed, Durstewitz and Seamans (2002) hypothesized that the plasticity of forebrain networks underlies the ability to update the contents of working memory (ie to replace current cognitive representations with novel task-relevant information), and the ability to introduce flexibility into responses when the environment deviates from expectations or when current responses are recognized as erroneous (ie to switch from one behavioral program to another). If phasic, stimulus-evoked activations of dopaminergic neurons are required to update network operations, then D₂-like receptor agonists and antagonists should both impair reversal learning. Agonists of the D₂-like receptors, by activating both the somatodendritic and nerve terminal D₂-like receptors (autoreceptors), should terminate impulse activity in dopaminergic neurons (Cooper et al, 2003). DA D₂-like receptor antagonists should increase this phasic activity but will do so in the context of antagonism of many of the postsynaptic targets. Therefore, the similar cognitive effects of the D₂-like receptor agonists and antagonists may depend upon a net disruption of phasic activity in the mesocorticolimbic synapse.

The role of DA D₁-like receptors in behavioral flexibility and reversal learning is less clear. There are few D₁ agonists or antagonists with sufficient selectivity for use in human subjects, and ours is the first study to examine the effects of a D₁ antagonist on reversal learning in the non-human primate. We found that SCH 23390 failed to impair performance; however, in mice, the D₁-like receptor agonist SKF81297 was reported to impair reversal learning, using an automated touch-screen procedure (Izquierdo et al, 2006). In essence, it is possible that subphysiological activation of D₁-like receptors by DA does not affect reversal learning, whereas supra-normal stimulation does. If DA D₁ receptors are required to stabilize the maintenance of representations in cortico-striatal networks (Durstewitz and Seamans, 2002), overactivation of these receptors may lead to difficulty switching between response sets when reversal occurs. Further studies in monkeys of D₁-like agonists and antagonists using behavioral procedures such as ours are clearly warranted.

Implications for Mental Disorders

The ability to shift ‘set’ or response is suboptimal in a number of psychiatric disorders, namely schizophrenia, drug addiction, and attention deficit/hyperactivity disorder (Crider, 1997; Itami and Uno, 2002; Fillmore and Rush, 2006). Indeed, we have previously emphasized the face validity of impairments in the ability to stop or change ongoing responses for key psychopathological aspects of drug dependence and abuse (Jentsch and Taylor, 1999). For example, the abuse of addictive stimulant drugs is associated with profound impairments in tasks requiring changing or stopping responses (Monterosso et al, 2005), and notably, these same disorders are associated with substantial downregulation of the D₂-like receptors in the brain (Volkow et al, 1990, 1993, 2001). Our data provide evidence for a potential direct linkage between D₂-like receptor function and the ability to modify behavior flexibly in response to a changing environment. They also suggest that effecting a net increase in D₂-like receptor function in stimulant addicts may improve their ability to exert cognitive control over their own behavior in a manner that may immediately benefit their ability to quit. Notably, these enhancements will not necessarily be achieved with D₂-like receptor agonists, but rather, with agents that reintroduce normal physiological activity into the mesocorticolimbic DA neurons that regulate forebrain circuits and that are themselves substantially affected by stimulant abuse. Further studies of the afferent regulatory controls over dopaminergic neurons may therefore be highly beneficial when considering strategies for effecting substantial gains in stimulant addicts.

References

Arnsten AF, Cai JX, Murphy BL, Goldman-Rakic PS (1994). Dopamine D1 receptor mechanisms in the cognitive performance of young adult and aged monkeys. Psychopharmacology (Berlin) 116: 143–151.
Arnsten AF, Cai JX, Steere JC, Goldman-Rakic PS (1995). Dopamine D2 receptor mechanisms contribute to age-related cognitive decline: the effects of quinpirole on memory and motor performance in monkeys. J Neurosci 15: 3429–3439.
Aron AR, Robbins TW, Poldrack RA (2004). Inhibition and the right inferior frontal cortex. Trends Cogn Sci 8: 170–177.
Bilder RM, Volavka J, Lachman HM, Grace AA (2004). The catechol-O-methyltransferase polymorphism: relations to the tonic-phasic dopamine hypothesis and neuropsychiatric phenotypes. Neuropsychopharmacology 29: 1943–1961.
Butter CM (1969). Perseveration in extinction and in discrimination reversal tasks following selective frontal ablations in Macaca mulatta. Physiol Behav 4: 163–171.
Campbell UC, Rodefer JS, Carroll ME (1999). Effects of dopamine receptor antagonists (D1 and D2) on the demand for smoked cocaine base in rhesus monkeys. Psychopharmacology (Berlin) 144: 381–388.
Clarke HF, Dalley JW, Crofts HS, Robbins TW, Roberts AC (2004). Cognitive inflexibility after prefrontal serotonin depletion. Science 304: 878–880.
Clarke HF, Walker SC, Dalley JW, Robbins TW, Roberts AC (2007). Cognitive inflexibility after prefrontal serotonin depletion is behaviorally and neurochemically selective. Cereb Cortex 17: 18–27.
Collins P, Wilkinson LS, Everitt BJ, Robbins TW, Roberts AC (2000). The effect of dopamine depletion from the caudate nucleus of the common marmoset (Callithrix jacchus) on tests of prefrontal cognitive function. Behav Neurosci 114: 3–17.
Cools R, Altamirano L, D'Esposito M (2006). Reversal learning in Parkinson's disease depends on medication status and outcome valence. Neuropsychologia 44: 1663–1673.
Cools R, Clark L, Owen AM, Robbins TW (2002). Defining the neural mechanisms of probabilistic reversal learning using event-related functional magnetic resonance imaging. J Neurosci 22: 4563–4567.
Cooper JC, Bloom FE, Roth RH (2003). The Biochemical Basis of Neuropharmacology, 8th edn. Oxford University Press: Oxford.
Crider A (1997). Perseveration in schizophrenia. Schizophr Bull 23: 63–74.
Crofts HS, Dalley JW, Collins P, Van Denderen JC, Everitt BJ, Robbins TW et al (2001). Differential effects of 6-OHDA lesions of the frontal cortex and caudate nucleus on the ability to acquire an attentional set. Cereb Cortex 11: 1015–1026.
Dias R, Robbins TW, Roberts AC (1996). Primate analogue of the Wisconsin Card Sorting Test: effects of excitotoxic lesions of the prefrontal cortex in the marmoset. Behav Neurosci 110: 872–886.
Durstewitz D, Seamans JK (2002). The computational role of dopamine D1 receptors in working memory. Neural Netw 15: 561–572.
Everitt BJ, Robbins TW (2005). Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nat Neurosci 8: 1481–1489.
Eyny YS, Horvitz JC (2003). Opposing roles of D1 and D2 receptors in appetitive conditioning. J Neurosci 23: 1584–1587.
Ferry AT, Lu XC, Price JL (2000). Effects of excitotoxic lesions in the ventral striatopallidal–thalamocortical pathway on odor reversal learning: inability to extinguish an incorrect response. Exp Brain Res 131: 320–335.
Fillmore MT, Rush CR (2006). Polydrug abusers display impaired discrimination-reversal learning in a model of behavioural control. J Psychopharmacol 20: 24–32.
Floresco SB, Magyar O, Ghods-Sharifi S, Vexelman C, Tse MT (2006). Multiple dopamine receptor subtypes in the medial prefrontal cortex of the rat regulate set-shifting. Neuropsychopharmacology 31: 297–309.
Frank MJ, Claus ED (2006). Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. Psychol Rev 113: 300–326.
Granon S, Passetti F, Thomas KL, Dalley JW, Everitt BJ, Robbins TW (2000). Enhanced and impaired attentional performance after infusion of D1 dopaminergic receptor agents into rat prefrontal cortex. J Neurosci 20: 1208–1215.
Itami S, Uno H (2002). Orbitofrontal cortex dysfunction in attention-deficit hyperactivity disorder revealed by reversal and extinction tasks. Neuroreport 13: 2453–2457.
Iversen SD, Mishkin M (1970). Perseverative interference in monkeys following selective lesions of the inferior prefrontal convexity. Exp Brain Res 11: 376–386.
Izquierdo A, Suda RK, Murray EA (2004). Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J Neurosci 24: 7540–7548.
Izquierdo A, Wiedholz LM, Millstein RA, Yang RJ, Bussey TJ, Saksida LM et al (2006). Genetic and dopaminergic modulation of reversal learning in a touchscreen-based operant procedure for mice. Behav Brain Res 171: 181–188.
Jentsch JD, Olausson P, De La Garza II R, Taylor JR (2002). Impairments of reversal learning and response perseveration after repeated, intermittent cocaine administrations to monkeys. Neuropsychopharmacology 26: 183–190.
Jentsch JD, Redmond Jr DE, Elsworth JD, Taylor JR, Youngren KD, Roth RH (1997). Enduring cognitive deficits and cortical dopamine dysfunction in monkeys after long-term administration of phencyclidine. Science 277: 953–955.
Jentsch JD, Taylor JR (1999). Impulsivity resulting from frontostriatal dysfunction in drug abuse: implications for the control of behavior by reward-related stimuli. Psychopharmacology 146: 373–390.
Kebabian JW, Calne DB (1979). Multiple receptors for dopamine. Nature 277: 93–96.
Kruzich PJ, Grandy DK (2004). Dopamine D2 receptors mediate two-odor discrimination and reversal learning in C57BL/6 mice. BMC Neurosci 5: 12.
Lyons DM, Lopez JM, Yang C, Schatzberg AF (2000). Stress-level cortisol treatment impairs inhibitory control of behavior in monkeys. J Neurosci 20: 7816–7821.
Mehta MA, Swainson R, Ogilvie AD, Sahakian J, Robbins TW (2001). Improved short-term spatial memory but impaired reversal learning following the dopamine D(2) agonist bromocriptine in human volunteers. Psychopharmacology 159: 10–20.
Monterosso JR, Aron AR, Cordova X, Xu J, London ED (2005). Deficits in response inhibition associated with chronic methamphetamine abuse. Drug Alcohol Depend 79: 273–277.
Murphy BL, Arnsten AF, Goldman-Rakic PS, Roth RH (1996). Increased dopamine turnover in the prefrontal cortex impairs spatial working memory performance in rats and monkeys. Proc Natl Acad Sci USA 93: 1325–1329.
Parkinson JA, Dalley JW, Cardinal RN, Bamford A, Fehnert B, Lachenal G et al (2002). Nucleus accumbens dopamine depletion impairs both acquisition and performance of appetitive Pavlovian approach behaviour: implications for mesoaccumbens dopamine function. Behav Brain Res 137: 149–163.
Ragozzino ME (2002). The effects of dopamine D(1) receptor blockade in the prelimbic-infralimbic areas on behavioral flexibility. Learn Mem 9: 18–28.
Ridley RM, Haystead TA, Baker HF (1981). An analysis of visual object reversal learning in the marmoset after amphetamine and haloperidol. Pharmacol Biochem Behav 14: 345–351.
Robinson TE, Berridge KC (2003). Addiction. Annu Rev Psychol 54: 25–53.
Rolls ET, Critchley HD, Mason R, Wakeman EA (1996). Orbitofrontal cortex neurons: role in olfactory and visual association learning. J Neurophysiol 75: 1970–1981.
Rosenkilde CE (1979). Functional heterogeneity of the prefrontal cortex in the monkey: a review. Behav Neural Biol 25: 301–345.
Sawaguchi T, Goldman-Rakic PS (1991). D1 dopamine receptors in prefrontal cortex: involvement in working memory. Science 251: 947–950.
Smith AG, Neill JC, Costall B (1999). The dopamine D3/D2 receptor agonist 7-OH-DPAT induces cognitive impairment in the marmoset. Pharmacol Biochem Behav 63: 201–211.
Taghzouti K, Louilot A, Herman JP, Le Moal M, Simon H (1985). Alternation behavior, spatial discrimination, and reversal disturbances following 6-hydroxydopamine lesions in the nucleus accumbens of the rat. Behav Neural Biol 44: 354–363.
Volkow ND, Chang L, Wang GJ, Fowler JS, Ding YS, Sedler M et al (2001). Low level of brain dopamine D2 receptors in methamphetamine abusers: association with metabolism in the orbitofrontal cortex. Am J Psychiatry 158: 2015–2021.
Volkow ND, Fowler JS, Wang GJ, Hitzemann R, Logan J, Schlyer DJ et al (1993). Decreased dopamine D2 receptor availability is associated with reduced frontal metabolism in cocaine abusers. Synapse 14: 169–177.
Volkow ND, Fowler JS, Wolf AP, Schlyer D, Shiue CY, Alpert R et al (1990). Effects of chronic cocaine abuse on postsynaptic dopamine receptors. Am J Psychiatry 147: 719–724.
Von Huben SN, Davis SA, Lay CC, Katner SN, Crean RD, Taffe MA (2006). Differential contributions of dopaminergic D(1)- and D (2)-like receptors to cognitive function in rhesus monkeys. Psychopharmacology (Berlin) 188: 586–596.
Williams GV, Goldman-Rakic PS (1995). Modulation of memory fields by dopamine D1 receptors in prefrontal cortex. Nature 376: 572–575.
Williams SM, Goldman-Rakic PS (1993). Characterization of the dopaminergic innervation of the primate frontal cortex using a dopamine-specific antibody. Cereb Cortex 3: 199–222.
Williams SM, Goldman-Rakic PS (1998). Widespread origin of the primate mesofrontal dopamine system. Cereb Cortex 8: 321–345.

Acknowledgements

This work was supported by PHS Grants MH077248 (to JDJ) and DA020598 (to BL), and by ONDCP Grant DABT63-00-C-1003 (to EDL). We also gratefully acknowledge the support from the UCLA Experimental Center for Cognitive Phenomics (RR020750 to R Bilder).

Author information

Authors and Affiliations

Department of Psychiatry and Biobehavioral Sciences, University of California at Los Angeles, Los Angeles, CA, USA
Buyean Lee, Stephanie Groman & Edythe D London
Department of Molecular and Medical Pharmacology, University of California at Los Angeles, Los Angeles, CA, USA
Edythe D London
The Brain Research Institute, University of California at Los Angeles, Los Angeles, CA, USA
Edythe D London & James David Jentsch
Department of Psychology, University of California at Los Angeles, Los Angeles, CA, USA
James David Jentsch

Corresponding author

Correspondence to James David Jentsch.

About this article

Cite this article

Lee, B., Groman, S., London, E. et al. Dopamine D₂/D₃ Receptors Play a Specific Role in the Reversal of a Learned Visual Discrimination in Monkeys. Neuropsychopharmacol 32, 2125–2134 (2007). https://doi.org/10.1038/sj.npp.1301337

Received: 28 August 2006
Revised: 21 November 2006
Accepted: 28 December 2006
Published: 14 February 2007
Issue Date: October 2007
DOI: https://doi.org/10.1038/sj.npp.1301337
