INTRODUCTION

A reduced capacity to withhold, modify, or sustain adaptive behavior, when demanded by environmental circumstances, is a critical sequela of a number of psychiatric disorders, most notably schizophrenia and substance abuse. With respect to the latter, the transition from substance use to compulsive substance abuse is marked by a reduction of the rewarding properties of the drug and an increase in drug-associated risks and negative consequences (Robinson and Berridge, 2003; Everitt and Robbins, 2005). Despite this progressive change in the contingent consequences of drug-taking behavior, many individuals are unable to inhibit or change their behavior (in this case, their drug-taking). It is therefore of great interest to understand the mechanisms by which organisms can flexibly and dynamically inhibit and/or change behavior in response to shifts in feedback; these capacities are frequently included within the rubric of ‘executive functions.’

In laboratory animals and human subjects, the ability to modify behavior adaptively in response to dynamically changing stimulus-reward contingencies can be studied using reversal learning tasks. In these procedures, animals learn a novel discrimination (eg cue A is contingently associated with reward, whereas cues B and C are not associated or neutral) before experiencing a reversal (in which reinforcement is now associated with either cue B or C, but not A). Efficient reversal learning requires the ability to recognize the shift in contingency, to inhibit a learned response to the previously rewarded cue, to overcome learned irrelevance for the previously negative cues, and to acquire a new associative structure to guide response. By comparing performance of a subject during reversal to performance under conditions in which it acquires a completely new discrimination (rather than a reversal), it is possible to isolate the aptitude to overcome the prepotent behavior (response inhibition) and the learned irrelevance.

Dopamine (DA) is a catecholamine neurotransmitter that has been substantially implicated in cognitive and executive functions. Prefrontal regions of the cerebral cortex receive a particularly dense dopaminergic input from the ventral mesencephalic DA cell nuclei (Williams and Goldman-Rakic, 1993, 1998), although this input is minor in comparison with the dopaminergic innervation of the basal ganglia. Using a reversal learning task, Ridley et al (1981) originally reported that systemic administration of haloperidol, a potent DA receptor antagonist, impaired the ability to change behavior in response to contingency shifts, suggesting that reductions in DA transmission impair one form of behavioral flexibility. Experimentally induced dysregulation of central DA has also been shown to impair response inhibition, as assessed using an object retrieval/detour task (Jentsch et al, 1997; Lyons et al, 2000). Taken together, this indirect evidence suggests that mesocortical DA systems may regulate adaptive responding in tasks such as reversal learning procedures.

Although a general involvement of DA systems in reversal learning is known, the specific receptor mechanisms and anatomical target locations underlying these effects are unclear. DA receptors can be categorized as belonging to either the D1-like or the D2-like families. The D1-like DA receptors include the D1 and D5 subtypes, both of which are linked to activation of adenylyl cyclases. The D2-like DA receptors comprise the D2, D3, and D4 receptors; they all inhibit adenylyl cyclases (Kebabian and Calne, 1979). The results of Ridley et al (1981) support a strong involvement of the D2-like receptors, as haloperidol has higher affinity for this subtype than the others. In addition, Floresco et al (2006) demonstrated that administration of the D2/D3 subtype selective antagonist, eticlopride, potently impaired the ability of rats to change their behavior in response to a conditional change of rule in a set-shifting task. More direct evidence in support of these results derives from a recent study of D2 receptor gene knockout mice. Selective deletion of the D2 receptor gene produced a dramatic deficit of performance in reversal learning of odor discrimination, although basic associative mechanisms underlying this behavior were also impaired (Kruzich and Grandy, 2004). In contrast, the role of the D1-like receptors in behavioral flexibility is less clear. Although blockade of D1-like receptors in the prefrontal cortex (PFC) impaired performance of a set-shifting task, stimulation of the D1-like receptors in the PFC did not (Ragozzino, 2002; Floresco et al, 2006). However, it remains unknown whether the D1-like receptors play a role in reversal learning, although the D1-like receptors are critically involved in other cognitive functions, such as working memory (Sawaguchi and Goldman-Rakic, 1991; Williams and Goldman-Rakic, 1995) and visual attention (Granon et al, 2000). Therefore, DA receptor subtypes appear to contribute in very different ways to the spectrum of cognitive and executive functions.

Based upon these pieces of evidence, we set out to determine the relative involvement of the D1-like and D2-like receptors in reversal learning performance in Old World monkeys. We further sought to examine whether any observed drug effects were specific for reversal learning or whether they might also be seen in a new learning situation. To accomplish these goals, we administered the D2/D3 specific receptor antagonist raclopride to monkeys performing a visual discrimination and reversal learning task. Cognitive and pharmacological specificity of the effects of raclopride on reversal learning was investigated in two ways. First, we tested whether blockade of the D2/D3 receptors produces an impairment that depends upon the need to overcome the learned response or upon a general set of associative learning mechanisms. We hypothesized that D2/D3 receptor antagonism specifically impairs response inhibition, and therefore, that drug effects would be evident under reversal, but not new learning, conditions. Second, we tested whether SCH 23390, the D1-like specific receptor antagonist, produces impairment similar to raclopride. We hypothesized that SCH 23390 would generally impair performance, affecting both reversal and new learning (Parkinson et al, 2002; Eyny and Horvitz, 2003).

MATERIALS AND METHODS

Subjects

Four adult male vervet monkeys (Chlorocebus aethiops sabaeus from the UCLA Vervet Research Colony), weighing 6.0–7.4 kg, were used in these studies. The subjects were 7–10 years of age at the time of testing. All animals were housed individually in a climate-controlled vivarium, where they had unlimited access to water and received a nutritionally balanced diet of monkey chow (Teklad®, Madison, WI) supplemented with fresh fruit. The monkeys received their full daily allotment of food (which was never reduced in order to support behavioral performance) immediately after morning testing (1/2 portion) and again at 1600–1700 hours (1/2 portion).

All animals were maintained in accordance with the ‘Guide for the Care and Use of Laboratory Animals’ of the Institute of Laboratory Animal Resources, National Research Council, Department of Health, Education and Welfare Publication No. (NIH) 85–23, revised 1996. Research protocols were approved by the UCLA Chancellor's Animal Research Committee.

General Methods

A modified Wisconsin General Test Apparatus (WGTA) consisted of an opaque screen (that could be raised and lowered) that separated the monkey from a tray fitted with three opaque square food boxes with hinged opaque lids. The tops and insides of the food boxes were fitted with distinctive, colored pictures (clip art from the Microsoft Office® library), and the monkeys were able to open the lids of food boxes easily to retrieve food rewards (bits of apple, grape, or peanuts) hidden within.

The monkeys, which had previously been trained to move voluntarily from their cages to a transport cart, were moved to a sound-attenuated testing room. In the testing room, the transport cart was aligned to the WGTA, so that monkeys could easily access the food boxes on the tray when the opaque screen was raised. The monkeys were presented with trials on which a screen was raised to reveal three food boxes with visually distinctive cues (pictures). They were allowed to open only one box lid per trial. Each trial lasted either until the animal opened a lid or 2 min passed, whichever came first. The inter-trial interval was approximately 15 s. For each set of three visual cues, the picture that was chosen to be the ‘positive’ stimulus was varied across the four subjects. The position of the food boxes varied pseudo-randomly across all trials. Each subject was tested between 0800 and 1030 hours on each day.

Raclopride (D2/D3) Experiments

Reversal learning without retention session

In an initial experiment, all monkeys were trained to acquire and reverse novel visual discriminations in a between-session design. For example, on day 1 (Mondays and Thursdays), the monkeys were presented with three visual cues that they had never seen before and were trained to learn that one cue was associated with reward, whereas the other two were not (discrimination acquisition session). Once again, the animals were required to select one cue per trial and to determine, solely by trial and error, which cue was associated with reward. The acquisition session continued until monkeys reached a predetermined performance criterion (at least seven correct responses in 10 consecutive trials) or had completed 80 trials. The following day (Tuesdays and Fridays), monkeys were tested in a condition in which the reward was associated with one of the two previously non-rewarded cues and the previously positive stimulus was not associated with reward (reversal session). The reversal session continued until monkeys reached criterion (seven selections of the new positive stimulus in 10 consecutive trials) or up to 80 trials. The number of trials to reach criterion, the number of selections of negative stimuli (reversal errors), and the number of selections of the previously rewarded stimulus (perseverative errors) committed during the reversal session were the dependent measures.
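The scoring rules described above can be illustrated with a short sketch. This is not the authors' scoring software; the function names, the cue labels "A", "B", and "C", and the example session are hypothetical. It also assumes that the criterion can be met before 10 trials have elapsed and that the trials making up the criterion run count toward trials to criterion; the text does not settle either point.

```python
from typing import Dict, Optional, Sequence


def trials_to_criterion(correct: Sequence[bool], needed: int = 7,
                        window: int = 10) -> Optional[int]:
    """Return the 1-based trial on which criterion is first met, else None.

    Criterion: at least `needed` correct responses within the most recent
    `window` consecutive trials (assumed sliding window).
    """
    for end in range(1, len(correct) + 1):
        start = max(0, end - window)
        if sum(correct[start:end]) >= needed:
            return end
    return None


def score_reversal(choices: Sequence[str], new_positive: str,
                   old_positive: str) -> Dict[str, Optional[int]]:
    """Compute the three dependent measures for one reversal session."""
    correct = [c == new_positive for c in choices]
    crit = trials_to_criterion(correct)
    # Score only the trials up to and including the criterion trial.
    scored = choices if crit is None else choices[:crit]
    return {
        "trials_to_criterion": crit,
        # Any selection of a currently unrewarded cue is a reversal error.
        "reversal_errors": sum(c != new_positive for c in scored),
        # Selections of the previously rewarded cue are perseverative errors.
        "perseverative_errors": sum(c == old_positive for c in scored),
    }


# Hypothetical session: cue "A" was rewarded during acquisition, "B" is
# rewarded after reversal, and "C" is unrewarded in both sessions.
session = ["A", "A", "C", "B", "B", "A", "B", "B", "B", "B", "B"]
result = score_reversal(session, new_positive="B", old_positive="A")
```

Under this scheme, perseverative errors are a subset of reversal errors, which is what allows a three-choice task to separate returns to the old cue from selections of the neutral cue.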

All subjects were treated via intra-muscular injection with raclopride-L-tartrate (Sigma, St Louis, MO; 0.001–0.03 mg/kg) or its vehicle 20 min before testing. The dose range for raclopride was chosen because it spans doses that have been shown to (1) impair working memory and paired associates learning in rhesus monkeys (Von Huben et al, 2006); (2) block the cognitive impairments caused by the D2 agonist quinpirole in rhesus monkeys (Arnsten et al, 1995); and (3) attenuate cocaine self-administration at 0.03 mg/kg in monkeys (Campbell et al, 1999). The vehicle was sterile isotonic saline. Drug weights were calculated as the salt. The order of the drug treatment conditions was balanced according to a Latin square design across the subjects.
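One way to build such a balanced ordering is a cyclic Latin square, sketched below. The text does not state which square was actually used, so this is only an illustration; the condition labels (`vehicle`, `dose_1`, `dose_2`, `dose_3`) and subject labels are placeholders, not the specific doses or animal identifiers of the study.

```python
from typing import List, Sequence


def cyclic_latin_square(conditions: Sequence[str]) -> List[List[str]]:
    """Build an n x n Latin square by cyclically shifting the condition list.

    Row i gives the treatment order for subject i. Every condition appears
    exactly once in each row (subject) and once in each column (test position).
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)]
            for row in range(n)]


# Placeholder labels: four treatment conditions and four subjects.
conditions = ["vehicle", "dose_1", "dose_2", "dose_3"]
subjects = ["monkey_1", "monkey_2", "monkey_3", "monkey_4"]

square = cyclic_latin_square(conditions)
schedule = dict(zip(subjects, square))

for subject, order in schedule.items():
    print(subject, "->", ", ".join(order))
```

A cyclic square balances the position of each condition across subjects but not the sequence in which conditions follow one another; a design that also controls first-order carryover would need a different construction.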

Reversal learning with a retention session

In follow-up experiments, we included a short retention session before reversal on the appropriate testing days (Tuesdays and Fridays). This modification accomplished two goals. First, because the difficulty in reversal may be related to the strength of the ‘memory’ of the previously learned discrimination, we wanted to ensure that any drug effects on reversal performance were not affected by changes in retention of the learned discrimination. Second, the retention session probably serves as a reminder of the previous discrimination and therefore increases prepotency. On Mondays and Thursdays, monkeys were trained on a novel discrimination as described above (acquisition session). On the following days, the monkeys were tested under the same stimulus-reward contingencies as encountered in the acquisition session (the previously positive stimulus was still rewarded). The retention session included a sufficient number of trials until the subjects achieved four correct responses out of five consecutive trials. Once monkeys reached criterion, the reversal was initiated, as described above. Testing was continued until the animals made seven correct choices in 10 consecutive trials. One week before raclopride tests, all monkeys participated in two reversal sessions with retention to expose them to this new condition.

The monkeys were treated via intra-muscular injections with raclopride (0.03 mg/kg) or its vehicle 20 min before retention/reversal sessions. The order of the treatment conditions was balanced across all subjects. For this follow-up study, only the highest dose of raclopride was tested.

Reversal vs new learning

Because reversal learning requires both inhibition of a prepotent response and the learning of a new stimulus-reward contingency, drug-induced impairments of reversal could reflect a basic defect in learning. We, therefore, sought to test the selectivity of the behavioral effects of raclopride by determining whether the drug affects the ability to learn a completely new stimulus-reward association. Monkeys were trained on a visual discrimination as described above (acquisition session) on Mondays and Thursdays. On the following days, they were tested for retention (precisely as described above); then, the monkeys were tested for either reversal of the previously learned discrimination or for acquisition of a completely new discrimination involving three pictures that they had never seen before.

Monkeys were treated i.m. with either raclopride (0.03 mg/kg) or its vehicle 20 min before the new learning or reversal sessions. With two drug conditions (saline vs raclopride) and two behavioral conditions (new learning vs reversal), we delivered four experimental sessions, the order of which was balanced across all subjects. Because raclopride impaired reversal learning performance with a retention session at the highest dose only (0.03 mg/kg), only this dose of raclopride was tested in the reversal vs new learning conditions.

SCH 23390 (D1/D5) Experiments

After finishing all tests with raclopride, the same subjects were tested to determine whether SCH 23390 hydrochloride (St Louis, MO; 0.001–0.03 mg/kg), a D1/D5 receptor antagonist with some affinity for 5-HT2 receptors, affects performance of reversal learning. The dose range of SCH 23390 was chosen because it spans doses that attenuate FG-7142-induced cognitive deficits (Murphy et al, 1996) and impair delayed response performance in monkeys (Arnsten et al, 1994). To accomplish this goal, we utilized the testing conditions wherein retention was followed by reversal. Monkeys were treated via intra-muscular injection with SCH 23390 20 min before the start of the retention session. The order of the drug treatment conditions was balanced across all the subjects, using a Latin square design. We also tested whether SCH 23390 (0.03 mg/kg) affected the ability to learn a completely new stimulus-reward association using the same procedure described in the Reversal vs new learning section of the raclopride experiments. Because SCH 23390 did not impair reversal learning performance at any dose, the highest dose of SCH 23390 was selected to be tested in the reversal vs new learning conditions.

Data Analyses

The effects of drug treatment on performance of reversal learning were analyzed with paired t-tests or one-way analysis of variance (ANOVA) followed by Dunnett's test. The effects of raclopride (0.03 mg/kg) and SCH 23390 (0.03 mg/kg) on reversal learning and new learning were initially analyzed with two-way ANOVA followed by a priori paired one-tailed t-tests to probe any significant main effects or interactions.
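As a minimal sketch of the two simplest tests named above, the following computes a paired t statistic and a one-way within-subject ANOVA F ratio using only the Python standard library. It assumes the one-way ANOVA was a repeated-measures analysis: with four subjects and four treatment conditions this gives (4 − 1) and (4 − 1)(4 − 1) = 9 degrees of freedom, matching the F(3,9) values reported in the Results. All numbers in the example are invented for illustration and are not data from this study; P-values and Dunnett's test are omitted.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def paired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Paired t statistic and degrees of freedom for matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def rm_anova_f(data: List[List[float]]) -> Tuple[float, int, int]:
    """One-way repeated-measures ANOVA; data[subject][condition].

    Returns (F, df_condition, df_error).
    """
    n, k = len(data), len(data[0])
    values = [v for row in data for v in row]
    grand = mean(values)
    cond_means = [mean([row[j] for row in data]) for j in range(k)]
    subj_means = [mean(row) for row in data]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((v - grand) ** 2 for v in values)
    # Error term: variability left after removing condition and subject effects.
    ss_err = ss_total - ss_cond - ss_subj
    df_cond, df_err = k - 1, (k - 1) * (n - 1)
    return (ss_cond / df_cond) / (ss_err / df_err), df_cond, df_err


# Invented example values (NOT study data): errors for four subjects.
vehicle = [10, 12, 9, 11]
drug = [14, 15, 13, 18]
t_stat, t_df = paired_t(vehicle, drug)

# Invented example values: four subjects x four treatment conditions.
dose_table = [[15, 14, 16, 20],
              [12, 13, 12, 18],
              [18, 17, 19, 22],
              [14, 16, 15, 21]]
f_stat, df1, df2 = rm_anova_f(dose_table)
```

The sign of the t statistic depends only on the order of subtraction (here, vehicle minus drug), so a negative value indicates more errors in the second condition passed to the function.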

RESULTS

In reversal learning without a retention session, the monkeys acquired the association between stimulus and reward within 15.6±2.6 trials (mean±SE). On the subsequent day of testing for reversal, the D2/D3 antagonist raclopride (0.001–0.01 mg/kg, i.m.) did not significantly affect either the number of trials required to reach the preset performance criterion (F(3,9)=1.84, P=0.22), the number of reversal errors committed (F(3,9)=1.47, P=0.29), or the number of perseverative errors (F(3,9)=1.77, P=0.22) (Figure 1). In a follow-up experiment, the effects of saline were compared with those of a higher dose of raclopride, 0.03 mg/kg; again, raclopride did not significantly affect either the number of trials to criterion (t(3)=−0.20, P=0.85), the total number of reversal errors (t(3)=−0.78, P=0.49), or the number of perseverative errors (t(3)=0.09, P=0.94) (Figure 1). Over the range of raclopride doses studied, the monkeys did not omit any trials, indicating that raclopride did not cause nonspecific behavioral disruption or suppression.

Figure 1

The effects of raclopride (0.001–0.03 mg/kg, i.m.) on the performance of reversal learning without a retention session. Data represent means±SEM (n=4).

In reversal learning with a retention session, the monkeys were exposed to the same association between stimulus and reward used during the acquisition session before experiencing the reversal. Raclopride (0.03 mg/kg) had no measured effects on retention of the previously learned discrimination, as indicated by a lack of effect on the number of trials to reach criterion (t(3)=−0.29, P=0.79) or the number of retention errors (t(3)=−0.26, P=0.84) (data not shown). When the monkeys were treated with raclopride (0.03 mg/kg), they did exhibit a significant impairment of performance in reversal learning. Specifically, raclopride significantly increased the number of reversal errors (t(3)=−3.21, P<0.05) during reversal sessions (Figure 2). Although the number of trials required to reach the preset performance criterion and the number of perseverative errors showed a tendency to be increased by raclopride treatment, these effects did not reach statistical significance (Figure 2).

Figure 2

The effects of raclopride (0.03 mg/kg, i.m.) on the performance of reversal learning with a retention session. Symbols indicate data for each of the four individual monkeys. *Significantly different from vehicle treatment, according to t-test: P<0.05.

Because of the possibility that raclopride might impair the associative mechanisms required during reversal (instead of only affecting behavioral flexibility), we tested whether raclopride would impair performance when the monkeys simply encountered a new set of cues about which they had to learn (ie new learning). Two-by-two ANOVA revealed main effects of drug (F(1,3)=10.14, P<0.05) and type of learning (new vs reversal, F(1,3)=13.96, P<0.05) for trials to criterion. For the number of errors, there were main effects of both drug (F(1,3)=12.79, P<0.05) and type of learning (F(1,3)=18.36, P<0.05). Separate one-tailed paired t-tests confirmed that raclopride (0.03 mg/kg) impaired the performance of learning during reversal sessions; it significantly increased the number of trials to reach criterion, the number of reversal errors, and the number of perseverative errors (P<0.05; Figure 3). In contrast, raclopride had no measured effects on the ability of monkeys to learn a new stimulus-reward association, as indexed by the number of trials to criterion (t(3)=0.27, P=0.80) or the number of committed errors (t(3)=0.27, P=0.80).

Figure 3

The effects of raclopride (0.03 mg/kg, i.m.) on the performance of reversal and new learning with a retention session. Symbols indicate data for each of the four individual monkeys. *Significantly different from vehicle treatment, according to t-test: P<0.05.

The D1/D5 antagonist SCH 23390 (0.001–0.03 mg/kg, i.m.) did not significantly modulate performance of reversal learning with a retention session at any of the doses tested; there were no main effects of drug treatment in the number of trials to reach criterion (F(3,9)=0.82, P=0.51), the total number of reversal errors (F(3,9)=0.66, P=0.60), or the number of perseverative errors (F(3,9)=0.63, P=0.61) (Figure 4). We had planned to use a higher (0.1 mg/kg) dose of SCH 23390, but several of the monkeys were unable to perform at this dose because of bradykinesia and sedation.

Figure 4

The effects of SCH 23390 (0.001–0.03 mg/kg, i.m.) on the performance of reversal learning with a retention session. Data represent means±SEM (n=4).

The 0.03 mg/kg dose of SCH 23390 was also evaluated for its ability to impair the acquisition of a new stimulus-reward association—when a novel set of pictures was used. Two-way repeated measures ANOVA indicated significant main effects of type of learning in the number of trials to reach criterion (F(1,3)=28.17, P=0.01) and the number of errors (F(1,3)=26.44, P=0.01). However, there were no main effects of drug on the number of trials to reach criterion (F(1,3)=3.12, P=0.18) or on the number of errors committed (F(1,3)=2.61, P=0.21) (Figure 5).

Figure 5

The effects of SCH 23390 (0.03 mg/kg, i.m.) on the performance of reversal and new learning with a retention session. Symbols indicate data for each of the four individual monkeys.

DISCUSSION

In this study, blockade of the D2/D3 receptors, but not the D1/D5 receptors, increased the number of errors and the number of trials required to reach criterion under reversal conditions. The effect of raclopride on the acquisition of reversal was not due to a generalized impairment of the ability to learn associative relationships between stimuli and rewards, because raclopride did not impair the ability of the monkeys to learn a discrimination using novel stimuli. Our findings indicate that the D2/D3 receptors play a critical role in the ability of monkeys to shift behavior according to changing relationships between stimuli and reward. Notably, although SCH 23390, a D1/D5 receptor antagonist, produced impairment of spatial delayed response performance (a measure of working memory) in a prior study (Arnsten et al, 1994), this drug was without clear effect on reversal learning. These data suggest that there is a double-dissociation between D1-like and D2-like mechanisms in the regulation of working memory and executive-like functions of the frontostriatal system.

Behavioral Considerations

Visual discrimination and reversal learning has been used to measure the capacity to inhibit prepotent responses and to overcome previously learned irrelevance in non-human primates and humans. It is important to note that there are many factors that can directly modify the effects of pharmacological and lesion treatments on reversal learning. For example, differences in the apparatus (touch screen vs WGTA) and the discriminanda (objects vs pictures) have the potential to affect the sensitivity of reversal learning performance to lesions of frontal regions in monkeys (Crofts et al, 2001). Many studies incorporate two stimuli in each discrimination and have subjects perform serial reversals (A+/B → A/B+ → A+/B, etc.); some studies have reversals occurring within an experimental session, whereas others involve shifting across experimental sessions (with or without an intervening retention session). Owing to these differences in experimental procedures, it is frequently difficult to compare the results of the studies that investigated the effects of pharmacological treatments and lesions on the performance of visual discrimination and reversal.

We used three stimuli in our discriminations so that we could dissect perseverative from random errors in reversal. In a classic two-choice discrimination, an error is by definition perseverative; however, in a three-choice discrimination, we can separate the strictly perseverative errors from the type of errors that would occur during a random search strategy (equal numbers of perseverative and neutral errors). The usefulness of this approach in detecting stimulant drug-induced increases in perseverative responding in a discrimination reversal task has been demonstrated previously (Jentsch et al, 2002).

Raclopride impaired reversal but did not significantly increase strictly perseverative errors. Although this effect is almost certainly due to limited statistical power of our study, we cannot assert beyond a reasonable doubt that raclopride specifically impairs the ability to inhibit perseverative tendencies. Nevertheless, both learning a discrimination for novel stimuli and reversal learning require the ability to hold ‘in mind’ information about previously visited stimuli and to inhibit the return to unrewarded cues; and both require the ability to focus attention on the visual features of individual stimuli. The difference in drug effects in reversal vs new discrimination learning, therefore, is unlikely to be explained by alterations in working memory or attention. The chief differences between these two learning conditions include the need to inhibit the prepotent response to the previously rewarded stimulus and the ability to overcome the learned irrelevance of previously unrewarded cues.

For the purposes of acute behavioral pharmacological studies, reversal within a single experimental session is problematic because the drug is generally given before the initial acquisition phase. Using a between-sessions approach allows for the dissection of the acquisition and reversal components, pharmacologically. However, the between-sessions approach depends upon the ability to demonstrate that the drug does not affect the retention of the previously learned discrimination (in principle, drugs could facilitate retention, leading to poorer reversal learning or, alternatively, impair retention, leading to ‘better’ reversal performance). In addition, testing for retention, before reversal, probably ‘primes’ the prepotent response in a manner that increases the difficulty associated with the shift.

In the current study, although raclopride dramatically impaired performance of reversal learning when there was a short retention session right before the start of the reversal session, it did not consistently impair performance of reversal learning when tested without a retention session. Similarly, Ridley et al (1981) showed that haloperidol produced a deficit of reversal learning in monkeys when they used a within-day visual discrimination and reversal task, which has a retention session right before the reversal session. In contrast, Smith et al (1999) reported no significant impairment of reversal learning in marmosets treated with either (−)sulpiride (5–10 mg/kg, i.m.) or raclopride (0.006–0.05 mg/kg, i.m.); as this study used an experimental design somewhat similar to our own, the discrepancy between these two studies is difficult to reconcile (species differences notwithstanding).

The difference in raclopride's effects on reversal performance, which appears to depend upon whether the switch immediately follows a retention session, is intriguing and certainly worthy of further study. There are several potential reasons for this. First, during retention, animals tend to make a large proportion of correct responses and obtain a fair number of rewards; therefore, it is possible that synaptic DA levels are different when reversal follows successful choices than when it occurs at the beginning of the session. Second, the retention session may engage certain striatal ensembles that code for particular approach responses, and the ‘priming’ of those ensembles that occurs during retention testing increases prepotency and exaggerates the ability to observe an impairment of reversal when response inhibition mechanisms are compromised by raclopride. Finally, there may be a difference in the memory system that holds the previously learned rule in the two conditions. When reversal occurs from the first trial of the reversal session, the previously learned rule is likely to be stored in and retrieved from a long-term memory mechanism, whereas because of within-session rewarded responding, it is more likely to be held in some sort of short-term or working memory system if reversal happens after the retention test. It is possible that D2 antagonism does not modulate the former but does the latter.

Neuroanatomy of Reversal Learning

Studies in non-human primates have implicated a network of structures, including the orbitofrontal cortex, its ventral and medial striatal target zones, and the ascending monoaminergic systems, in the regulation of the ability to modify responses flexibly in reversal learning tasks (Butter, 1969; Iversen and Mishkin, 1970; Dias et al, 1996; Rolls et al, 1996; Clarke et al, 2004, 2007; Izquierdo et al, 2004). Butter (1969) first showed that damage to large sections of prefrontal regions could elicit perseveration in monkeys performing a discrimination reversal task. Later studies assigned difficulty with modifying behavior in response to changing reinforcement contingencies to the inferior regions of frontal lobe, namely the orbitofrontal cortex (Iversen and Mishkin, 1970; Rosenkilde, 1979; Dias et al, 1996; Rolls et al, 1996; Izquierdo et al, 2004). The ventrolateral prefrontal regions of the human brain, particularly on the right, are also strongly implicated in the ability to change responses successfully (Cools et al, 2002; Aron et al, 2004).

In the PFC, the modulation of circuitry by the indoleamine neurotransmitter serotonin appears to be critical, as serotonin depletion strongly impairs reversal learning (Clarke et al, 2004). At least in the marmoset performing two-choice serial reversals on a touch screen (see below), dopaminergic mechanisms within PFC do not appear to play a major role in the ability to shift response (Clarke et al, 2007). Nevertheless, human, monkey, and rodent behavioral pharmacological studies have clearly implicated dopaminergic mechanisms in reversal learning (Ridley et al, 1981; Mehta et al, 2001; Kruzich and Grandy, 2004; Cools et al, 2006). One possible resolution of these findings is that dopaminergic mechanisms regulate reversal learning by actions in the basal ganglia (Frank and Claus, 2006). Few studies have used selective manipulations of striatal circuitry, and dopaminergic mechanisms, in particular, to evaluate local mechanisms in control of reversal learning. Collins et al (2000) found that partial insults to striatal DA did not specifically impair reversal learning. Nevertheless, lesion studies in rats demonstrated important contributions of striatal circuitry, and its dopaminergic innervation, to response reversal (Taghzouti et al, 1985; Ferry et al, 2000). More direct investigations of the contribution of local dopaminergic mechanisms within the striatum to reversal learning are clearly required.

D2-Like Receptor Mechanisms in Executive Function

Although the current study and others demonstrated that genetic deletion or blockade of the D2/D3 receptors (Ridley et al, 1981; Kruzich and Grandy, 2004) impairs the performance of reversal learning tasks, overstimulation of the D2-like receptors can produce qualitatively similar results. A D3-preferring agonist can degrade reversal learning in marmosets (Smith et al, 1999), and activation of the D2-like receptors with the agonist bromocriptine specifically impairs reversal learning in healthy human volunteers (Mehta et al, 2001).

Why might activation or blockade of the D2-like receptors produce qualitatively similar effects on reversal learning? It has recently been hypothesized that the D2-like receptors, which exhibit a greater intra-synaptic localization, are affected by dramatic and rapid variations in DA efflux related to individual neuronal events (eg burst firing of DA cells; Durstewitz and Seamans, 2002; Bilder et al, 2004). Indeed, Durstewitz and Seamans (2002) hypothesized that the plasticity of forebrain networks underlies the ability to update the contents of working memory (ie to replace current cognitive representation with novel task-relevant information), and the ability to introduce flexibility into responses when the environment deviates from expectations or when current responses are recognized as erroneous (ie to switch from one behavioral program to another). If phasic, stimulus-evoked activations of dopaminergic neurons are required to update network operations, then the effects of D2-like receptor agonists and antagonists should both be negative with respect to reversal learning. Agonists of the D2-like receptors, by activating both the somatodendritic and nerve terminal D2-like receptors (autoreceptors), should terminate impulse activity in dopaminergic neurons (Cooper et al, 2003). DA D2-like receptor antagonists should increase this phasic activity but will do so in the context of antagonism of many of the postsynaptic targets. Therefore, the similar cognitive effects of the D2-like receptor agonists and antagonists may depend upon a net disruption of phasic activity in the mesocorticolimbic synapse.

The role for DA D1-like receptors in behavioral flexibility and reversal learning is less clear. There are few D1 agonists or antagonists with sufficient selectivity for use in human subjects, and ours is the first study to examine the effects of a D1 antagonist on reversal learning in the non-human primate. We found that SCH 23390 failed to impair performance; however, in mice, the D1-like receptor agonist SKF81297 was reported to impair reversal learning, using an automated touch-screen procedure (Izquierdo et al, 2006). In essence, it is possible that subphysiological activation of D1-like receptors by DA does not affect reversal learning, whereas supra-normal stimulation does. If DA D1 receptors are required to stabilize the maintenance of representations in cortico-striatal networks (Durstewitz and Seamans, 2002), overactivation of these receptors may lead to difficulty switching between response sets when reversal occurs. Further studies in monkeys of D1-like agonists and antagonists using behavioral procedures such as ours are clearly warranted.

Implications for Mental Disorders

The ability to shift ‘set’ or response is suboptimal in a number of psychiatric disorders, namely schizophrenia, drug addiction, and attention deficit/hyperactivity disorder (Crider, 1997; Itami and Uno, 2002; Fillmore and Rush, 2006). Indeed, we have previously emphasized the face validity for impairments in the ability to stop or change ongoing responses for key psychopathological aspects of drug dependence and abuse (Jentsch and Taylor, 1999). For example, the abuse of addictive stimulant drugs is associated with profound impairments in tasks requiring changing or stopping responses (Monterosso et al, 2005), and notably, these same disorders are associated with substantial downregulations of the D2-like receptors in brain (Volkow et al, 1990, 1993, 2001). Our data provide evidence for a potential direct linkage between D2-like receptor function and the ability to modify behavior flexibly in response to a changing environment and suggest that effecting a net increase in D2-like receptor function in stimulant addicts may improve their ability to exert cognitive control over their own behavior in a manner that may immediately benefit their ability to quit. Notably, these enhancements will not necessarily be achieved with D2-like receptor agonists, but rather, with agents that reintroduce normal physiological activity into the mesocorticolimbic DA neurons that regulate forebrain circuits and that are themselves substantially affected by stimulant abuse. Further studies of the afferent regulatory controls over dopaminergic neurons can therefore be highly beneficial when considering strategies for effecting substantial gains in stimulant addicts.