INTRODUCTION

Adolescence is a key period for the onset of many psychiatric diseases, such as schizophrenia, substance abuse, or affective disorders (Paus et al, 2008), which all involve impairments in decision-making abilities (Paulus, 2007). This is a period of major maturational processes (Andersen, 2003; Naneix et al, 2012; Somerville and Casey, 2010; Spear, 2000), which might provide a window of vulnerability to pathological development. Furthermore, adolescents become more frequently exposed to drugs and pharmacological rewards, which could have long-term consequences on cognitive functioning (Adriani and Laviola, 2004; Andersen, 2003; Crews et al, 2007).

The dopaminergic (DA) system has a central role in reward-related learning and decision-making processes (Bromberg-Martin et al, 2010; Montague et al, 2004; Schultz, 2000) and is the main target of most natural and pharmacological rewards (Koob, 1992; Wise, 2004). In adults, prolonged stimulation of the DA system induces long-lasting neurobiological changes and deficits in reward-related learning (Belin and Everitt, 2008; Hyman et al, 2006). Furthermore, DA pathways mature late, with their development ending between adolescence and adulthood (Andersen et al, 2000; Kalsbeek et al, 1988; McCutcheon et al, 2012; Naneix et al, 2012; O’Donnell, 2010). However, the impact of prolonged stimulation of the developing DA system on decision-making and reward-related processes remains to be elucidated.

DA transmission is mainly regulated through D2 autoreceptors (Usiello et al, 2000), which also regulate DA fibers growth and synaptogenesis (Fasano et al, 2008; Parish et al, 2002). In this study, we therefore investigated the consequences of chronic stimulation of the DA system throughout adolescence, using the D2 receptor agonist quinpirole, on DA development and basic cognitive processes underlying decision making. We specifically focused on the control of goal-directed instrumental actions, which depends on the DA system (Balleine and O'Doherty, 2010; Hitchcott et al, 2007; Lex and Hauber, 2010c; Naneix et al, 2009; Nelson and Killcross, 2006) and emerges between adolescence and adult age (Naneix et al, 2012).

MATERIALS AND METHODS

Subjects

Litters of Long Evans rats (Center d’Elevage Janvier) were received between postnatal days 3 (P3) and 7 (P7) and culled to six pups per dam. Only male offspring were used. Pups were housed with their mother in a temperature- and humidity-controlled room and maintained under a 12-h light–dark cycle (lights on at 0700 hours). The experiments took place during the light phase of the cycle. Rats weaned at P21 were housed in pairs. Two rats per litter were randomly assigned to each experimental group. Before the behavioral experiments, animals were given ad libitum access to food and water and were handled every day. All experiments were conducted in agreement with the French (Directive 87-148, Ministère de l’Agriculture et de la Pêche) and international (Directive 86-609, 24 November 1986, European Community) legislation.

Drug Exposure

Rats received intraperitoneal injections of quinpirole hydrochloride (QUIN, Sigma-Aldrich) at a dose of 0.1 mg/kg (QUIN 0.1 group, n=17) or 0.5 mg/kg (QUIN 0.5 group, n=15), or an equivalent volume of sterile water vehicle (control group, n=14), every 2 days from weaning (P21) and throughout adolescence until adult age (P70). Rats were returned to their home cages immediately after each injection and their weight was monitored throughout the duration of the treatment.

Behavioral Procedures

Instrumental learning

Throughout the experiments, rats were maintained at ≈90% of their original weight. Rats were trained as adults from P72 to P91 using a set of eight conditioning chambers (40 cm wide × 30 cm deep × 35 cm high, Imetronic, Pessac, France) located inside sound- and light-attenuating wooden chambers (Naneix et al, 2009, 2012). Initially, all rats were trained for 1 day to collect rewards in two different magazines (left: grain-based pellets 45 mg, F0165 Bio-Serv; right: 0.1 ml of 20% sucrose solution) during four 30-min magazine training sessions (two sessions for each reward, 1-h apart). Rewards were delivered on a random time 60-s schedule. During the next 6 days of instrumental training, all rats were trained to press levers to obtain these rewards during four 30-min instrumental training sessions each day (two sessions for each lever–reward association, in alternating order, 1-h apart). The cage was illuminated and the lever inserted for the whole session. The rats were first trained under continuous reinforcement (fixed ratio schedule, FR-1) for 1 day (i.e., each lever press was rewarded) until they earned 30 rewards for each outcome. Animals were then shifted to a random ratio 5 schedule for 1 day (RR-5, i.e., on average one reward every five lever presses), to a random ratio 10 schedule for 1 day (RR-10) and then to a random ratio 20 schedule for 2 days (RR-20).
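The reinforcement schedules described above can be illustrated with a short simulation. This is only a sketch: it assumes that a random ratio n schedule rewards each press independently with probability 1/n (consistent with the 0.05 per-press probability given for RR-20 in the next section), and the function name and seed are hypothetical, not part of the original procedure.

```python
import random


def simulate_random_ratio(n_presses, ratio, seed=0):
    """Count rewards earned over n_presses on a random ratio schedule.

    Assumption: each press is rewarded independently with probability
    1/ratio, so ratio=1 reproduces FR-1 (every press rewarded).
    """
    rng = random.Random(seed)
    p_reward = 1.0 / ratio
    return sum(1 for _ in range(n_presses) if rng.random() < p_reward)


# Walk through the training sequence FR-1, RR-5, RR-10, RR-20.
for ratio in (1, 5, 10, 20):
    earned = simulate_random_ratio(10_000, ratio)
    print(f"ratio {ratio}: {earned} rewards in 10000 presses")
```

As the ratio increases, the same number of presses yields proportionally fewer rewards, which is what drives the progressive increase in response requirement across training days.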

Contingency degradation

Two 20-min sessions were given each day, one for each lever, for 8 days. During these sessions, as in the RR-20 schedule, animals had access to one lever and the probability of obtaining the reward associated with this lever was 0.05 for each press. In addition, one reward was also delivered non-contingently, with a probability of 0.05 for each second without a lever press. During one of the two daily sessions, non-contingent rewards were identical to those available with the lever press, so that this action–outcome relationship was degraded (degraded condition). In the other session, non-contingent rewards were different from those associated with the instrumental response, so that the other action–outcome relationship was preserved (non-degraded condition). For half of the rats, the action–pellet contingency was degraded, and for the other half the action–sucrose contingency was degraded. After 8 days of contingency degradation, responses were tested in a 5-min choice test with both levers present and no reward delivered.
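The logic of the degradation procedure can be made explicit with the usual contingency index, ΔP = P(outcome | action) − P(outcome | no action). The sketch below is illustrative only: it treats one lever press and one second without a press as comparable sampling units, which is a simplifying assumption, and the function and variable names are hypothetical.

```python
def delta_p(p_outcome_given_action, p_outcome_given_no_action):
    """Contingency index: P(O | A) - P(O | no A)."""
    return p_outcome_given_action - p_outcome_given_no_action


P_PRESS = 0.05  # reward probability per lever press (from the text)
P_FREE = 0.05   # free-reward probability per second without a press (from the text)

# Degraded lever: the free reward is the same outcome the lever earns,
# so that outcome is equally likely with or without the action.
degraded = delta_p(P_PRESS, P_FREE)

# Non-degraded lever: the free reward is the other outcome, so the
# outcome earned by this lever never arrives without a press.
non_degraded = delta_p(P_PRESS, 0.0)

print(degraded, non_degraded)
```

Under these assumptions the index falls to zero for the degraded lever while remaining positive for the non-degraded lever, which is why a goal-directed animal is expected to reduce responding selectively on the degraded lever.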

Outcome devaluation by sensory specific satiety

Following the contingency degradation procedure, rats were retrained for 1 day under an RR-20 schedule (one session for each reward, 1-h apart). The day after, rats were given free access to one of the two rewards in their home cage for 60 min (devalued reward, half the rats receiving food pellets and half receiving sucrose solution). Immediately after the prefeeding treatment, rats were placed in their operant chambers for a 5-min choice test. During the test, the cage was illuminated and both levers were inserted but no reward was delivered. Following this test, rats were placed into individual consumption cages and allowed access to 10 g of each reward successively for 15 min (the order of the prefed and the non-prefed reward being counterbalanced across animals). Their overall consumption was measured to evaluate the effectiveness of the prefeeding procedure. The following day, animals were retrained under an RR-20 schedule for the two rewards. A second test was conducted the day after, identical to the first one, except that rats were prefed with the alternative reward.

Locomotor activity

Each rat was placed in an individual activity chamber (23 × 36 × 19 cm, Imetronic) for 1 h and the locomotor activity in the novel environment was recorded using two grids of photo beam sensors (3 × 37 × 3 cm) located 3 and 9 cm above the chamber floor.

Immunohistochemistry

Rats were killed with an overdose of pentobarbital sodium (Ceva Santé Animale) and perfused transcardially with 0.9% NaCl solution, followed by 4% PFA solution in 0.1 M PB. The brains were post-fixed overnight in 4% PFA and transferred to a 0.1 M PB/30% sucrose solution for 48 h at 4 °C. Serial coronal sections (50 μm thick) were cut on a freezing microtome (Leica SM 2400). Free-floating sections were incubated with primary mouse monoclonal antibody (anti-tyrosine hydroxylase 1/1000 or anti-dopamine-β-hydroxylase 1/10 000 in PBST 0.3% and normal goat serum 2%, Millipore) for 48 h at 4 °C. Sections were then incubated with biotinylated goat anti-mouse antibody (1/1000 in PBST 0.3%, Jackson ImmunoResearch) for 90 min at room temperature. They were then incubated with avidin–biotin–peroxidase complex (1/200 in PBS, Vector Laboratories) for 2 h at room temperature. The staining was revealed in a solution of diaminobenzidine (DAB 0.02%, Sigma-Aldrich) and hydrogen peroxide (0.07%).

Evaluation of Fiber Density

Labeled sections were scanned using a NanoZoomer (Hamamatsu Photonics) with a 20 × lens. Digital microphotographs of each region of interest (ROI) in each hemisphere were examined with a 5 × virtual lens. Each ROI was outlined according to the Paxinos and Watson atlas (Paxinos and Watson, 1998): ACC (anterior cingulate cortex), PLC (prelimbic cortex), ILC (infralimbic cortex), DMS (dorsomedial striatum), DLS (dorsolateral striatum), and NAc (nucleus accumbens). Quantification was performed using an automated method developed in the laboratory with ImageJ software (Naneix et al, 2009). Briefly, the digitized microphotograph was smoothed with a Gaussian filter (diameter 20 pixels) and subtracted from the original picture to isolate high spatial frequencies. The picture was then subjected to a fixed threshold to extract stained elements. The relative volume occupied by fibers was estimated by the proportion of detected pixels in the ROI and then normalized with respect to the vehicle group.
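The quantification steps just described (Gaussian smoothing, subtraction from the original, fixed threshold, proportion of detected pixels) can be sketched in a few lines of plain Python. This is not the laboratory's ImageJ routine: the function names, the sigma value standing in for the "diameter 20 pixels" filter, the threshold, and the convention that higher pixel values mean stronger staining are all assumptions made for illustration.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve each row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(k * row[min(max(x + j - radius, 0), width - 1)]
             for j, k in enumerate(kernel))
         for x in range(width)]
        for row in image
    ]


def transpose(image):
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))


def fiber_fraction(image, sigma, threshold):
    """Proportion of pixels whose high-pass value exceeds the threshold.

    Assumption: pixel values encode staining intensity (higher = darker
    staining), so fibers stand out above the smoothed background.
    """
    background = gaussian_blur(image, sigma)
    detected = sum(
        1
        for row, smooth_row in zip(image, background)
        for value, smooth in zip(row, smooth_row)
        if value - smooth > threshold
    )
    return detected / (len(image) * len(image[0]))


# Synthetic 20 x 20 ROI containing a single one-pixel-wide "fiber".
roi = [[0.0] * 20 for _ in range(20)]
roi[10] = [1.0] * 20
print(fiber_fraction(roi, sigma=3.0, threshold=0.5))
```

Subtracting the smoothed image removes slow variations in background staining, so the fixed threshold responds to thin, high-contrast structures rather than to overall section darkness. Normalizing to the vehicle group would then amount to dividing each animal's fraction by the mean fraction of the control group.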

Evaluation of DA Cells Density

Unbiased stereological estimates of TH immunoreactive cell density were obtained by the optical fractionator method on every twelfth section along the rostrocaudal axis of the midbrain. Ventral tegmental area (VTA) and substantia nigra (SN) pars compacta were outlined according to the Paxinos and Watson atlas under a 2.5 × lens using Mercator software (Explora Nova). TH immunoreactive cells were manually counted under × 50 magnification (immersion lens) in 100 × 100 μm squares spaced by 150 × 150 μm intervals for the SN or 300 × 300 μm intervals for the VTA. Neurons were counted within a 25 μm thick zone within the section if they did not contact any of the three rejection boundary planes. Cell density was estimated as the ratio of the number of TH cells in the ROI over the volume of the ROI measured by the Mercator software (Explora Nova, Bordeaux, France). All data were then normalized with respect to the vehicle group.
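For readers unfamiliar with the optical fractionator, the arithmetic behind such an estimate can be sketched as follows. The general formula (counted cells divided by the product of the section, area, and thickness sampling fractions) is standard, but every numerical value below is illustrative: in particular, reading the 150 μm interval as the grid step and using the 50 μm cutting thickness as the section thickness are assumptions, and the function names are hypothetical rather than part of the Mercator software.

```python
def optical_fractionator(counted_cells, section_interval, frame_area,
                         grid_step_area, disector_height, section_thickness):
    """Estimate the total cell number from a systematic sample.

    N = counted / (ssf * asf * tsf), where
      ssf = fraction of sections sampled (1 / section_interval),
      asf = counting-frame area / grid-step area,
      tsf = disector height / section thickness.
    """
    ssf = 1.0 / section_interval
    asf = frame_area / grid_step_area
    tsf = disector_height / section_thickness
    return counted_cells / (ssf * asf * tsf)


def cell_density(estimated_cells, roi_volume):
    """Cells per unit volume of the outlined region."""
    return estimated_cells / roi_volume


def normalize_to_vehicle(value, vehicle_mean):
    """Express a measurement relative to the vehicle-group mean."""
    return value / vehicle_mean


# Illustrative numbers only: 100 counted cells, every twelfth section,
# 100 x 100 um frames on an assumed 150 x 150 um grid, 25 um disector
# height in an assumed 50 um thick section.
total = optical_fractionator(100, 12, 100 * 100, 150 * 150, 25, 50)
print(round(total))
```

Because each sampling fraction is known by design, the estimate does not depend on assumptions about cell size or shape, which is the sense in which the method is described as unbiased.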

Neurotransmitters and Metabolites Tissue Levels

Tissue levels of neurotransmitters (DA, noradrenaline, and serotonin) and metabolites (DOPAC, HVA, and 5-HIAA) were quantified by high-performance liquid chromatography with electrochemical detection (HPLC-ED) as described in Parrot et al (2011). The results were expressed in ng or μg of neurotransmitters per g of brain tissue.

Quantitative Real-Time PCR

Brains were quickly removed and ROIs were dissected and snap-frozen until RNA extraction. Total RNA was isolated using an RNeasy Lipid Tissue Kit (Qiagen). RNA concentration and integrity were measured using a Nanodrop 1000 (Thermo Scientific) and an Experion RNA HighSens Analysis kit (Bio-Rad). cDNA was synthesized using 50 ng of RNA and an iScript cDNA Synthesis kit (Bio-Rad). A control reaction with RNA but without reverse transcriptase was performed to check for genomic DNA contamination. Expression levels of genes of interest were determined by quantitative real-time PCR (qPCR) on a CFX96 thermal cycler using iQ SYBR Green Supermix (Bio-Rad). The qPCR protocol comprised 5 min at 95 °C, 50 cycles of 20 s at 95 °C and 30 s at 60 °C, and finally a melting curve analysis. Expression of DA receptor subtypes was investigated using the following primers (PrimerDesign and Eurogentec): D1: 5′-ACCGAGGATGACAACTGTGA-3′/5′-TAGATACTGGTGTAGGTGACGAT-3′; D2L: 5′-AACCTGAAGACACCACTCAAGGAT-3′/5′-TGCTTGACAGCATCTCCATTTC-3′; D2S: 5′-CCCACCCTGAGGACATGAAA-3′/5′-CCGCCTGTTCACTGGGAA-3′; D4: 5′-TATGTCAACAGTGCCCTCAAC-3′/5′-AGACATCAGCGGTTCTTTCAG-3′; D5: 5′-GGGAGAGGAGGAGGAGGAG-3′/5′-GGGGTGAGAGGTGAGATTTTG-3′. Assays were performed in duplicate and a standard curve from consecutive twofold dilutions of a cDNA pool for the brain areas of interest was included for each quantification. For each sample, the quantification cycle value (Cq) was normalized to two control reference genes (B2m and Ubc) and the relative expression fold change was calculated using the 2^(−ΔΔCq) method with the vehicle group as control group.
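The fold-change calculation can be written out as a minimal sketch. It assumes that the two reference genes are combined by averaging their Cq values and that amplification efficiency is 100% (a doubling per cycle); the text does not state either detail, and the function names and Cq values are hypothetical.

```python
from statistics import mean


def delta_cq(cq_target, cq_references):
    """Target Cq minus the mean Cq of the reference genes (assumed averaging)."""
    return cq_target - mean(cq_references)


def fold_change(cq_target, cq_references, control_delta_cq):
    """Relative expression by the 2^(-ddCq) method.

    control_delta_cq is the mean delta-Cq of the vehicle group.
    Assumes perfect doubling per cycle (efficiency of 100%).
    """
    ddcq = delta_cq(cq_target, cq_references) - control_delta_cq
    return 2.0 ** (-ddcq)


# Hypothetical Cq values: target gene, then the two reference genes.
vehicle_dcq = delta_cq(25.0, [20.0, 22.0])          # 4.0
print(fold_change(26.0, [20.0, 22.0], vehicle_dcq))  # 0.5
```

In this example the treated sample crosses threshold one cycle later than the vehicle sample after normalization, which corresponds to half the expression level.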

Statistics

Statistical analyses were performed using Student’s t-test or analyses of variance (ANOVA) followed, when required, by Dunnett’s post hoc test. Behavioral data were compared using repeated-measures ANOVA with session and condition as within-subject factors, and treatment as between-subject factor. Analyses were performed using StatView software. The alpha risk for rejection of the null hypothesis was fixed at 0.05.

RESULTS

Quinpirole-treated rats presented a normal body growth from weaning to adulthood (data not shown; F(2, 43)=1.7, NS) and normal body weight at the beginning of experiments (control: 270±20 g; QUIN 0.1: 285±6 g; QUIN 0.5: 258±4 g; F(2, 43)=1.4, NS). During quinpirole treatment, they did not present any gross behavioral abnormalities. Furthermore, when tested as adults, rats showed normal locomotor behavior in the activity chambers (control: 1917±235; QUIN 0.1: 1620±145; QUIN 0.5: 1606±487 beam breaks; F(2, 15)=0.3, NS).

Neurobiological Consequences of the Quinpirole Treatment

DA innervation

We first quantified the density of TH immunoreactive fibers in QUIN-treated groups and vehicle-treated rats either after the end of the pharmacological treatment or after the completion of behavioral investigations. As no significant difference was observed between these two conditions, rats were pooled to provide three groups: control, n=9; QUIN 0.1, n=12; QUIN 0.5, n=10.

Quinpirole exposure during adolescence dose-dependently affected the density of TH-immunoreactive fibers in all subareas of the medial prefrontal cortex (mPFC) including ACC, PLC, and ILC (Figure 1a). Quinpirole effects were especially observed in the anterior part of the mPFC (treatment × anteroposterior level interaction: ACC: F(1, 28)=6.2, P<0.01; PLC: F(2, 28)=4.1, P<0.05; ILC: F(2, 28)=7.5, P<0.01). Separate analyses revealed a significant effect of treatment only in the anterior part of the mPFC (ACC: F(2, 28)=3.5, P<0.05; PLC: F(2, 28)=3.6, P<0.05; ILC: F(2, 28)=6.2, P<0.01) and post hoc Dunnett’s test revealed that the decrease in TH fibers density was specifically observed in the QUIN 0.5 group (all P<0.05).

Figure 1
(a) Density of tyrosine hydroxylase (TH) immunostained fibers in the anterior (Ant.) and posterior part (Post.) of the mPFC (including ACC, PLC, and ILC) in vehicle- (control, white bars; n=9) or quinpirole-treated rats (QUIN 0.1, gray bars; n=12/QUIN 0.5, black bars; n=10). Right microphotographs show a representative TH immunostaining in superficial and deep layers (I–VI) of the PLC for control and QUIN 0.5 rats (scale bar=200 μm). Insets represent higher magnification of delineated areas. (b) Density of dopamine-β-hydroxylase (DBH) immunostained fibers in the anterior and posterior part of the mPFC. (c) Density of TH immunostained fibers in the dorsal striatum (DMS and DLS) and the nucleus accumbens (NAc). (d) Density of DA cells in the ventral tegmental area (VTA) and substantia nigra (SN). Right microphotographs show a representative view of TH immunostaining in midbrain DA areas. All data are expressed in the same arbitrary units (mean±SEM). *P<0.05 vs control group (one-way ANOVA followed by Dunnett’s post hoc test).

To evaluate the possible effect of quinpirole exposure on the noradrenergic TH-expressing fibers, we also performed a DA β-hydroxylase (DBH) immunostaining (Figure 1b). Unlike DA innervation, noradrenergic fibers density in the mPFC was unaffected by quinpirole treatment. A two-way ANOVA revealed no significant effects of treatment (all Fs<0.8, NS) and only a weak trend towards a treatment × anteroposterior level interaction (ACC: F(2, 28)=2.5, P=0.09; PLC: F(2, 28)=1.6, P=0.2; ILC: F(2, 28)=2.5, P=0.1).

Unlike in prefrontal areas, TH fibers density was mostly unaffected in striatal areas including DMS, DLS and NAc (all Fs<2.0, NS; Figure 1c). Moreover, quinpirole chronic exposure did not induce neuronal loss of DA cells in the VTA and SN (all Fs<1.1, NS, Figure 1d).

DA tissue content and metabolism

In a separate batch of rats, we next investigated the concentration of DA and its metabolites (DOPAC and HVA) in prefrontal and striatal areas using HPLC-ED (n=5 per group). In accordance with the decrease in TH fibers density observed in the mPFC, the chronic exposure to quinpirole dose-dependently reduced DA levels in adult animals (F(2, 12)=6.2, P<0.05; Figure 2a). Once again, the effect was mostly observed with the highest dose of quinpirole (Dunnett’s test, P<0.01). Moreover, DA tissue levels also decreased in dorsal (F(2, 12)=3.4, P=0.06) and ventral (F(2, 12)=4.8, P<0.05) striatal areas for the QUIN 0.5 group (Dunnett’s test, P<0.05). By contrast, HVA/DA and DOPAC/DA ratios in prefrontal and striatal regions were unaffected by quinpirole treatment (all Fs<1.3, NS; Figure 2b). Chronic exposure to quinpirole thus appears to affect DA synthesis but not metabolism. Furthermore, our data show that quinpirole effects were specific to the DA system and altered neither noradrenergic nor serotoninergic contents in prefrontal and striatal regions (all Fs<0.7, NS; Table 1).

Figure 2
HPLC quantification of DA content (a) and DA metabolites/DA ratios (b; HVA/DA, top; DOPAC/DA, bottom) in the mPFC, dorsal striatum and nucleus accumbens for vehicle- (control, white bars) or quinpirole-treated rats (QUIN 0.1, gray bars/ QUIN 0.5, black bars). All data are expressed as ng or μg of neurotransmitters per g of brain tissue (mean±SEM). n=5 per group. *P<0.05, **P<0.01 vs control group (one-way ANOVA followed by Dunnett’s post hoc test).

Table 1 Noradrenaline (NE), Serotonin (5-HT) and its Metabolite, 5-Hydroxyindole-3-Acetic Acid (5-HIAA), Content in mPFC, Dorsal Striatum, and Nucleus Accumbens of Vehicle- (Control) and Quinpirole-Treated (QUIN) Rats

DA receptors expression

Finally, we investigated the putative effects of quinpirole exposure during adolescence on the expression of D1-like (D1 and D5) and D2-like (D2L (long) and D2S (short) isoforms, and D4) postsynaptic receptors in DA projection areas, by means of quantitative real-time PCR (n=5 per group). Consistent with the previous results, only the highest dose of quinpirole induced significant changes in the expression of DA receptors. QUIN 0.5 rats presented a lower expression of D1 and D2 receptors in the mPFC (all Fs>4.2, P<0.05; Dunnett’s post hoc test, P<0.05; Figure 3a). The same treatment only marginally affected the expression of DA receptors in the dorsal striatum (all Fs<0.9, NS; Figure 3b) and NAc (all Fs<0.7, NS; Figure 3c).

Figure 3
Mean fold change of expression levels of DA receptors mRNA (D1/D5 and D2L (long)/D2S (short)/D4) quantified by qPCR in mPFC (a), dorsal striatum (b) and nucleus accumbens (c) for vehicle- (control, white bars) or quinpirole-treated rats (QUIN 0.1, gray bars/ QUIN 0.5, black bars). All data are expressed as mean±SEM. n=5 per group. *P<0.05 vs control group (one-way ANOVA followed by Dunnett’s post hoc test).

Behavioral Consequences of the Quinpirole Treatment

Instrumental learning

Behavioral investigations were conducted in drug-free conditions. Both control (n=5) and quinpirole-treated groups (QUIN 0.1, n=7; QUIN 0.5, n=6) gradually increased their lever pressing rates to obtain food reward during training (Figure 4a). No significant difference was observed between the three groups in lever pressing rates (treatment: F(2, 15)=0.4; NS) or in learning rates (session: F(9, 135)=72.9; P<0.001; session × treatment interaction: F(18, 135)=0.7; NS). The level of instrumental response differed for the two rewards (F(1, 15)=150.7; P<0.001). However, no significant interaction was observed between this factor and the others during instrumental learning and subsequent behavioral procedures. The data corresponding to the two rewards were therefore pooled.

Figure 4
(a) Acquisition of lever pressing response during the 10 sessions of instrumental learning for control (diamonds, n=5), QUIN 0.1 (circles, n=7), and QUIN 0.5 group (squares, n=6). (b) Evolution of instrumental responses during the eight sessions of contingency degradation when the contingency was maintained (white symbols) or degraded (black symbols). (c) Lever pressing rate during the 5-min choice extinction test between levers associated with the previously intact (white) or degraded (black) contingency. Data are expressed as mean±SEM relative to the lever press rate observed during the last session of instrumental training. *P<0.05, NS, nonsignificant (two-way ANOVA).

Contingency degradation

After training, we tested whether instrumental performance was controlled by action consequences. One action–outcome contingency was degraded by letting the animals obtain the reward without performing the action. Meanwhile, the other lever–reward contingency was maintained. Both control and QUIN 0.1 groups reduced their response rates specifically for the degraded contingency. By contrast, QUIN 0.5 group failed to adapt its actions to the changes in action–outcome relationships and presented similar response rates independently of the contingency condition (Figure 4b). Statistical analyses revealed an effect of treatment (F(2, 15)=3.7; P<0.05) and contingency (F(1, 15)=27.6; P<0.001) and an absence of significant interaction between these factors (F(2, 15)=1.8; NS). A further analysis of the degraded condition showed a significant effect of treatment (F(2, 15)=4.6; P<0.05), with the QUIN 0.5 group differing from the control group (Dunnett’s test P<0.05). The same analysis on the non-degraded condition showed no significant effect of treatment (F<1).

Deficits in response sensitivity to contingency changes were also observed during the subsequent choice extinction test (Figure 4c). In contrast to control and QUIN 0.1 groups, the QUIN 0.5 group presented the same levels of response for the degraded and non-degraded contingencies, confirming the inability of QUIN 0.5-treated rats to adapt their behavior to new action rules. Repeated-measures ANOVA revealed an effect of contingency (F(1, 15)=7.1; P<0.05) and a treatment × contingency interaction (F(2, 15)=3.9; P<0.05). Separate analyses demonstrated an effect of contingency in control (F(1, 4)=18.1; P<0.05) and QUIN 0.1 (F(1, 6)=12.4; P<0.05) groups but not in the QUIN 0.5 group (F(1, 5)=0.5; NS).

Outcome devaluation

We then tested the sensitivity of instrumental responses to the current outcome value using sensory-specific satiety (Figure 5a). In contrast with the contingency degradation, all groups specifically reduced their response rates on the lever associated with the devalued outcome (Figure 5b). Statistical analyses revealed a significant effect of devaluation (F(1, 15)=70.3; P<0.001), but no effect of treatment (F(2, 15)=0.7; NS) nor treatment × condition interaction (F(2, 15)=1.9; NS). Moreover, the three groups also correctly consumed higher quantities of non-devalued reward during the subsequent consumption test (Figure 5c) (devaluation: F(1, 15)=47.3; P<0.001; treatment: F(2, 15)=0.9; NS; treatment × condition interaction: F(2, 15)=0.2; NS).

Figure 5
(a) After the learning phase (left), rats had free access to one food reward to induce sensory specific satiety (middle). Immediately after, they performed a test in extinction to assess the impact of the change in outcome value on action selection (right). (b) Lever pressing rate during the 5-min choice extinction test on levers associated with the devalued (black) or non-devalued (white) reward in control (n=5), QUIN 0.1 (n=7), and QUIN 0.5 rats (n=6). (c) Consumption levels of devalued and non-devalued reward during consumption tests for the three groups. Data are expressed as mean±SEM relative to the lever press rate observed during the last session of instrumental training. *P<0.05, **P<0.001 (two-way ANOVA).

DISCUSSION

This study reveals that chronic stimulation of D2 receptors during adolescence leads to profound neurobiological and cognitive alterations in adulthood. Indeed, animals treated with the highest dose of quinpirole present a selective decrease in response sensitivity to changes in action–outcome relationships. This cognitive deficit is associated with a specific alteration of the mesocortical DA pathway.

Our results show that quinpirole treatment during a period encompassing adolescence has a massive deleterious impact on the developing DA system, decreasing DA fibers density, DA tissue concentration, and DA receptors expression. Several aspects of these results deserve comments. First, the pharmacological treatment selectively impacts the DA system since we do not observe any change in the other monoamine systems. Second, quinpirole effects on DA systems do not result from a mere neurotoxic effect as the overall density of DA cells in the VTA and SN is unaffected. Third, and most interestingly, quinpirole exposure has a more prominent effect on the mesocortical DA pathway as compared with mesostriatal DA systems. Development of the mesocortical DA system is delayed during adolescence (Andersen et al, 2000; Kalsbeek et al, 1988; Naneix et al, 2012; O’Donnell, 2010). Thus, the present results strongly suggest that the effects of quinpirole reported here directly result from its impact on DA fibers dynamics during adolescent development.

The mechanisms underlying these alterations remain unclear. Systemically administered quinpirole may act on D2 receptors that are present at both presynaptic and postsynaptic sites. However, in vitro studies have previously demonstrated that chronic activation of D2 autoreceptors by quinpirole reduces the growth of DA fibers and the formation of DA synapses (Fasano et al, 2008), suggesting that the decrease in DA fibers density reported here might involve a presynaptic effect. D2 autoreceptors are also known to regulate the activity of TH, the rate-limiting enzyme of DA synthesis (Usiello et al, 2000). However, in vitro studies indicate that chronic D2 stimulation does not affect the expression and the activity of TH (Fasano et al, 2008, 2010). Alternatively, the decrease in DA concentration might be a consequence of the reduced DA fibers density in projection areas, as is observed after chronic cocaine administration (Lee et al, 2011). Furthermore, the alteration of DA innervation might also lead to a decrease in DA receptors expression associated with a reduction of synapses density (Parish et al, 2001). Nevertheless, a postsynaptic effect of quinpirole cannot be excluded, particularly concerning the expression of DA receptors. Indeed, prolonged stimulation of DA receptors is known to induce receptor desensitization (Fasano et al, 2010), internalization (Dumartin et al, 1998), or expression decrease (Briand et al, 2008).

A broad number of studies have demonstrated that the control of goal-directed instrumental response depends on complex interactions between prefrontal (Corbit and Balleine, 2003; Killcross and Coutureau, 2003; Tran-Tu-Yen et al, 2009; Valentin et al, 2007) and striatal areas (Tanaka et al, 2008; Yin and Knowlton, 2006). Moreover, separate DA pathways appear to be differentially involved in specific cognitive processes underlying goal-directed actions (Faure et al, 2005; Hitchcott et al, 2007; Lex and Hauber, 2010c; Naneix et al, 2009). Quinpirole-treated rats were not impaired in learning the instrumental task. This behavioral result agrees with reports that prefrontal areas are not required for initial learning and performance of instrumental responses (Coutureau et al, 2012; Gourley et al, 2009; Killcross and Coutureau, 2003; Naneix et al, 2009; Ostlund and Balleine, 2005; Tran-Tu-Yen et al, 2009, but see Corbit and Balleine, 2003; Lex and Hauber, 2010b). The task was acquired although DA content was somewhat reduced in the mesolimbic DA pathway, which is known to be involved in the learning of motivated responses (Salamone and Correa, 2002). Furthermore, the DA system does not seem required for the adaptation to changes in action outcome values (Lex and Hauber, 2010b, 2010c; Naneix et al, 2009). Accordingly, there was no difference here in action sensitivity to outcome devaluation between vehicle- and quinpirole-treated rats.

By contrast, rats treated with the highest dose of quinpirole were impaired during contingency degradation. This effect cannot result from impairments in learning the initial action–outcome association. Indeed, both control and QUIN-treated rats were able to selectively diminish their response associated with the sated reward during the outcome devaluation procedure. Alternatively, increased magazine approaches because of free reward delivery might compete with lever press responses. However, the same insensitivity to contingency changes was also observed during the subsequent extinction choice test, demonstrating that the impairment of response adaptation results from a deficit in learning the new action–outcome association rather than from response competition.

Our previous studies showed that action adaptation to action–outcome changes requires the mesocortical DA pathway in adult animals (Naneix et al, 2009). Moreover, we have recently shown that this cognitive process appears between adolescence and adulthood and parallels the maturation of the DA cortical system (Naneix et al, 2012). Consistent with these studies, we demonstrate here a similar cognitive deficit in adults when DA development during adolescence is altered.

There is evidence that the detection of contingency changes also depends on DA activity in the dorsal striatum (Lex and Hauber, 2010b; Yin and Knowlton, 2006). Since we observed in this study a decrease in DA content and DA receptors expression in dorsal striatal areas, we cannot exclude a potential involvement of these alterations in the behavioral deficit. Interestingly, both prefrontal cortex and striatum receive important projections from the hippocampal formation, and a similar behavioral pattern is observed after lesions of the hippocampal formation (Corbit et al, 2002; Lex and Hauber, 2010a). This has led some authors to suggest that deficits in adaptation to contingency degradation may result from an alteration of context conditioning, more precisely of the subject’s ability to encode context–reward associations (Corbit and Balleine, 2000; Corbit et al, 2002; Lex and Hauber, 2010a). It could therefore be hypothesized that DA signaling in the prefrontal cortex and striatum acts as a gating process for the integration of contextual information (Goto and Grace, 2008).

Action adaptation during contingency degradation requires updating previous action–outcome relationships to establish a new strategy. Prefrontal areas and the mesocortical DA pathway have a central role in working memory and behavioral flexibility (Floresco and Magyar, 2006). Moreover, DA release in the mPFC appears to be strongly correlated to the rate of behavioral adaptation during a rule shifting task (Stefani and Moghaddam, 2006). The unexpected rewards that occur during contingency changes should elicit a reward prediction error signal in the form of phasic DA activity (Montague et al, 2004; Schultz, 2000). Furthermore, DA actions on D1 and D2 receptors differentially affect information processing by modulating the activity in local neuronal networks (Durstewitz and Seamans, 2008; Goto and Grace, 2008). Hence, the decrease in DA receptors expression could affect the ability of DA to modulate plateau depolarization via D1 receptors or D2-mediated recruitment of GABAergic interneurons (Seamans and Yang, 2004). Altering the mesocortical DA pathway by quinpirole might therefore prevent the detection of contingency changes by reducing DA modulation of prefrontal networks.

The DA system is central to reward-based learning and is the main target of most natural and pharmacological rewards. While most studies investigating drug effects focus on adult subjects, the first exposure and the initiation of drug consumption often occur during adolescence, which represents an important period of vulnerability (Adriani and Laviola, 2004; Andersen, 2003; Crews et al, 2007). Recent studies on adolescent subjects have shown that long-term stimulation of the reward system leads to important neurobiological changes, especially in prefrontal areas (Black et al, 2006; Counotte et al, 2011), and to altered responses to natural rewards and/or drugs at the adult stage (Adriani and Laviola, 2004; Catlow and Kirstein, 2007; Vendruscolo et al, 2010). Drugs of abuse, by increasing DA functioning, may stimulate a variety of DA receptors. Our results, by focusing on D2 receptors, suggest a possible mechanism for the long-term impact of drugs of abuse. Moreover, the developing mesocortical DA system appears especially sensitive to prolonged stimulation, leading to important deficits in decision-making abilities. The current findings therefore contribute to our understanding of several decision-making disorders developing during adolescence (Paus et al, 2008) and involving alterations of the DA system (Hyman et al, 2006).