INTRODUCTION

Abnormal reactions to positive or negative events are associated with pathological states such as anxiety and depression (Beats et al, 1996; Elliott et al, 1997, 2000; Graeff et al, 1996; Murphy et al, 2003). Such behavioral aberrations, for example, are common in depressive states in which patients’ reactions to negative feedback are often exaggerated, whereas positive events fail to adequately reinforce behavior (Beats et al, 1996; Elliott et al, 1996; Henriques et al, 1994; Hughes et al, 1985; Murphy et al, 2003; Robbins et al, 1996; Steffens et al, 2001). The abnormal response to negative feedback shown by depressed patients can be seen as part of a larger motivational deficit and is considered to be an important link between emotional and cognitive disturbances in depression (Elliott et al, 1997).

Serotonin (5-hydroxytryptamine, 5-HT) has long been implicated in the pathophysiology of mood and anxiety disorders, and in the processing of affective stimuli (Deakin, 1991; Graeff et al, 1986). More specifically, central 5-HT has long been thought to have a critical role in the adaptation of the animals to aversive events (Deakin and Graeff, 1991) and, recently, to mediate a negative prediction error signal for future threat and punishment (Cools et al, 2008a; Daw et al, 2002). Accordingly, the majority of studies on cognitive and affective processing in depressed patients highlight strong biases toward negative stimuli and away from positive ones, which interfere with normal cognitive functioning (Elliott et al, 1996, 1997; Gotlib et al, 2004; Murphy et al, 2003; Roiser et al, 2009; Siegle et al, 2001), whereas long-term treatment with selective serotonin reuptake inhibitors (SSRIs) counteracts these biases. Moreover, lowering brain content of 5-HT by acute tryptophan depletion (ATD) in healthy subjects causes similar biases in neuropsychological tasks assessing affective decision-making and punishment prediction (Cools et al, 2008b; Murphy et al, 2002; Rogers et al, 2003; Roiser et al, 2006). In addition, in rats and other animals, 5-HT modulates the processing of aversive stimuli (Burghardt et al, 2004; Dayan and Huys, 2009; Inoue et al, 2004; Stutzmann and LeDoux, 1999; Wilkinson et al, 1995).

An effective way of measuring an individual's reaction to positive and negative feedback is to assess lose-shift and win-stay behavior in a two-choice probabilistic decision-making task (Paulus et al, 2002, 2003). In this study, we used an operant version of the probabilistic reversal learning task (PRL; Lawrence et al, 1999; Swainson et al, 2000) adapted for the first time for use in rats. The PRL task is widely adopted in clinical settings and has been proven highly sensitive in detecting cognitive deficits in a wide array of pathological states such as schizophrenia (Waltz and Gold, 2007), Parkinson's disease (Cools et al, 2001; Peterson et al, 2009; Swainson et al, 2000), and bipolar (Roiser et al, 2009) and depressive (Murphy et al, 2003; Taylor Tavares et al, 2008) disorders.

With this paradigm it is possible to estimate the conditional probability of switching a response on receiving spurious negative feedback (ie, lack of reward) and that of maintaining a ‘response set’ that is rewarded on a high proportion of trials. The probabilistic nature of this task enhances the occurrence of error processing, thus making it particularly suitable for detecting changes in reward and negative feedback sensitivity (NFS).

Manipulating brain 5-HT levels by acute administration of the SSRI citalopram has been shown to cause deficits in acquiring a PRL task and increased NFS in healthy volunteers (Chamberlain et al, 2006). An enhanced impact of failure on subsequent performance can also be observed in depressed patients during the execution of the PRL task (Murphy et al, 2003). However, these results are in apparent contrast to the widespread use of SSRIs in depressed patients and to recent findings on the beneficial effect of the potent SSRI escitalopram in a PRL paradigm in rats (Brown et al, 2008). Thus, it is not clear whether SSRI administration improves or impairs behavioral flexibility, or whether depressed patients who are oversensitive to negative feedback, and hyposensitive to positive feedback (Henriques et al, 1994), would also benefit from SSRI treatment for this aspect of the pathology.

To better understand the role of 5-HT in cognitive processes affected by performance feedback, we tested the effects of various manipulations of the 5-HT system on the sensitivity of rats to negative and positive feedback with the PRL task. We administered three doses of the SSRI citalopram in experiment 1 and tested the effects of repeated and subchronic administration of the same drug in experiment 2. Then, we challenged the same animals with a low dose of citalopram, which is known to further increase 5-HT levels in prefrontal areas of subchronically treated animals (Invernizzi et al, 1994; Muraki et al, 2008). In experiment 3 we assessed animals’ performance after global depletion of brain 5-HT. We anticipated that lowering brain 5-HT content would lead to increased lose-shift behavior, according to published literature and recent theorizing (Cools et al, 2008a). Boosting 5-HT neurotransmission, instead, would decrease the impact of negative feedback on subsequent performance, thereby increasing the number of reversals completed by the animals.

MATERIALS AND METHODS

Subjects

Animals used in this study were male adult Lister hooded rats (Charles River, UK). All subjects were housed in groups of four under a reversed 12 : 12 h light–dark cycle (white lights off at 0730 hours), and were tested during the dark phase of this cycle. In all experiments rats were mildly food restricted to approximately 85% of their free feeding weights. This was achieved by providing 15–20 g of food per day (Purina laboratory chow). Food restriction started at least 1 week before the beginning of training. Water was freely available except during test sessions. All experiments were conducted in accordance with the United Kingdom Animals (Scientific Procedures) Act, 1986 (PPL 80/2234).

Training Procedure

Rats were trained in 12 five-hole operant chambers (Med Associates, Georgia, VT) described in detail elsewhere (Bari et al, 2008) and controlled by the Whisker control system (Cardinal and Aitken, 2001). The experimental procedures were implemented using custom software written in Visual Basic by ACM. After being habituated to the test environment with free reward pellets in the food magazine, rats were presented with only one hole illuminated per trial, randomly assigned to a hole adjacent to (immediately left or right of) the central aperture. A nose poke in the illuminated hole resulted in the delivery of a single food pellet (Noyes dustless pellets; Sandown Scientific, Middlesex, UK) in the food magazine situated on the opposite wall of the chamber. Retrieval of the pellet from the magazine immediately initiated the next trial. A nose poke in any nonilluminated hole resulted in a 5 s time-out. Performance was considered to have stabilized when accuracy levels were at least 90% over two consecutive sessions. Animals that showed a side bias were excluded from the study. Side bias was defined as choice behavior greater than 55% on one of the two holes (ie, more than 5% of errors made on the same hole).

When subjects reached stable levels of accuracy they were presented with two simultaneously illuminated apertures on either side of the central aperture. Reinforcement contingencies were modified so that one hole was the ‘correct’ location giving food pellets on 80% of trials, and the other was the ‘incorrect’ location that resulted in reinforcement on 20% of trials when chosen. On nonrewarded trials the animals received a 2.5 s time-out. The light stimuli were presented for 30 s and, if there was no response within this time period, the trial was deemed an omission, which triggered a 5 s time-out. A response made in one of the unlit holes was without any programmed consequence. If a subject chose the ‘correct’ hole on eight consecutive occasions, the contingencies were reversed such that the ‘correct’ hole became the ‘incorrect’ hole and the previously ‘incorrect’ hole became the ‘correct’ hole. Animals were given one session per day, each consisting of 200 trials to be completed within 40 min.
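The contingencies and reversal rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Visual Basic/Whisker software used in the study: the names `run_prl_session`, `choose`, and `win_stay_lose_shift`, as well as the fixed random seed, are hypothetical, and omissions, time-outs, and the 40 min session limit are not modeled.

```python
import random


def run_prl_session(choose, n_trials=200, p_correct=0.8, p_incorrect=0.2,
                    criterion=8, seed=0):
    """Simulate one session of a two-choice probabilistic reversal task.

    `choose` is a hypothetical policy: it receives the list of
    (choice, rewarded) pairs seen so far and returns 0 or 1 (left or right).
    """
    rng = random.Random(seed)
    correct_hole = 0      # hole currently rewarded on 80% of trials
    run_length = 0        # consecutive choices of the 'correct' hole
    reversals = 0
    observed = []         # (choice, rewarded): what the subject can see
    trials = []           # (choice, rewarded, chose_correct): full record
    for _ in range(n_trials):
        choice = choose(observed)
        chose_correct = choice == correct_hole
        rewarded = rng.random() < (p_correct if chose_correct else p_incorrect)
        observed.append((choice, rewarded))
        trials.append((choice, rewarded, chose_correct))
        run_length = run_length + 1 if chose_correct else 0
        if run_length == criterion:
            # Eight consecutive 'correct' choices: swap the contingencies.
            correct_hole = 1 - correct_hole
            run_length = 0
            reversals += 1
    return trials, reversals


def win_stay_lose_shift(observed):
    """Hypothetical policy: repeat a rewarded choice, switch after a loss."""
    if not observed:
        return 0
    last_choice, last_rewarded = observed[-1]
    return last_choice if last_rewarded else 1 - last_choice


trials, reversals = run_prl_session(win_stay_lose_shift)
print(len(trials), reversals)
```

Note that the reversal criterion counts choices of the 'correct' hole whether or not they were rewarded, as in the description above.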

Experiment 1: Acute Citalopram Administration

Twelve Lister hooded rats were initially trained as described above. One rat was excluded for side bias. After five drug-free sessions, performance was stable and drug testing started. Animals were divided into four groups matched for performance accuracy and injected intraperitoneally (i.p.) with vehicle or citalopram hydrobromide (1, 5, or 10 mg/kg; Tocris, Bristol, UK) according to a fully randomized Latin square design. Citalopram was dissolved in 0.01 M phosphate-buffered saline (PBS) and administered 30 min before the test session.
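The text does not specify how the Latin square was constructed. One common construction, shown here purely as an assumption, starts from a cyclic square and randomly permutes its rows (groups) and columns (test days); the function name `latin_square` and the dose labels are illustrative.

```python
import random


def latin_square(treatments, seed=None):
    """Return a randomized Latin square: rows are groups, columns are test days."""
    n = len(treatments)
    # Cyclic square: row i is the treatment list rotated by i positions.
    square = [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    rng.shuffle(square)              # permute rows (groups)
    cols = list(range(n))
    rng.shuffle(cols)                # permute columns (test days)
    return [[row[c] for c in cols] for row in square]


doses = ["Veh", "Cit 1", "Cit 5", "Cit 10"]
for group, order in enumerate(latin_square(doses, seed=1), start=1):
    print(f"Group {group}: {order}")
```

Each group receives every treatment once, and every treatment appears once on each test day, so dose and day are balanced across the four groups.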

Experiment 2: Repeated and Subchronic Citalopram Administration

Eighteen rats were trained on the PRL task. After excluding side-biased animals 14 rats were divided into two groups matched for baseline performance. The citalopram group (‘Cit’) received 5 mg/kg of citalopram 30 min before testing for 7 consecutive days, whereas the vehicle group (‘Veh’) received the same number of daily injections of 0.01 M PBS. This dose of citalopram was chosen because it did not influence performance when administered acutely in experiment 1.

After 7 days the Cit group received 10 mg/kg of citalopram twice a day (b.i.d., at least 8 h apart and 4 h before the test sessions) for 5 consecutive days. We reasoned that this regimen of drug administration would allow a separation of the effects caused by repeated citalopram administration shortly before testing from long-lasting effects of subchronic dosing. The Veh group received the same treatment regimen, but of PBS injections. On the thirteenth day, all animals received an injection of 3 mg/kg of citalopram 30 min before the start of the test session.

Experiment 3: Forebrain 5-HT Depletion

Eighteen rats went through the initial training and baseline sessions as in experiment 1 and none of them showed side bias. They were divided into two groups matched for baseline performance, pretreated with 20 mg/kg of desipramine HCl (i.p.; Sigma, Poole, UK) to protect noradrenergic neurons and, after 30 min, anesthetized with 100 ml/kg Avertin (10 g of 2,2,2-tribromoethanol (Sigma) in 5 g of tertiary amyl alcohol, diluted in a solution of 40 ml ethanol and 450 ml PBS). They were subsequently secured in a stereotaxic frame with the incisor bar set at −3.3 mm relative to the interaural line. Two holes were drilled in the skull and bilateral intracerebroventricular (i.c.v.) infusions were carried out through stainless steel cannulae at the following stereotaxic coordinates: AP −0.9 mm from bregma, L±1.5 mm from the midline, and DV −3.5 mm from dura. Nine rats received bilateral infusions of 80 μg 5,7-dihydroxytryptamine (5,7-DHT) creatinine sulfate diluted in 10 μl 10% ascorbic acid in saline and control rats (n=9) received 10 μl of vehicle. After each infusion injectors were left in place for 2 min before withdrawal.

Rats were left undisturbed for 10 days allowing complete recovery from surgery and degeneration of 5-HT neurons (Bjorklund et al, 1975) before being re-trained. At the end of the experiments rats were killed by CO2 asphyxiation and their brains were removed quickly and rapidly frozen on dry ice. Bilateral aliquots of tissue (0.4–2 mg) were subsequently obtained from coronal sections (150 μm thickness) using a stainless steel micropunch (0.75 mm diameter) from the following brain regions: prelimbic cortex (PrL), anterior cingulate cortex (ACC), orbitofrontal cortex (OFC), nucleus accumbens (NAC), dorsomedial striatum (DMS), dorsolateral striatum (DLS), amygdala (Amg), and hippocampus (HPC).

Aliquots were weighed and then homogenized in 100 μl 0.2 M perchloric acid. Samples were centrifuged at 6000 r.p.m. for 20 min at 4°C and the supernatant was analyzed for DA, dihydroxyphenylacetic acid (DOPAC), norepinephrine (NE), 5-HT, and 5-hydroxyindoleacetic acid (5-HIAA) content by reversed-phase, high-performance liquid chromatography, as described previously (Matthews et al, 2001).

Main Measures and Statistical Analysis

Data analysis

Data were compiled in a relational database (Microsoft Access 2002) and analyzed using SPSS 12.0.1 (SPSS, Chicago, IL, USA). Win-stay/lose-shift behavior, reversals completed, latency data (to respond to the stimuli and to collect food pellets), and total number of ‘incorrect’ choices and pellets earned were subjected to repeated-measures analyses of variance (ANOVAs), with ‘dose’ (four levels: Veh, Cit 1, Cit 5, and Cit 10) as within-subject factor in experiment 1, and with ‘group’ (two levels: sham and lesion or Cit and Veh) as between-subject factor and ‘session’ as within-subject factor in experiments 2 and 3. Neurochemical data were analyzed by independent samples t-test (two tailed). All tests of significance were performed at α=0.05. Homogeneity of variance was verified using Levene's test. For repeated-measures analyses, Mauchly's test of sphericity was applied and the degrees of freedom were corrected to more conservative values using the Huynh–Feldt ɛ whenever the sphericity assumption was violated. Corrected degrees of freedom are presented rounded to the nearest integer. For pair-wise comparisons, p-values were adjusted using Sidak's correction factor for multiple comparisons (Howell, 1997).
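For reference, the Sidak correction has a simple closed form. The sketch below is not the SPSS implementation; the function names are hypothetical, and the choice of m=3 comparisons (each citalopram dose against vehicle) is an assumption made only for the example.

```python
def sidak_alpha(alpha, m):
    """Per-comparison significance level that keeps the family-wise rate at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def sidak_adjust(p, m):
    """Sidak-adjusted p-value for one of m comparisons."""
    return 1.0 - (1.0 - p) ** m


# Assumed example: three pair-wise comparisons at a family-wise alpha of 0.05.
print(round(sidak_alpha(0.05, 3), 4))   # about 0.017
print(round(sidak_adjust(0.02, 3), 4))  # about 0.0588
```

Comparing an adjusted p-value with α is equivalent to comparing the raw p-value with the per-comparison level, so either form can be reported.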

Animals’ choices during the task were analyzed according to the outcome of each preceding trial (reward or punishment) and expressed as a ratio. The proportion of win-stay (ie, the number of times the animal received a reward and then chose the same hole as that chosen on the previous trial, divided by the total number of rewarded trials) and lose-shift (ie, the number of times the animal changed its choice after having been punished on the previous trial, divided by the total number of punished trials) scores for the ‘correct’ hole only were analyzed separately. NFS was defined as the likelihood that the subject will choose the ‘incorrect’ stimulus after a ‘correct’ choice followed by a misleading negative feedback (ie, lose-shift). Reward sensitivity, instead, was defined as the overall likelihood that the subject does not shift away from a rewarded ‘correct’ choice (ie, win-stay). Because in experiment 1 animals perform only one test session for each drug dose, win-stay/lose-shift ratios were analyzed separately for the acquisition phase and the first reversal phase to increase the sensitivity of the experimental design used.
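The win-stay and lose-shift ratios defined above can be computed as in the following sketch. It is an illustration rather than the analysis code used in the study: the function name and the trial format `(choice, rewarded, chose_correct)` are assumptions, each trial is scored against the choice on the next trial, the final trial of a session is left out because it has no successor, and trials spanning a reversal are not treated specially.

```python
def win_stay_lose_shift_ratios(trials):
    """Compute (win_stay, lose_shift) for choices of the 'correct' hole.

    trials: list of (choice, rewarded, chose_correct) tuples in session order.
    A ratio is None when its denominator is zero.
    """
    wins = stays = losses = shifts = 0
    for (choice, rewarded, chose_correct), (next_choice, _, _) in zip(trials, trials[1:]):
        if not chose_correct:
            continue                       # only 'correct' choices are scored
        if rewarded:
            wins += 1
            stays += next_choice == choice
        else:
            losses += 1                    # misleading negative feedback
            shifts += next_choice != choice
    win_stay = stays / wins if wins else None
    lose_shift = shifts / losses if losses else None
    return win_stay, lose_shift


example = [
    (0, True, True), (0, True, True), (1, False, False),
    (0, False, True), (1, False, False), (0, False, True), (0, True, True),
]
print(win_stay_lose_shift_ratios(example))  # (0.5, 0.5)
```

In this example the animal stays after one of two rewarded 'correct' choices and shifts after one of two unrewarded 'correct' choices, so both ratios are 0.5.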

RESULTS

Experiment 1: Effects of Acute Citalopram Administration

Citalopram affected the total number of reversals achieved (Figure 1a; F(3, 22)=9.973, p<0.001). Although the lowest dose (1 mg/kg) decreased the number of reversals (p<0.005), the highest dose (10 mg/kg) increased the number of reversals completed (p<0.05) compared with vehicle-treated animals (Figure 1a). No significant differences were found between the vehicle control group and the intermediate dose group (5 mg/kg). Citalopram treatment did not influence win-stay performance, although there was a significant effect on lose-shift behavior during the reversal phase (Figure 1c; F(3, 30)=11.520, p<0.001). Pair-wise comparisons showed that after 1 mg/kg citalopram administration, lose-shift behavior was significantly higher than in vehicle-treated animals (p<0.05), whereas after 10 mg/kg administration it was significantly lower (p<0.05). There was no significant effect of treatments on latencies, number of pellets earned, or total number of ‘incorrect’ choices.

Figure 1

Effects of acute citalopram administration: (a) the lowest dose (1 mg/kg) decreased the number of reversals achieved by the animals, whereas 10 mg/kg increased it. (b) Citalopram administration did not influence performance during the acquisition phase, but increased lose-shift behavior at 1 mg/kg and decreased it at 10 mg/kg during reversal 1 (c) (*p<0.05 compared with vehicle-treated animals).

In summary, acute citalopram administration exerted opposing effects on PRL performance depending on the dose: a high dose increased the number of reversals completed and decreased the probability of shifting after receiving negative feedback, whereas a low dose caused the animals to complete fewer reversals and made them more likely to shift after receiving spurious negative feedback.

Experiment 2: Effects of Repeated (5 mg/kg per day) and Subchronic (10 mg/kg, b.i.d.) Administration of Citalopram

Before undergoing drug administration, two groups of animals matched for number of reversals achieved over the last three sessions were formed (seven animals in the group Veh and seven animals in the group Cit). In the first part of the experiment (5 mg/kg citalopram daily, 30 min before sessions), there was no effect of the repeated treatment on the number of reversals completed over seven sessions. However, there was a significant difference between the two groups on win-stay behavior (F(1, 12)=6.454, p<0.05) with the Cit group showing higher reward sensitivity (Figure 2b). Lose-shift behavior (Figure 2c) was not different between the two groups (F(1, 12)=1.395, NS) during this part of the experiment, but there was a significant session by group interaction (F(6, 72)=2.398, p<0.05).

Figure 2

Effects of repeated citalopram administration. Citalopram (5 mg/kg) increased win-stay (b), but had no effect on lose-shift (c) and on the number of reversals completed (a). Citalopram (10 mg/kg, b.i.d.) increased the number of reversals completed by the animals without any effect on lose-shift/win-stay behavior (*p<0.05, Cit vs Veh; BL, baseline).

In the second part of the experiment (10 mg/kg, b.i.d.) repeated-measures ANOVA detected a significant increase in the number of reversals completed (Figure 2a) by the Cit group (F(1, 12)=5.581, p<0.05), but no significant effects on any other variable.

In the third part of the experiment (Figure 3a), a single dose of citalopram (3 mg/kg), administered to both groups 30 min before testing started, caused a significant increase in the number of reversals completed by the animals of the Cit group (F(1, 12)=5.444, p<0.05). There was also a significant difference in win-stay behavior between the two groups (F(1, 12)=7.992, p<0.05), with the Cit group showing higher levels of win-stay behavior after a single citalopram challenge, and no significant difference between the two groups in lose-shift behavior (F(1, 12)=0.125, NS) (Figure 3b).

Figure 3

Effects of a single dose of citalopram (3 mg/kg) in both subchronic citalopram (Cit)- and vehicle (Veh)-treated animals. This acute challenge increased both reversals completed (a) and win-stay behavior (b), but only in the Cit group (*p<0.05, Cit vs Veh).

Moreover, the acute challenge of citalopram caused the rats in the Cit group to respond faster to the stimuli compared to the rats in the Veh group (F(1, 12)=5.946, p<0.05). There were no other significant effects of treatments on reaction times, total pellets earned, and total number of ‘incorrect’ choices.

In sum, repeated (once a day, 5 mg/kg) citalopram administration just before the beginning of the test session increased win-stay but did not influence other variables, whereas subchronic dosing of the same compound (10 mg/kg, b.i.d.) increased the number of reversals completed by the animals (Figure 2). An acute injection of citalopram (3 mg/kg) further increased the number of reversals achieved, as well as win-stay behavior, but only in animals previously treated with the drug (Figure 3). Finally, the latter manipulation also shortened pretreated animals’ reaction times.

Experiment 3: Effects of Forebrain 5-HT Depletion

Ex vivo neurochemistry

Consistent with previously published reports (Harrison et al, 1997; Winstanley et al, 2003, 2004), animals that received i.c.v. 5,7-DHT infusions showed an almost total depletion of brain 5-HT (Table 1; 5-HT: PrL, t(14)=−2.17; ACC, t(7)=−2.87; OFC, t(7)=−2.75; NAC, t(8)=−2.48; DMS, t(14)=−2.42; DLS, t(7)=−2.48; Amg, t(7)=−2.49; DHPC, t(7)=−3.24; all p<0.05; two tailed) and lower levels of the metabolite 5-HIAA in all the brain regions examined (5-HIAA, t(14)=5.64, p<0.001; two tailed). Levels of DA (PrL, t(14)=−0.160; ACC, t(14)=−0.091; OFC, t(14)=−0.654; NAC, t(14)=−1.431; DMS, t(14)=0.033; DLS, t(14)=−0.144; Amg, t(14)=0.742; DHPC, t(14)=−0.220; all NS; two tailed), NE (PrL, t(14)=0.056; ACC, t(14)=−0.396; OFC, t(14)=−0.306; NAC, t(14)=−0.964; DMS, t(14)=0.269; DLS, t(14)=0.093; Amg, t(14)=−1.804; DHPC, t(10)=0.755; all NS; two tailed), and DOPAC (t(14)=1.08, NS; two tailed) were not significantly changed relative to controls.

Table 1 Effects of i.c.v. 5,7-DHT on Regional Brain Levels of 5-HT, DA, and NE Expressed as Pmol/mg (±SEM)

Probabilistic reversal learning

A few days after surgery, one animal (sham group) died from postoperative complications. One animal from the lesion group was eliminated from the analysis as it had incomplete 5-HT depletion. None of the remaining animals showed significant side bias. The final numbers were eight shams and eight lesioned rats. Data were analyzed over seven consecutive sessions starting from the second day on task after surgery. Repeated-measures ANOVA of win-stay performance revealed a significant main effect of session (F(5, 65)=4.869, p<0.005) and group (F(1, 14)=5.606, p<0.05) (Figure 4a) with lesioned animals showing less win-stay behavior as compared to controls. Analysis of lose-shift behavior (Figure 4b) showed a main effect of session (F(5, 68)=3.686, p<0.005). The effect of group on lose-shift was close to significance (F(1, 14)=4.13, p=0.062), but there was no interaction between session and group (F(5, 68)=1.291, NS). However, as we expected an effect of the lesion on the lose-shift metric, we limited the analysis to the first four sessions. In this case the ANOVA revealed a significant increase in lose-shift behavior for the lesion group (F(1, 14)=5.544, p<0.05), revealing the transient nature of the effect.

Figure 4

Effects of global serotonin (5-HT) depletion: (a) 5-HT-depleted animals showed an enhanced tendency to shift after being rewarded (ie, decreased win-stay). (b) Lesioned animals showed a faster increase in lose-shift behavior compared with control animals, but the effect disappeared over sessions. (c) Lesioned rats completed fewer reversals than sham-operated rats. A significant effect of session indicates that both groups improved their performance over time.

The total number of reversals achieved by the animals (Figure 4c) increased over sessions (F(4, 52)=4.251, p<0.01), but no interaction was found with the variable group (F(4, 52)=1.372, NS). There was a significant effect of group in the number of reversals completed (F(1, 14)=15.774, p<0.001), with lesioned animals completing fewer reversals (p<0.001). Depleted animals were also faster to respond to the illuminated apertures (F(1, 14)=13.267, p<0.001), but no difference was found in the latency to collect the reward from the food magazine, in the total amount of food pellets earned, and in the number of ‘incorrect’ choices.

See Table 2 for a summary of the effects of the different experimental manipulations.

Table 2 Effects of Different 5-HT Manipulations on Measures of Probabilistic Reversal Performance

DISCUSSION

Our results show, apparently for the first time, how different manipulations of 5-HT neurotransmission in rats result in altered sensitivity to positive and negative feedback in the setting of a PRL task, analogous to that used in humans. Acute administration of a low dose of the SSRI citalopram (1 mg/kg), which decreases 5-HT system activity (see below), impaired subjects’ performance as expected, reducing the number of reversals completed. This effect was paralleled by an increase in lose-shift behavior (ie, higher NFS)—a result also observed in healthy human volunteers after acute citalopram administration (Chamberlain et al, 2006). By contrast, acute administration of a higher dose of citalopram (10 mg/kg) improved rats’ performance on the PRL task, consistent with a recent study using 1 mg/kg escitalopram (the S-enantiomer of citalopram) on a maze version of the same task (Brown et al, 2008). Escitalopram is more potent than the racemic compound citalopram in increasing extracellular 5-HT levels (Mork et al, 2003) and shows a faster onset of antidepressant action in both patients (Gorman et al, 2002) and animal models of depression (Sanchez et al, 2003).

Our results also show that depleting 5-HT from the rat forebrain unexpectedly modulated the impact of positive (and, only transiently, of negative) feedback on subsequent performance, making the rat more likely to switch away from a rewarded choice (ie, decreased win-stay). This result thus mirrors that obtained after increasing tonic levels of extrasynaptic 5-HT by repeated injections of citalopram (experiment 2). This latter manipulation made the animals more sensitive to reward (increased win-stay) and improved their performance in the task.

Thus, in general, increasing 5-HT function, either by acute administration of a high dose of citalopram or by subchronic treatment with the same drug, caused an improvement in the performance of the animals, as characterized by a higher number of reversals completed. By contrast, reducing forebrain 5-HT activity by chronic depletion or by an acute low dose of citalopram impaired reversal performance. However, these effects were associated with different behavioral outcomes (in terms of win-stay/lose-shift behavior), depending on the specific experimental manipulation.

In experiment 1, we found contrasting effects of acute administration of a low compared to a high dose of citalopram. Various studies have shown that single administrations of clinically relevant doses of SSRIs cause a marked increase of extracellular 5-HT in midbrain raphé nuclei, but not in the prefrontal cortex (PFC) region (Bel and Artigas, 1992; Gartside et al, 1995; Invernizzi et al, 1992; Malagie et al, 1996). More relevant for the present study, Invernizzi et al (1992) showed that 5-HT levels, as measured by in vivo microdialysis, are increased in the PFC of freely moving rats after 10 (but not 1) mg/kg of citalopram administered i.p. This pattern of findings is consistent with electrophysiological studies in animals showing that a single challenge with a low dose of an SSRI temporarily inhibits the firing of 5-HT neurons due to inhibitory autoreceptor activation in the raphé nuclei (Artigas et al, 1996; Hajos et al, 1995). Low doses of SSRIs increase 5-HT levels to a level that floods somatodendritic 5-HT1a autoreceptors, causing inhibition of 5-HT neuronal discharge (Aghajanian et al, 1987), synthesis (Barton and Hutson, 1999; Carlsson and Lindqvist, 1978), and release (Adell and Artigas, 1991; Hjorth and Auerbach, 1994). Higher doses hypothetically bypass this inhibitory mechanism, allowing for a net increase in prefrontal 5-HT extracellular content (Bymaster et al, 2002; Invernizzi et al, 1992; Koch et al, 2002; Pozzi et al, 1999). Moreover, biphasic dose–response curves have been observed after citalopram administration in a variety of tests (Dekeyne et al, 2000; Hashimoto et al, 2009; Sanchez and Meier, 1997). The exact mechanisms underlying the opposite effects of different doses of citalopram in the PRL task, however, remain unclear. We cannot exclude, for example, that they are due to the recruitment of different receptor subtypes with varying affinity for the ligand or to the involvement of neurotransmitter systems other than 5-HT.

The results of experiment 1 show that increasing 5-HT transmission by a high dose of citalopram promotes behavioral flexibility and provides resilience to uncontrollable negative events (ie, decreased lose-shift behavior), probably through a forebrain 5-HT1a-dependent mechanism (Deakin and Graeff, 1991). Conversely, a low dose elicits the opposite effect, increasing the impact of negative feedback on the subsequent choice, presumably via temporary silencing of 5-HT system activity through stimulation of 5-HT1a autoreceptors in the raphé nuclei (Sharp et al, 1989; Sprouse and Aghajanian, 1987). This latter effect would hypothetically increase the impact of phasic signaling on 5-HT2/1c receptors mediating the avoidance reaction to negative feedback, consistent with the formulation of Deakin and Graeff (1991).

Regardless of the direction of the effects elicited by different doses of citalopram after acute administration, only lose-shift (and not win-stay) behavior was affected by these treatments. This concurs with some studies in human subjects that found acute 5-HT manipulations selectively to affect reactions to negative stimuli (Browning et al, 2007; Chamberlain et al, 2006), punishment prediction (Cools et al, 2008b), and the activation of brain areas involved in error detection and processing of aversive events (Anderson et al, 2007; Del-Ben et al, 2005; Evers et al, 2005; McKie et al, 2005).

We also found that repeated administration of citalopram (5 mg/kg daily, 30 min before test sessions) enhanced win-stay behavior in rats performing the PRL task. Furthermore, enhancing 5-HT neurotransmission by subchronic citalopram administration (10 mg/kg, b.i.d.) increased the number of reversals completed and a subsequent challenge with a low dose of the drug (3 mg/kg), administered 30 min before the test session, augmented both the number of reversals completed and win-stay behavior, but only in animals pretreated subchronically with citalopram. These results show that elevating 5-HT tone causes large improvements in performance and a selective increase in win-stay behavior, similarly to the beneficial effects of boosting 5-HT neurotransmission by repeated SSRI administration in depressive patients. Citalopram-treated animals showed heightened reward sensitivity and returned to choose the option that had been rewarded on the immediately preceding trial more often than controls. This latter effect probably requires the drug to be present at the time of testing, because it disappeared when citalopram (10 mg/kg, b.i.d.) was administered a few hours before and after testing sessions.

An effect opposite to the one caused by repeated citalopram administration resulted from 5-HT depletion by i.c.v. 5,7-DHT infusion. 5-HT-depleted animals achieved fewer reversals than sham-operated rats, showing an earlier increase in lose-shift behavior (a difference disappearing over sessions) and a long-lasting decrease in win-stay behavior.

Depleted animals were also faster to respond to the apertures, an effect that has already been observed after the same lesion technique (Harrison et al, 1997; Winstanley et al, 2004). However, there was no difference between the two groups in the latency to collect the reward, which is a sensitive measure of motivation for food (Carli and Samanin, 1992; Harrison et al, 1997). Together with the decrease in reward sensitivity (ie, lower win-stay), this largely excludes a motivational explanation of the increased speed in lesioned animals.

Thus, in the context of the PRL task, long-term enhancement of 5-HT neurotransmission, in addition to improving behavioral flexibility, selectively increases reward sensitivity, whereas chronic 5-HT depletion produces the opposite effect, making the animals less sensitive to positive feedback. The transient nature of the increase in lose-shift behavior after 5-HT depletion might have been a consequence of functional adaptations and/or recovery in the brain of lesioned animals or, less likely, of incomplete degeneration of 5-HT-containing neurons during the first part of the experiment (see Baumgarten et al, 1974).

From a neurobiological perspective, the effects observed after acute, as distinct from long-term, 5-HT manipulations could be ascribed to the contribution of 5-HT receptors that downregulate after repeated SSRI treatment. From a theoretical point of view, a possible explanation is suggested by computational theories in which tonic fluctuations in brain 5-HT levels report a long-term average reward prediction signal, whereas phasic 5-HT signals report a negative prediction error, in opposition to the role of the dopaminergic system (Daw et al, 2002).

Implications for the Hypothetical Role of 5-HT in Punishment Prediction

Serotoninergic neurotransmission has long been hypothesized to have an important role in the control of an aversive motivational system (Deakin, 1983, 1991) and in the pathophysiology of depressive states (Delgado et al, 1990; Shopsin et al, 1976). Although DA has been associated with the expression of an appetitive reward system (Schultz et al, 1997), it is plausible that it works in mutual opponency with a system that signals the prediction of punishment instead of reward (Daw et al, 2002; Deakin, 1991).

As pointed out in a recent review (Cools et al, 2008a), the role of 5-HT in the anticipation of punishment seems to contradict the observation that decreased brain 5-HT levels, whether produced by experimental manipulations or resulting from hypothesized reductions in psychiatric disorders such as depression, actually enhance the impact of negative feedback on subsequent performance. However, lower extracellular levels of tonic neurotransmission might theoretically increase ‘signal-to-noise ratios’ when the serotoninergic neurons fire to signal the prediction of punishment, similarly to the mechanism of action of DA in signaling reward receipt or its anticipation (Frank et al, 2004; Grace, 1991; Schultz et al, 2000). The opposite would be true with higher tonic levels of 5-HT, which are predicted to dampen the impact of this signal (Cools et al, 2008a). This framework is consistent, at least in some aspects, with recent findings in humans and with the results presented here. Indeed, decreasing the activity of the 5-HT system by acute administration of low doses of SSRI in humans (Chamberlain et al, 2006) and in rats (this report) has effects similar to those caused by ATD (Evers et al, 2005), namely increasing the impact of negative feedback. This mechanism is also potentially able to explain the exaggerated reaction to punishment in depressed patients (Elliott et al, 1997; Murphy et al, 2003).

In this report, depleting brain 5-HT content caused only a transient increase in lose-shift behavior. This is probably because, following such a profound manipulation, 5-HT neurons are unable to send reliable punishment signals as a consequence of the almost complete depletion of the neurotransmitter and/or 5-HT fiber degeneration, which occurs gradually after the toxin has been infused (Baumgarten et al, 1974). However, in the first part of the experiment (Figure 4b), the increase in lose-shift behavior is strikingly similar to the impairment observed in depressed patients (ie, increased lose-shift). The effect of repeated citalopram treatment is consistent with the recovery from depressive symptoms after prolonged SSRI treatment but argues against the framework previously described for the role of 5-HT in signaling punishment. In our case, we would have expected that elevating tonic levels of 5-HT would dampen the punishment signal, leading to an effect similar to that of an acute high dose of citalopram, namely a decrease in lose-shift behavior. We did not observe such an effect, probably because of more complex interactions and adaptations within and outside the 5-HT system.

The increase in win-stay behavior caused by repeated citalopram administration could be a consequence of alterations in the coupling of the 5-HT system with dopamine, which is more related to reward sensitivity (Nestler and Carlezon, 2006). In fact, repeated SSRI administration is known to modulate the firing rate of mesocorticolimbic DA neurons (Bortolozzi et al, 2005; Di Mascio et al, 1998; Dremencov et al, 2009; Sekine et al, 2007) through long-term adaptations of 5-HT2 receptors (Blackburn et al, 2006; Esposito, 2006; Pehek et al, 2006). However, more experiments are needed to define the precise role of 5-HT–DA interactions in the sensitivity to positive and negative feedback.

Summary

The results of this study show that the beneficial effects of SSRIs on executive function depend on the exact dose and dosing regimen used. Increasing and decreasing central 5-HT levels exert opposite effects on reward sensitivity and NFS, resembling the effects of manipulation of 5-HT levels in healthy and depressed subjects on similar tasks. However, this study shows for the first time in rats that acute manipulations of the 5-HT system principally affect NFS, whereas long-lasting treatments significantly modulate reward (positive feedback) sensitivity in the PRL task.