Balancing Risk and Reward: A Rat Model of Risky Decision Making

Simon, Nicholas W; Gilbert, Ryan J; Mayse, Jeffrey D; Bizon, Jennifer L; Setlow, Barry

doi:10.1038/npp.2009.48

Download PDF

Original Article
Published: 13 May 2009

Balancing Risk and Reward: A Rat Model of Risky Decision Making

Nicholas W Simon¹,
Ryan J Gilbert¹,
Jeffrey D Mayse¹,
Jennifer L Bizon^1,2 &
…
Barry Setlow^1,2

Neuropsychopharmacology volume 34, pages 2208–2217 (2009)Cite this article

4856 Accesses
116 Citations
1 Altmetric
Metrics details

Abstract

We developed a behavioral task in rats to assess the influence of risk of punishment on decision making. Male Long–Evans rats were given choices between pressing a lever to obtain a small, ‘safe’ food reward and a large food reward associated with risk of punishment (footshock). Each test session consisted of 5 blocks of 10 choice trials, with punishment risk increasing with each consecutive block (0, 25, 50, 75, 100%). Preference for the large, ‘risky’ reward declined with both increased probability and increased magnitude of punishment, and reward choice was not affected by the level of satiation or the order of risk presentation. Performance in this risky decision-making task was correlated with the degree to which the rats discounted the value of probabilistic rewards, but not delayed rewards. Finally, the acute effects of different doses of amphetamine and cocaine on risky decision making were assessed. Systemic amphetamine administration caused a dose-dependent decrease in choice of the large risky reward (ie, it made rats more risk averse). Cocaine did not cause a shift in reward choice, but instead impaired the rats’ sensitivity to changes in punishment risk. These results should prove useful for investigating neuropsychiatric disorders in which risk taking is a prominent feature, such as attention deficit/hyperactivity disorder and addiction.

Risky decision-making predicts dopamine release dynamics in nucleus accumbens shell

Article 23 September 2019

Timothy G. Freels, Daniel B. K. Gabriel, … Nicholas W. Simon

Shifting uncertainty intolerance: methylphenidate and attention-deficit hyperactivity disorder

Article Open access 06 January 2021

Alekhya Mandali, Arjun Sethi, … Valerie Voon

Medial orbitofrontal cortex and nucleus accumbens mediation in risk assessment behaviors in adolescents and adults

Article 17 January 2022

Maxine K. Loh, Nicole C. Ferrara, … J. Amiel Rosenkranz

INTRODUCTION

Few decisions with which individuals are faced on a daily basis are entirely without risk of adverse consequences. Even everyday decisions such as ‘what to do when faced with a yellow traffic signal while driving’ (accelerate or slow down) involve consideration of both the risks and rewards associated with each option. A better understanding of this type of decision making under conditions in which highly rewarding choices are accompanied by risks of adverse consequences (punishment) may have considerable implications for understanding psychopathological conditions characterized by alterations in risk-based decision making, such as attention deficit/hyperactivity disorder, major depressive disorder, schizophrenia, and addiction (Bechara et al, 2001; Drechsler et al, 2008; Ernst et al, 2003; Heerey et al, 2008; Taylor Tavares et al, 2007). Such risky decision making is commonly studied in human subjects (Lejuez et al, 2002; Leland and Paulus, 2005); however, there are few animal models that systematically assess the degree to which risk of punishment (rather than reward omission) influences reward-based choice (see Negus, 2005).

There were three main goals of the experiments described below: the first was to establish the performance parameters of a discrete-trials risky decision-making choice task, in which rats chose between a small, ‘safe’ reward and a large reward associated with a risk of punishment that varied within each test session. The second goal was to determine how choice performance in the risky decision-making task was related to performance in other decision-making tasks that assess the degree to which choices are influenced by reward probability and delays to reward delivery (and which are also commonly altered in the same neuropsychiatric conditions in which altered risky decision making is observed). The third goal was to determine how risky decision making is affected by acute administration of psychostimulant drugs that have been shown previously to affect delay- and probability-based decision making in both human and animal subjects (Cardinal et al, 2000; De Wit et al, 2002; DeVito et al, 2008; Evenden and Ryan, 1996; St Onge and Floresco, 2009; Stanis et al, 2008; Winstanley et al, 2007).

METHODS

Subjects

Male Long–Evans rats (n=28, weighing 275–300 g on arrival; Charles River Laboratories, Raleigh, NC) were individually housed and kept on a 12 h light/dark cycle (lights on at 0800 hours) with free access to food and water except as noted. During testing, rats were maintained at 85% of their free-feeding weight, with allowances for growth. All procedures were conducted in accordance with the Texas A&M University Laboratory Animal Care and Use Committee and NIH guidelines.

Apparatus

Testing took place in standard behavioral test chambers (Coulbourn Instruments, Whitehall, PA) housed within sound attenuating cubicles. Each chamber was equipped with a recessed food pellet delivery trough fitted with a photobeam to detect head entries and a 1.12 W lamp to illuminate the food trough. The trough, into which 45 mg grain-based food pellets (PJAI, Test Diet: Richmond, IN) were delivered, was located 2 cm above the floor in the center of the front wall. Two retractable levers were located to the left and right of the food delivery trough, 11 cm above the floor. A 1.12 W house light was mounted on the rear wall of the sound-attenuating cubicle. The floor of the test chamber was composed of steel rods connected to a shock generator (Coulbourn) that delivered scrambled footshocks. Locomotor activity was assessed throughout each session with an overhead infrared activity monitor. Test chambers were interfaced with a computer running Graphic State software (Coulbourn), which controlled task event delivery and data collection.

Behavioral Procedures

Shaping

Shaping procedures were identical to those used previously (Simon et al, 2007b). Following magazine training, rats were trained to press a single lever (either left or right, counterbalanced across groups; the other was retracted during this phase of training) to receive a single food pellet. After reaching a criterion of 50 lever presses in 30 min, rats were shaped to press the opposite lever under the same criterion. This was followed by further shaping sessions in which both levers were retracted and rats were shaped to nose poke into the food trough during simultaneous illumination of the trough and house lights. When a nose poke occurred, a single lever was extended (left or right), and a lever press resulted in immediate delivery of a single food pellet. Immediately following the lever press, the house and trough lights were extinguished and the lever was retracted. Rats were trained to a criterion of at least 30 presses of each lever in 60 min. This shaping procedure was conducted only once at the start of all behavioral testing.

Risky decision-making task

Test sessions were 60 min long and consisted of 5 blocks of 18 trials each. Each 40 s trial began with a 10 s illumination of the food trough and house lights. A nose poke into the food trough during this time extinguished the food trough light and triggered extension of either a single lever (forced choice trials) or of both levers simultaneously (choice trials). If the rats failed to nose poke within the 10 s time window, the lights were extinguished and the trial was scored as an omission.

A press on one lever (either left or right, balanced across animals) resulted in one food pellet (the small, safe reward) delivered immediately following the lever press. A press on the other lever resulted in immediate delivery of three food pellets (the large reward). However, selection of this lever was also accompanied immediately by a possible 1 s footshock contingent on a preset probability specific to each trial block. The large food pellet reward was delivered following every choice of the large reward lever, regardless of whether the footshock occurred. The intensity of the footshock varied by experiment (see below). With the exception of Experiment 1C, the probability of footshock accompanying the large reward was set at 0% during the first 18-trial block. In subsequent 18-trial blocks, the probability of footshock increased to 25, 50, 75, and 100%. Each 18-trial block began with 8 forced choice trials used to establish the punishment contingencies (4 for each lever), followed by 10 choice trials (Cardinal and Howes, 2005; Simon et al, 2007b; St Onge and Floresco, 2009). Once either lever was pressed, both levers were immediately retracted. Food delivery was accompanied by reillumination of both the food trough and house lights, which were extinguished on entry to the food trough to collect the food or after 10 s, whichever occurred sooner.

Delay-discounting task

A detailed description of this procedure is provided in Simon et al (2007b). Each 100 min session consisted of 5 blocks of 12 trials each. Each 100 s trial began with illumination of the food trough and house lights. A nose poke into the food trough during this time extinguished the food trough light and triggered extension of either a single lever (forced choice trials) or of both levers simultaneously (choice trials). Trials on which rats failed to nose poke during this window were scored as omissions.

Each block consisted of 2 forced choice trials followed by 10 choice trials. A press on one lever (either left or right, counterbalanced across subjects) resulted in one food pellet delivered immediately. A press on the other lever resulted in four food pellets delivered after a variable delay. Once either lever was pressed, both levers were retracted for the remainder of the trial. The delay duration increased with each block of trials (0 s, 10 s, 20 s, 40 s, 60 s; Cardinal et al, 2000; Evenden and Ryan, 1996; Simon et al, 2007b; Winstanley et al, 2003).

Probability-discounting task

The parameters of this task were identical to the risky decision-making task, with the only difference following selection of the large reward lever. During the first block of trials, the large reward was delivered with 100% probability. During each of the four subsequent blocks, the probability of large reward delivery was systematically decreased (75, 50, 25, 0%). The large reward was accompanied by neither punishment nor a delay period.

Experiment 1: Establishing The Risky Decision-Making Task

Experiment 1A: Effects of shock intensity

To determine the optimal shock intensity for subsequent experiments, rats were divided into three groups and tested on the risky decision-making task for 20 sessions at shock intensities of 0.35 mA (n=12), 0.4 mA (n=10), or 0.45 mA (n=6). Percent choice of the large reward lever was averaged across the final five days of testing, during which performance was stable.

Experiment 1B: Effects of food satiation level

To assess the effects of alterations in food motivation on performance, rats trained at the 0.35 mA shock intensity were tested after being given access to freely available food in their home cage for either 1 or 24 h immediately before testing. This testing occurred over 4 days for each satiation level, with rats food restricted as normal (85% of free-feeding weight) on days 1 and 3, and with access to food on days 2 and 4. For each satiation level, the two satiation days and the two nonsatiation days were averaged together for analysis.

Experiment 1C: Reversal of punishment probabilities

To determine whether task performance was specific to ascending risks of punishment, rats trained with a shock intensity of 0.35 mA were tested for 10 sessions in a modified version of the task in which the order of risk presentations accompanying the large reward was reversed (100% in the first block of trials, followed by 75, 50, 25, and 0%). All other aspects of the task remained constant. Choice behavior was averaged across the final five sessions.

Experiment 2: Comparing Decision Making Across Tasks

Following testing in the risky decision-making task, rats tested with a shock intensity of 0.35 mA were tested in the delay-discounting task for 20 sessions, followed by testing in the probability-discounting task for 15 sessions. In each task, performance was assessed by averaging across the final five sessions (during which performance was stable).

Experiment 3: Acute Drug Treatments

The acute effects of d-amphetamine sulfate and cocaine hydrochloride on risky decision making were examined in rats tested at a shock intensity of 0.35 mA. Rats were tested following i.p. injections (1 ml/kg) of one of three doses of d-amphetamine sulfate (Sigma, St Louis, MO; 0.33, 1.0, 1.5 mg/kg) or 0.9% saline vehicle. Injections were administered before testing over a period of 6 days using the following schedule: saline, amphetamine dose 1, saline, amphetamine dose 2, saline, amphetamine dose 3. The order of drug doses was counterbalanced across subjects. This 6-day experimental procedure was later repeated (after stable performance was obtained) using cocaine hydrochloride (Drug Supply Program, NIDA; 5, 10, 15 mg/kg in 0.9% saline vehicle). All drug treatments were administered in the vivarium; between transportation to the test chambers and the initial 5 min of forced-choice trials, 10 min elapsed between drug administration and collection of choice preference data in the first block of trials.

Timeline of Experiments

Rats in the 0.35 mA shock-intensity group from Experiment 1 were the subjects in Experiments 2 and 3. The progression of tasks and treatments was as follows: risky decision making (RDM)/amphetamine treatment/baseline RDM/delay-discounting/probability-discounting/baseline RDM/cocaine treatment/baseline RDM/RDM satiation tests/baseline RDM/RDM reversed probabilities.

Data Analysis

Raw data files were exported from Graphic State software and compiled using a custom macro written for Microsoft Excel (Dr. Jonathan Lifshitz, University of Kentucky). Statistical analyses were conducted in SPSS 16.0. Stable behavior was defined by the absence of either a main effect of session or an interaction between session and trial block in a repeated measures ANOVA over a five-session period (Cardinal et al, 2000; Simon et al, 2008b; Winstanley et al, 2006b). The effects of behavioral or pharmacological manipulations in all tasks were assessed using two-way ANOVAs, with trial block (ie, level of risk) as a repeated measures variable. Performance between tasks was compared using bivariate Pearson's correlations. In all cases, p<0.05 were considered significant.

RESULTS

Experiment 1: Establishing The Risky Decision-Making Task

Experiment 1A: Effects of shock intensity

Performance was stable for rats in all three shock-intensity groups in sessions 16–20. A two-way repeated measures ANOVA (shock intensity (0.35, 0.40, 0.45 mA) × risk (0, 25, 50, 75, 100%)) revealed main effects of both shock intensity (F_(2,24)=12.31, p<0.001) and risk (F_(4,96)=14.75, p<0.001), with choice of the large reward decreasing with both shock intensity and risk in a combined analysis of all three groups (Figure 1). There was no interaction between shock-intensity groups (F_(8,96)=1.75, n.s.); however, LSD post hoc analyses revealed significant differences in reward choice between the three groups (p<0.01). Additional planned one-way repeated measures ANOVAs showed main effects of punishment risk in rats trained with the 0.35 mA shock (F_(4,44)=7.20_, p<0.001) and the 0.4 mA shock (F_(4,32)=12.18 p<.001), indicating that these groups discounted the large reward as a function of risk (ie, they were sensitive to risk of punishment). There was no effect of punishment risk for the 0.45 mA shock intensity (F_(4,20),=1.00, n.s.), with this group demonstrating almost complete preference for the small reward across all blocks (Figure 1).

Additionally, there was no main effect of shock intensity on the number of choice trials completed during testing (mean % completed trials, 0.35 mA, 97.00; 0.40 mA, 96.62, 0.45 mA, 99.93; F_(2,24)=1.52, n.s.), although there was a main effect of shock intensity on trials completed during forced choice trials with the large lever (with no option for the small, safe reward) when the punishment risk was 100% (mean % completed trials: 0.35 mA: 75.00, 0.40 mA: 53.89, 0.45 mA: 5.83; F_(2,24)=14.54, p <0.001). LSD post hoc analyses conducted on these latter data revealed no difference in completed trials between the 0.35 and 0.40 mA groups (n.s.) but strong group differences between the 0.35 and 0.40 groups and the 0.45 mA group (p<0.001), indicating that rats would often forego selection of the large reward altogether at the high shock intensity when there was no safe reward option.

There was considerable variance in reward preference within both the 0.35 and 0.4 mA shock groups, such that some rats demonstrated a strong preference for the large, risky reward whereas others preferred the small, safe reward (Figure 2). To analyze the source of this variance, linear regressions were used to assess the ability of initial reward preference (during the first, 0% risk block) to predict reward choice averaged across the other four blocks. Reward choice during the 0% block was correlated with choice during the next four blocks for both the 0.35 (r=0.86, p<0.001) and 0.40 mA (r=0.82, p<0.01) shock-intensity groups. For the 0.35 mA shock group, there were no correlations between reward preference (risk taking) and baseline body weight (r=0.33, n.s.) or shock reactivity as assessed by locomotion during the 1 s shock-delivery period (r=−0.15, n.s.). For the 0.4 mA shock group, there was a strong correlation between baseline body weight and reward preference (r=0.82, p <0.05) but not between reward preference and shock reactivity (r=−0.13, n.s.) or body weight and shock reactivity (r=−0.08, n.s.).

Based on the data from Experiment 1A, the 0.35 mA shock intensity was used for all subsequent experiments.

Experiment 1B: Effects of food satiation level

For the 1 h satiation condition (Figure 3a), a two-way repeated measures ANOVA showed a main effect of punishment risk (F_(4,40)=7.72, p<0.001) but no main effect of satiation (F_(1,40)=0.68, n.s.), indicating that 1 h of free feeding before testing did not influence choice behavior. There was also no effect of 1 h satiation on the number of trials omitted during testing (F_(1,10)=1.61, n.s.).

The effects of 24 h of free feeding on choice performance were similar, in that there was a main effect of punishment risk (F_(4,36)=5.82, p<0.01) but no main effect of satiation level (F_(1,9)=3.19, n.s.), indicating that even substantial satiation caused no change in reward choice (Figure 3b). However, rats completed significantly fewer trials after 24 h of free feeding than under food-restricted conditions (mean percentage of completed trials; satiated, 67.80 vs restricted, 79.50; F_(1,9)=9.48, p<0.05).

Experiment 1C: Reversal of punishment probabilities

After the order of presentation of punishment risks was reversed, there was still a main effect of risk (F_(4,36)=7.59, p<0.001), with rats showing less preference for the large reward under conditions of greater risk. These data were consistent with data from Experiment 1A, as a within-subjects comparison of performance under ascending and descending orders of risk presentation revealed no main effects or interactions involving the order of risk presentation (F<1.45, n.s.) (Figure 4).