Visual attention can be shifted covertly to objects and locations in the absence of eye movements (e.g., Posner, 1980), facilitating perceptual processing of selected stimuli. Although covert attention can be deployed voluntarily in accordance with ongoing perceptual goals, certain kinds of stimuli capture attention involuntarily. These include physically salient stimuli, which produce salience-driven attentional capture (e.g., Theeuwes, 1992; Yantis & Jonides, 1984), and stimuli expressing goal-related features, which produce contingent attentional capture (e.g., Anderson & Folk, 2010; Folk, Remington, & Johnston, 1992). Recently, we reported that stimuli imbued with value via reward learning also capture attention, independently of their salience and ongoing task goals; we call this value-driven attentional capture (Anderson, Laurent, & Yantis, 2011a, b, 2012).

In the present study, we investigated whether previously rewarded stimuli also draw the eyes. Shifts of covert attention are known to precede eye movements (Hoffman & Subramaniam, 1995), but attending to a stimulus covertly does not necessarily result in the initiation of an eye movement (e.g., Posner, 1980; Thompson & Bichot, 2005). That is to say, covert attention to a stimulus is necessary but not sufficient for the execution of an eye movement to its position. Nevertheless, task-irrelevant stimuli that capture attention also tend to draw eye movements, an effect termed oculomotor capture (e.g., Ludwig & Gilchrist, 2002, 2003; Theeuwes, de Vries, & Godijn, 2003; Van der Stigchel & Theeuwes, 2005).

Although salience-driven and contingent capture have been well characterized in both the attentional and the oculomotor domains (e.g., Folk et al., 1992; Ludwig & Gilchrist, 2002, 2003; Theeuwes, 1992; Theeuwes et al., 2003), the influence of reward learning on oculomotor selection is only beginning to be understood. Recent evidence demonstrates that pairing a stimulus with high reward on a given trial involuntarily primes eye movements toward or away from that stimulus on the following trial (Hickey & van Zoest, 2012), mirroring previous findings concerning reward-modulated priming and covert attention (Della Libera & Chelazzi, 2006; Hickey, Chelazzi, & Theeuwes, 2010a, b, 2011). However, the ability to reliably predict the availability of reward on the basis of learned stimulus–reward associations is a fundamental aspect of adaptive behavior, and the effect that the learning of these associations has on oculomotor control is unknown. One goal of the present study was to investigate the degree to which previously reward-predictive stimuli (we refer to these as high-value stimuli) involuntarily and persistently draw the eyes when those stimuli are no longer rewarded, not task-relevant, and not physically salient.

It has been hypothesized that visual working memory (VWM) capacity—the amount of visual information that can be remembered over a brief interval—depends in part on susceptibility to distraction: Individuals who are less able to filter out irrelevant information in both VWM and visual search tasks also tend to have a lower VWM capacity. For example, Fukuda and Vogel (2009, 2011) found that VWM capacity is negatively correlated with the magnitude of response time (RT) slowing in contingent attentional capture. Furthermore, we have recently shown that VWM capacity is also negatively correlated with the magnitude of value-driven attentional capture (Anderson et al., 2011b). In the present study, we sought to extend the relationship between VWM capacity and distraction by measuring how VWM capacity covaries with the magnitude of oculomotor capture produced by high-value distractors.

Oculomotor capture has almost always been assessed under conditions in which participants were required to make eye movements to a prespecified target item (e.g., Hickey & van Zoest, 2012; Ludwig & Gilchrist, 2002, 2003; Theeuwes et al., 2003; Van der Stigchel & Theeuwes, 2005). Thus, it is not known whether an attentional bias toward a particular stimulus (e.g., one with high value) can itself trigger an eye movement to that stimulus even when an eye movement is not required by the task. This is because when eye movements are required in order to carry out the task, any observed tendency to move the eyes to an irrelevant stimulus may reflect a directional bias that becomes manifest only when participants voluntarily initiate an eye movement. In the present study, we sought to investigate the influence of reward history on eye movements that were not constrained in this way: Participants could perform the task without eye movements, but we did not require them to remain fixated either. This paradigm thus provides evidence about eye movements under conditions that are more naturalistic than those that are typically investigated. Evidence for oculomotor capture under these conditions would suggest that people tend to look at otherwise irrelevant stimuli that have previously been paired with reward.

The delivery of reward produces the release of dopamine in the striatum of the basal ganglia; over the course of learning, stimuli that predict reward come to elicit this dopamine response (Schultz, Dayan, & Montague, 1997). The transfer of the dopamine response from the reward itself to a reward-predicting stimulus occurs gradually as the predictive relationship is learned (Waelti, Dickinson, & Schultz, 2001). Changes in pupil diameter provide an index of the arousal that accompanies the delivery of reward (Bradley, Miccoli, Escrig, & Lang, 2008; Steinhauer, Siegle, Condray, & Pless, 2004). As reward predictions develop during learning, they come to be accompanied by anticipatory arousal that produces easily measurable dilations of the pupils (e.g., O'Doherty, Buchanan, Seymour, & Dolan, 2006). We therefore measured pupil dilation to targets during reward learning, in order to assess whether the stimuli that elicit value-driven attentional capture come to serve as learned predictive cues for reward. An increase in pupil dilation to the reward-predictive targets throughout the course of the training phase would provide converging evidence for a reward-learning account of any observed effects of distraction in the test phase.

Experiment 1

Experiment 1 was modeled closely after the paradigm of Anderson et al. (2011b). The experiment began with a training phase in which participants received monetary reward feedback following correct responses to color targets (Fig. 1a). One of two target colors was associated with a greater probability of a large reward, while the other was associated with a greater probability of a small reward; the magnitude of reward was not associated with a particular motor response. A test phase followed the training phase, in which participants searched for a shape singleton target in extinction (i.e., with no reward feedback provided), while previously reward-related stimuli occasionally appeared as task-irrelevant distractors (Fig. 1b). Several previous experiments using this paradigm have demonstrated that behavioral impairment caused by the formerly rewarded distractors depends critically upon prior reward learning and is not present when reward feedback is not provided during the training phase (Anderson et al., 2011a, b). Eye position and pupil diameter were monitored throughout the experiment.

Fig. 1

Sequence of events and time course for a trial during training (a) and at test (b) in Experiment 1. Targets were defined as the red or green circle during training and as the unique shape at test (either a circle among diamonds or a diamond among circles). Participants reported the identity of an oriented bar contained within the target stimulus. Monetary reward, which varied probabilistically with the color of the target, was delivered in the training phase. The valuable distractor in the test phase consisted of a nontarget stimulus rendered in the color of a formerly reward-predictive target.

Method

Participants

Fifteen participants were recruited from the Johns Hopkins University community. All were screened for normal or corrected-to-normal visual acuity and color vision. Participants received performance-based monetary compensation ranging from $23 to $27 (M = $25.53). Eye position could not be calibrated for 1 participant, so all eye-tracking results include 14 participants.

Apparatus

A Mac Mini equipped with MATLAB software and Psychophysics Toolbox extensions was used to present the stimuli on a Dell P991 monitor. The participants viewed the monitor from a distance of 75 cm in a dimly lit room, using a chinrest. Manual responses were entered by participants using a standard 101-key US layout keyboard. Eye tracking and pupillometry were performed using an EyeLink 1000 system.

Measuring visual working memory capacity

Prior to completing the visual search tasks, all participants completed a standard change detection task designed to measure VWM capacity. The method for the change detection task has been described previously (Anderson et al., 2011b; Fukuda & Vogel, 2009; Luck & Vogel, 1997). Participants were presented with an array of four, six, or eight colored squares for 100 ms, followed by a 900-ms retention interval and then by the presentation of a single colored square that was either the same color as or different in color from the square that had previously occupied its position. Participants reported whether the test square was the same color or different in color with an unspeeded keypress, and VWM capacity was calculated on the basis of response accuracy using a standard equation that accounts for the probability of guessing correctly (e.g., Fukuda & Vogel, 2009; Luck & Vogel, 1997).
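The guessing correction mentioned above can be illustrated with a minimal sketch. It assumes the single-probe form often called Cowan's K, K = N × (H − FA), where N is the set size, H the hit rate, and FA the false-alarm rate. The text does not state which equation was applied or how estimates were combined across set sizes, so the averaging step and all function names here are illustrative assumptions.

```python
def cowan_k(set_size, hit_rate, false_alarm_rate):
    """Capacity estimate for single-probe change detection: K = N * (H - FA)."""
    return set_size * (hit_rate - false_alarm_rate)


def mean_capacity(rates_by_set_size):
    """Average K over set sizes (assumed way of combining the estimates).

    rates_by_set_size maps set size -> (hit rate, false-alarm rate).
    """
    ks = [cowan_k(n, h, fa) for n, (h, fa) in rates_by_set_size.items()]
    return sum(ks) / len(ks)


# Hypothetical rates for one observer (illustrative values, not data from the study).
example_rates = {4: (0.90, 0.10), 6: (0.75, 0.20), 8: (0.65, 0.25)}
print(round(mean_capacity(example_rates), 2))
```

Subtracting the false-alarm rate from the hit rate is what discounts correct "different" responses that would be expected from guessing alone.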

Experimental task

Training phase

The sequence and timing of events in the training phase is shown in Fig. 1a. Each trial consisted of a fixation display for 2,000 ms, a search display for 1,000 ms, a blank screen for either 1,000 or 3,000 ms (equally often), a reward feedback display for 1,500 ms, and a blank intertrial interval for 500, 2,500, or 4,500 ms (exponentially distributed, with 500 ms occurring most often). Targets in the training phase were defined as the red or green circle among differently colored circles (blue, cyan, pink, orange, yellow, or white); exactly one target was present on each trial. The six circles were 2.5° of visual angle in diameter and were placed 6.0° center-to-center from fixation. The training phase consisted of five blocks of 60 trials, in which each target color appeared in each of the six stimulus positions equally often across trials. Participants reported whether an oriented bar within the target stimulus was either vertical or horizontal by pressing the "z" and "m" keys, respectively. Correct responses were followed by monetary reward feedback, which varied probabilistically with the color of the target. One target color was associated with an 80 % probability of a high reward of 15¢ and a 20 % probability of a low reward of 3¢; for the other target color, this mapping was reversed. Training thus imbued one color with a high value and the other with a lower (but positive) value. Half of the participants experienced red as the high-reward target and green as the low-reward target, and for the other half this mapping was reversed. If participants responded incorrectly, the feedback display informed them that they had received 0¢. The cumulative reward earned thus far was also displayed after each trial.
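The reward contingencies described above imply different expected payouts for the two target colors. The following sketch, with hypothetical names throughout, computes the expected reward per correct response and samples feedback for a single trial; it is an illustration of the stated 80 %/20 % schedule, not the experiment's actual code.

```python
from fractions import Fraction
import random

HIGH_CENTS, LOW_CENTS = 15, 3
P_MAJORITY = Fraction(4, 5)  # 80 % of correct trials pay the color's usual amount


def _amounts(is_high_value_color):
    """Return (usual payout, unusual payout) in cents for a target color."""
    if is_high_value_color:
        return HIGH_CENTS, LOW_CENTS
    return LOW_CENTS, HIGH_CENTS


def expected_reward(is_high_value_color):
    """Expected payout in cents for one correct response to a target color."""
    usual, unusual = _amounts(is_high_value_color)
    return P_MAJORITY * usual + (1 - P_MAJORITY) * unusual


def draw_reward(is_high_value_color, correct, rng=random):
    """Sample the feedback for one trial; incorrect responses pay 0 cents."""
    if not correct:
        return 0
    usual, unusual = _amounts(is_high_value_color)
    return usual if rng.random() < float(P_MAJORITY) else unusual


print(float(expected_reward(True)), float(expected_reward(False)))
```

Under these contingencies, a correct response to the high-value color is worth 12.6¢ on average and a correct response to the low-value color 5.4¢, so both colors predict reward but to different degrees.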

Test phase

The sequence and timing of events in the test phase is shown in Fig. 1b. Each trial consisted of a fixation display for 2,000 ms, a search display for 1,500 ms, and a blank intertrial interval for 500, 2,500, or 4,500 ms (exponentially distributed, with 500 ms occurring most often). Targets in the test phase were now defined as the unique shape, either a circle among diamonds or a diamond among circles (equally often and randomly ordered). Participants made the same judgment concerning the oriented bar contained within the target stimulus. The search items were differently colored, as in the training phase; however, the targets were never red or green. On a randomly selected one quarter of the trials, one of the nontarget items was rendered in red, and on another one quarter, one of the nontarget items was rendered in green; these constituted the formerly rewarded distractors (the remaining items will be referred to as nontargets). The test phase consisted of four blocks of 80 trials, in which the target appeared in each of the six stimulus positions with equal probability. When a distractor was present, it appeared in each of the five nontarget positions with equal probability. No monetary feedback was provided in the test phase; participants were informed of their accuracy for each block following the completion of that block.

Instructions

Throughout the experiment, participants were encouraged to respond with a buttonpress as quickly as possible while minimizing errors. Participants were instructed to ignore color during the test phase and to focus on identifying the line orientation within the unique shape. They were provided with written and verbal descriptions of the task and procedures prior to each phase of the experiment and were shown example displays. Participants were neither encouraged to move their eyes nor discouraged from doing so; they were told only that their eyes would be monitored during the task, using a camera.

Eye tracking

Eye tracking was performed at a sampling rate of 500 Hz. Nine-point calibration was used. Calibration was checked at the beginning of each block, and the tracker was recalibrated as necessary. Head position was maintained using a chinrest, and eye position was measured without applying online drift correction. We utilized a display in which all stimuli were presented peripherally from the center of the screen; an advantage of this mode of presentation is that eye movement directions on the x-axis would be robust to any drifts in measured eye position. Saccadic eye movements were defined as those for which velocity exceeded 30°/s and acceleration exceeded 8,000°/s². The first eye movement for each search display was measured as the first saccade exceeding 1° of visual angle that occurred at least 100 ms following the onset of the search array.
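The rule for selecting the first eye movement on each trial can be sketched as follows. The sketch assumes that saccades have already been parsed from the raw samples by the velocity and acceleration criteria, and that each saccade is summarized by its onset time and amplitude; the `Saccade` record and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Saccade:
    onset_ms: float       # latency relative to search-array onset
    amplitude_deg: float  # size of the movement in degrees of visual angle


def first_saccade(saccades: Sequence[Saccade],
                  min_latency_ms: float = 100.0,
                  min_amplitude_deg: float = 1.0) -> Optional[Saccade]:
    """Return the earliest saccade larger than 1 deg that starts at least
    100 ms after display onset, or None if no saccade qualifies."""
    for saccade in sorted(saccades, key=lambda s: s.onset_ms):
        if (saccade.onset_ms >= min_latency_ms
                and saccade.amplitude_deg > min_amplitude_deg):
            return saccade
    return None


# Hypothetical trial: an anticipation, a small movement, then a full saccade.
example_trial = [Saccade(60, 2.0), Saccade(180, 0.4), Saccade(240, 5.5)]
print(first_saccade(example_trial))
```

In this illustration the movement at 60 ms is skipped as too early and the 0.4° movement as too small, so the saccade at 240 ms is taken as the first eye movement.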

Pupil size was obtained starting 200 ms before until 2,000 ms after the onset of the search array in eleven 200-ms bins. Pupil dilation was measured as the percent change in pupil size for each bin, relative to the 200-ms predisplay baseline (as in, e.g., Laeng, Orbo, Holmlund, & Miozzo, 2011). Note that the search array involved the onset of several luminous stimuli against a black background, which would alone be expected to cause slight constrictions of pupil size. In order to avoid confounding display duration with RT while measuring changes in pupil dilation, all stimulus displays were presented for a fixed duration on every trial.
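The baseline-relative measure described above can be sketched as follows, assuming a 500-Hz trace that begins 200 ms before search-array onset and contains no missing samples. The function and constant names are hypothetical.

```python
SAMPLE_RATE_HZ = 500
BIN_MS = 200
SAMPLES_PER_BIN = SAMPLE_RATE_HZ * BIN_MS // 1000  # 100 samples per 200-ms bin


def _mean(values):
    return sum(values) / len(values)


def percent_change_by_bin(pupil_trace):
    """Percent change in pupil size for each post-onset 200-ms bin.

    pupil_trace holds samples from -200 ms to +2,000 ms around display onset;
    its first bin is the predisplay baseline.
    """
    baseline = _mean(pupil_trace[:SAMPLES_PER_BIN])
    changes = []
    for start in range(SAMPLES_PER_BIN, len(pupil_trace), SAMPLES_PER_BIN):
        bin_mean = _mean(pupil_trace[start:start + SAMPLES_PER_BIN])
        changes.append(100.0 * (bin_mean - baseline) / baseline)
    return changes


# Hypothetical trace: flat for the first second, then 2 % larger.
example_trace = [1000.0] * 600 + [1020.0] * 500
print(percent_change_by_bin(example_trace))
```

Expressing each bin relative to the predisplay baseline removes trial-to-trial differences in resting pupil size, so that only the change evoked by the display remains.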

Data analysis

Manual RTs on error trials and RTs more than three standard deviations above or below the mean of their respective conditions for each participant were excluded from the RT analysis (together, this resulted in the removal of 6.5 % of the trials). Saccades occurring less than 100 ms following the onset of the search array were considered anticipations and were not included in the eye movement analysis; this resulted in the removal of fewer than 1 % of all initial saccades. Blinks were eliminated by trimming samples spanning 100 ms (50 samples) both before and after the pupils were lost by the eyetracker.
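The two exclusion steps can be sketched as follows. The sketch assumes a single trimming pass using the sample standard deviation and assumes that samples lost by the eyetracker are marked as `None`; neither detail is specified in the text, and the function names are hypothetical.

```python
from statistics import mean, stdev


def trim_rts(rts, n_sd=3.0):
    """Keep RTs within n_sd standard deviations of the condition mean."""
    m, sd = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= n_sd * sd]


def mask_blinks(pupil, pad=50):
    """Discard `pad` samples (100 ms at 500 Hz) on each side of every lost sample."""
    cleaned = list(pupil)
    lost = [i for i, value in enumerate(pupil) if value is None]
    for i in lost:
        for j in range(max(0, i - pad), min(len(pupil), i + pad + 1)):
            cleaned[j] = None
    return cleaned


# Hypothetical correct-trial RTs (ms) for one participant and condition.
example_rts = [600.0] * 20 + [2000.0]
print(len(trim_rts(example_rts)))

# Hypothetical pupil trace with a single lost sample at index 100.
example_pupil = [5.0] * 200
example_pupil[100] = None
print(mask_blinks(example_pupil).count(None))
```

Padding each blink removes the distorted pupil estimates that occur while the eyelid is partially closed just before and after the pupil is lost.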

Results and discussion

Training phase

Manual response time and accuracy

We first examined evidence that high-reward targets exhibited increased attentional priority during the training phase. Participants identified high-reward targets significantly faster than low-reward targets (603 vs. 615 ms), t(14) = 2.19, p = .046, d = 0.56, and they were more accurate on trials containing a high-reward target, although not significantly so (93.5 % vs. 92.0 %), t(14) = 1.90, p = .078, d = 0.49.
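The text does not state how the effect sizes were computed. The values in this comparison are consistent, to within rounding, with the common within-subjects convention d = t / √n, which the following sketch illustrates; the function name is hypothetical, and the convention itself is an assumption rather than a documented detail of the analysis.

```python
import math


def paired_d_from_t(t_value, n_participants):
    """Within-subjects effect size from a paired t statistic: d = t / sqrt(n)."""
    return t_value / math.sqrt(n_participants)


# The RT and accuracy comparisons above, each with 15 participants.
print(round(paired_d_from_t(2.19, 15), 2))
print(round(paired_d_from_t(1.90, 15), 2))
```

Small discrepancies in the second decimal place are expected because the reported t values are themselves rounded.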

Eye movements

We next assessed how eye movements were affected by the position and value of the targets during training. Eye movements were first analyzed according to which side of the visual field the first saccade was directed. The first saccade following the onset of the search display was significantly more likely to land on the side of the visual field containing the target than on the opposite side (83 % to side of target), t(13) = 8.95, p < .001, d = 2.39. Saccades were more likely to occur to the side of the visual field containing a high-reward target, as compared with a low-reward target, an effect that was marginally significant (mean difference = 4.3 %), t(13) = 2.05, p = .061, d = 0.55. Initial saccades falling within 1° of the target (41 % of all initial saccades) occurred significantly more often for high-reward targets than for low-reward targets (mean difference = 5.5 %), t(13) = 2.53, p = .025, d = 0.68.

Pupillometry

Changes in pupil diameter following the onset of the target display were measured throughout the course of the five blocks of training. The magnitude of pupil dilation following target display onset increased across blocks (Fig. 2), consistent with increasingly pronounced reward prediction and anticipation. This effect was confirmed by a significant ANOVA on mean change in pupil diameter over trial block, F(4, 52) = 7.65, p < .001, ηp² = .371, which exhibited a linear trend, F(1, 13) = 8.43, p = .012, ηp² = .393. There was a weak but nonsignificant trend toward greater pupil dilation in response to the high-reward target, as compared with the low-reward target (mean difference = 0.45 %), t(13) = 0.91, p = .378.

Fig. 2

a Mean percent change in pupil diameter (relative to the 200-ms interval before trial onset) by trial block during the training phase of Experiment 1. b Mean percent change in pupil diameter over time by trial block in response to the target display in the training phase. The error bars reflect the within-subjects SEM (Loftus & Masson, 1994)

The presentation of the target display came to evoke pupil dilations that increased over time, suggesting increasingly effective reward prediction and anticipation. However, it is unclear from this analysis whether the observed effects were driven by the presentation of a reward-associated stimulus or by the execution and monitoring of a motor response. This is because motor responses were executed during the measured 2,000-ms interval and were necessary to obtain reward. However, the effect of trial block on pupil dilation was evident as early as 200–400 ms following the presentation of the target display, well before the average RT of over 600 ms, F(4, 52) = 3.07, p = .024, ηp² = .191 [linear trend: F(1, 13) = 5.18, p = .041, ηp² = .285]. This shows that the observed increase in pupil dilation over trial block was influenced specifically by the presentation of the target stimulus, although response-related effects may have contributed to the overall result at later time points. Together, these results provide evidence that participants learned the association between the color targets and reward during the training phase.

Test phase

Manual response time and accuracy

We first examined evidence that the previously rewarded distractors captured covert attention. Planned t-tests were performed on the manual RT data in order to compare them with our previous findings (Anderson et al., 2011b). Participants were significantly slower to respond to the target line orientation contained in the shape singleton when a high-value distractor was present, as compared with when no distractor was present (Fig. 3a), t(14) = 2.34, p = .035, d = 0.60; the low-value distractor produced an intermediate degree of slowing, t(14) = 1.83, p = .088, d = 0.47. This replicates our previous demonstration of value-driven attentional capture (Anderson et al., 2011b). A value-driven impairment in performance was evident for the high-value distractor in accuracy as well, with the low-value distractor again producing an intermediate degree of impairment (Fig. 3b), t(14) = 3.15, p = .007, d = 0.81, and t(14) = 1.07, p = .302, d = 0.28, respectively. In neither case did the high-value distractor impair performance significantly more than the low-value distractor (both ps > .05).

Fig. 3

Manual response time (a) and accuracy (b) by condition during the test phase of Experiment 1. The error bars reflect the within-subjects SEM. *p < .05; **p < .01

Eye movements

Overall, participants moved their eyes from fixation on 88 % of the trials in the test phase. However, there were substantial individual differences in the number of trials on which a saccade occurred; some participants moved their eyes on as many as 100 % of the trials, and others on as few as 63 % of the trials. The occurrence of saccadic eye movements was generally associated with poorer task performance: Participants who made fewer initial saccades tended to respond faster, r = .553, p = .040, and more accurately, r = −.571, p = .033, than did those who made more frequent initial saccades. Across all trials, participants produced slower and less accurate manual responses on trials on which they moved their eyes, as compared with when they remained fixated [mean RT difference = 112 ms, t(760.9) = 14.61, p < .001, d = 0.58 (t-test corrected for inhomogeneity of variance); mean accuracy difference = 3.1 %, χ²(1) = 8.31, p = .004, φ = .043]. This outcome shows that eye movements were not required to perform the task well. Participants were no more likely to break fixation when a formerly rewarded distractor was present than when it was absent (mean difference < 1 trial), t(13) = 0.52, p = .612.

Trials on which a saccade occurred were analyzed in order to assess how these movements were influenced by formerly rewarded distractors. The direction of the first saccade on each trial was analyzed according to whether a distractor was present and, if so, whether it appeared on the same side of the display as the target or not [the effects reported below did not depend on whether the high-reward target color was red or green, F(4, 48) = 1.06, p = .385, so further analyses were collapsed across this variable]. The side of the display to which an initial saccade was directed was influenced by the presence and relative location of a formerly rewarded distractor (Fig. 4a), as indicated by a significant ANOVA, F(4, 52) = 11.79, p < .001, ηp² = .476.

Fig. 4

a Percentage of initial eye movements to the side of the visual field containing the target by distractor condition in Experiment 1. b Scatterplot of visual working memory capacity versus value-driven oculomotor capture (difference in percentage of initial eye movements toward the side of the display opposite the target on trials with a high-value distractor opposite the target vs. on trials with no distractor). The error bars reflect the within-subjects SEM. *p < .05; ***p < .001

We focused our next analyses specifically on trials on which the distractor and target were presented on opposite sides of the visual field and, thus, competed for the direction of a saccade. Post hoc contrasts revealed that participants were more likely to make an initial saccade to the side of the visual field opposite the target when a high-value distractor was present on that side, as compared with when it was absent, t(13) = 5.57, p < .001, d = 1.49, while the low-value distractor produced an intermediate level of such oculomotor impairment, t(13) = 1.87, p = .084, d = 0.50; this is an indication of value-driven oculomotor capture. Even in the last block of trials, the decrement in saccadic accuracy caused by the high- and low-value distractors was still evident, t(13) = 2.48, p = .028, d = 0.66, indicating that value-driven oculomotor capture was persistent. The behavioral impairment in RT caused by the formerly rewarded distractors was still evident when participants did not move their eyes to the side of the visual field containing the distractor, t(13) = 2.33, p = .037, d = 0.63, indicating that value-driven attentional capture does not necessarily result in value-driven oculomotor capture. However, value-driven oculomotor capture was associated with a large cost in RT such that responses were 80 ms slower when participants looked at the distractor, t(13) = 4.21, p = .001, d = 1.04, indicating that value-driven oculomotor capture substantially impaired performance. For trials on which the distractor and target were presented on the same side of the visual field, participants were also more likely to make their initial saccade to the side of the visual field containing the target when a high-value distractor or a low-value distractor was also present on that side, as compared with when it was absent, t(13) = 2.76, p = .016, d = 0.74, and t(13) = 3.63, p = .003, d = 0.97, respectively.

In order to determine whether our measure of value-driven oculomotor capture reflected a spatially specific effect of distraction, the number of initial saccades falling within 1° of visual angle of the target, a high- or low-value distractor, and a nontarget stimulus was measured on distractor-present trials. The probability of looking at a nontarget stimulus was defined as the probability of initially fixating any nontarget stimulus divided by the number of nontarget stimuli present in the display (four nontargets on distractor-present trials). This analysis revealed that initial fixations occurred to the target 46 % of the time, to the formerly rewarded distractor 22 % of the time, and to any given nontarget stimulus 8 % of the time. The probability of fixating a distractor was significantly greater than the probability of fixating any given nontarget stimulus, t(13) = 5.26, p < .001, d = 1.40, and this difference was evident even in the last block of trials, t(13) = 2.40, p = .032, d = 0.64.
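The per-item normalization described above can be sketched as follows. The sketch assumes that the percentages are computed over initial fixations that landed on some item in the display, which is an inference from the fact that the reported values sum to 100 % (46 + 22 + 4 × 8); the function name and the counts are hypothetical.

```python
def fixation_probabilities(n_target, n_distractor, n_nontarget_total,
                           n_nontarget_items=4):
    """Probability of first fixating the target, the distractor, and any one
    given nontarget (total nontarget fixations divided by the item count)."""
    total = n_target + n_distractor + n_nontarget_total
    return {
        "target": n_target / total,
        "distractor": n_distractor / total,
        "per_nontarget": n_nontarget_total / total / n_nontarget_items,
    }


# Hypothetical counts chosen to match the reported percentages.
example_probs = fixation_probabilities(46, 22, 32)
print(example_probs)
```

Dividing by the number of nontargets puts the comparison on a per-item basis, so that the distractor is not disadvantaged merely because there are four nontargets and only one distractor.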

The magnitude of value-driven oculomotor capture was significantly correlated with VWM capacity across individuals (Fig. 4b), r = −.693, p = .006. Thus, individuals with relatively low VWM capacity were more susceptible to value-driven capture than were those with relatively high capacity. The magnitude of value-driven oculomotor capture was not significantly correlated with the percentage of trials on which a saccade occurred, r = .191, p = .513, meaning that this relationship was not driven by individual differences in the tendency to initiate an eye movement versus remain fixated.

Experiment 2

Previous studies from our lab have shown that value-driven attentional capture is not driven merely by the fact that the distractors had previously served as targets but is specifically attributable to prior reward learning. In two previous studies, we employed even longer training phases in which no reward feedback was provided. Under these conditions, former-target-colored stimuli do not capture attention in a subsequent test phase (Anderson et al., 2011a, b). In one recent experiment, we showed that under certain conditions, former targets can have even less attentional priority than former nontargets, perhaps because of their perceptual novelty (Anderson et al., 2012). These results undermine a “former-target bias” as an explanation for our results. The development of automatic orienting to former targets is thought to arise only after a substantial amount of experience that far exceeds the 300 trials used in the training phase of our Experiment 1 (e.g., Kyllingsbaek, Schneider, & Bundesen, 2001; Shiffrin & Schneider, 1977). Under conditions of relatively little training, as in the present experiments, attentional control settings can be rapidly updated to reflect changing task demands (Lien, Ruthruff, & Johnston, 2010).

It is unclear, however, whether the observed increase in pupil dilation across trial block in the training phase of Experiment 1 is specifically attributable to reward learning and prediction. This increase may, instead, reflect a more general aspect of learning or experience, and without a no-reward baseline, it is impossible to distinguish between these possibilities. To definitively determine whether the effect of rewarded targets on pupil size was driven specifically by reward prediction, we conducted a version of the training phase that was equivalent to the training phase of Experiment 1, except that monetary reward feedback was not provided.

Method

Participants

Fourteen new participants were recruited from the Johns Hopkins University community. All were screened for normal or corrected-to-normal visual acuity and color vision. Participants were compensated with extra credit toward one of a variety of undergraduate psychology courses.

Apparatus

The apparatus was identical to that used in Experiment 1.

Experimental task

The experimental task was identical to that in the training phase of Experiment 1, with the exception that the reward feedback display was replaced with a white check mark if participants responded correctly and a white "X" if participants responded incorrectly. The experiment took approximately 1 h to complete.

Instructions

Participants were given the same instructions as in Experiment 1 concerning the experimental stimuli and how they were expected to perform.

Eye tracking

Eye tracking was performed using the same procedures as those used in Experiment 1.

Data analysis

The data analysis was similar to that used in Experiment 1, except that only pupil size, and not saccade direction, was analyzed.

Results and discussion

Participants performed the task with a mean overall accuracy of 90 %. Importantly, pupil size did not increase throughout the course of training as it did in Experiment 1 (see Fig. 5). Pupil dilation to the target display did not differ significantly across trial block, F(4, 52) = 2.31, p = .070 [linear trend: F(1, 13) = 1.43, p = .254]. Comparing Experiments 1 and 2, the presence of reward feedback during the training phase significantly interacted with the effect of trial block on mean pupil dilation, F(4, 104) = 2.70, p = .035, ηp² = .094 [linear trend: F(1, 26) = 4.61, p = .041, ηp² = .150], and the effect of trial block on mean pupil dilation was greater in magnitude in Experiment 1 than in Experiment 2 (p < .001, randomization test on the difference in the F-value for the main effect of block). This result demonstrates that reward learning indeed potentiated the effect of pupil dilation observed in Experiment 1, causing the pupils to increasingly dilate in response to the reward-predictive targets as the association between those targets and the delivery of reward was learned.

Fig. 5

Mean percent change in pupil diameter (relative to the 200-ms interval before trial onset) by trial block during the training phase of Experiment 2

General discussion

Physically salient, goal-related, and valuable stimuli all capture covert attention involuntarily (e.g., Anderson et al., 2011b; Folk et al., 1992; Theeuwes, 1992). Salient and goal-related stimuli also elicit oculomotor capture (e.g., Ludwig & Gilchrist, 2002, 2003; Theeuwes et al., 2003). The present results reveal that stimuli imbued with value via reward learning also draw eye movements involuntarily, even when they are inconspicuous, not task-relevant, and currently unrewarded.

Our results extend to the oculomotor domain prior evidence that reward-associated stimuli have high attentional priority (Della Libera & Chelazzi, 2009; Della Libera, Perlato, & Chelazzi, 2011; Krebs, Boehler, & Woldorff, 2010; Navalpakkam, Koch, Rangel, & Perona, 2010; Peck, Jangraw, Suzuki, Efem, & Gottlieb, 2009; Raymond & O'Brien, 2009; Serences, 2008; Serences & Saproo, 2010) and capture covert attention involuntarily (Anderson et al., 2011a, b, 2012). During the training phase, participants exhibited higher eye movement accuracy for high-reward targets. In the test phase, participants were persistently more likely to move their eyes toward formerly rewarded but irrelevant stimuli.

The difference in the degree of impairment caused by the high- and low-value distractors was consistent but fairly small and not statistically significant in the present study; this echoes our previous findings (Anderson et al., 2011b). When the distractor is presented in the test phase of these studies, it has always been the most valuable stimulus in the display, regardless of the absolute amount of reward it was associated with previously. Therefore, a mechanism of value-based selection that automatically orients the observer to the most valuable stimulus in the display would be expected to behave similarly for high- and low-value distractors in our experiments. The degree to which value-based attentional and oculomotor priority is sensitive to the absolute reward value of stimuli is unknown. It would be informative for future studies to place differently valued distractors in direct competition with one another. Our own research and that of others suggest that the difference in attentional capture between high- and low-value distractors is more pronounced when these distractors are also physically salient (Anderson et al., 2011a, 2012; Hickey et al., 2010a, b, 2011).

The results of Experiment 1 demonstrated value-driven oculomotor capture even though eye movements were neither encouraged nor required to perform the experimental task. This suggests that, at least in the case of oculomotor capture by valuable stimuli, saccades to stimuli with sufficiently high attentional priority cannot easily be suppressed. This finding has important implications for theories of oculomotor control during naturalistic eye movements, since it implies that oculomotor capture can occur as the result of an involuntary process of selection that does not critically depend on the voluntary initiation of an eye movement.

Individual differences in VWM capacity are thought to reflect differences in the degree to which individuals are capable of resisting distraction, and it is known that VWM capacity is correlated with the degree to which individuals can effectively exert top-down control to prevent attentional capture (Fukuda & Vogel, 2009, 2011). Recently, we reported that individual differences in VWM capacity are also correlated with individual differences in the magnitude of impairment caused by formerly rewarded distractors (Anderson et al., 2011b). In the present study, we extend this finding to the domain of oculomotor capture by showing that VWM capacity is correlated with the degree to which formerly rewarded stimuli capture the eyes.

When an organism learns that a stimulus predicts reward (Schultz et al., 1997; Waelti et al., 2001), the appearance of that stimulus comes to elicit anticipatory arousal that is reflected in dilations of the pupil (e.g., O'Doherty et al., 2006). Throughout the course of training, we tracked reward learning by measuring changes in pupil diameter elicited by the target display. The pupils dilated little to the target display at the beginning of training, when the reward contingencies were not well learned. Dilation then increased sharply by the second block of trials and continued to increase gradually as reward learning progressed during the training phase. The results of a control experiment (Experiment 2) confirmed that reward potentiates this effect of the target display on pupil dilation.

Our findings address several fundamental issues concerning the role of reward in attention and perception. By measuring changes in pupil dilation, we provide evidence that stimulus–reward associations develop for reward-predicting targets of visual search, and that these associations guide both attention and the eyes to high-value stimuli. The learning of these associations is accompanied by involuntary attentional and oculomotor selection of formerly rewarded stimuli, even when these stimuli are irrelevant to the task and no longer predict reward.

The effect of the formerly rewarded distractors on eye movements provides compelling evidence that the influence of learned value on perception is spatially localized and stimulus specific, reflecting a distinctly selection-based mechanism. The involuntary effects of formerly rewarded but irrelevant stimuli are particularly pronounced in individuals who exhibit poor attentional control (i.e., those with low VWM capacity), which argues against a persisting value-based strategy as an explanation for our results.

These experiments show that in healthy and otherwise cognitively unimpaired individuals, reward learning has a powerful and persistent influence on both overt and covert stimulus selection. This influence on selection does not rely upon a specific demand of the ongoing task but, instead, occurs naturally in visual search. These findings may have important implications for studies of drug addiction and related syndromes in which previously rewarding stimuli can have a powerful influence on overt and covert selection processes that persist despite contravening goals (e.g., Field & Cox, 2008; Garavan & Hester, 2007; Robinson & Berridge, 2008).