Introduction

Maintaining and manipulating visual information is a core capability that is essential not only for higher cognitive functions, but also for simple tasks that we perform hundreds of times every day. Given the highly limited capacity of visual working memory (VWM; Cowan, 2001; Luck & Vogel, 1997), there is a need to update its contents in a dynamic manner. In recent years, it has been shown that attention can be oriented toward representations in VWM, yielding better memory for the respective items in the focus of attention (e.g., Astle, Summerfield, Griffin, & Nobre, 2012; Griffin & Nobre, 2003; Murray, Nobre, Clark, Cravo, & Stokes, 2013). This allows for a flexible modulation of VWM contents: Depending on how relevant information is or how likely it is to become relevant again, information can be maintained within or outside the focus of attention, or it can even be irretrievably excluded from memory (Heuer & Schubö, 2016, manuscript submitted for publication; Williams, Hong, Kang, Carlisle, & Woodman, 2013; Zokaei, Ning, Manohar, Feredoes, & Husain, 2014). Experimentally, attentional selection in VWM is typically investigated with so-called retrocues, which are presented during the retention interval of a VWM task and mark specific items as more behaviorally relevant than others by cueing the locations at which these had been presented.

It has proven fruitful to study the deployment of attention in the mnemonic domain in relation to the already better understood deployment of attention in the perceptual domain. Attentional orienting in these two domains will in the following sections be referred to as internal and external attention (Chun, Golomb, & Turk-Browne, 2011; Kiyonaga & Egner, 2013). Previous research has shown that internal and external attention recruit a largely overlapping neural network involving parietal, frontal, and occipital areas (Nee & Jonides, 2009; Nobre et al., 2004; Tamber-Rosenau, Esterman, Chiu, & Yantis, 2011), and that they are strikingly similar with respect to the benefits and costs resulting from valid and invalid cueing (Astle, Scerif, Kuo, & Nobre, 2009; Griffin & Nobre, 2003). Building on this demonstration of a high degree of correspondence, in the present experiments we investigated whether internal attention is also similar to external attention with regard to the type of information that can be used to guide attention.

Feature-based and spatial attentional orienting

For external attention, the types of stimulus characteristics that can be used for attentional selection have been studied extensively. For example, spatial locations (Carrasco, 2011; Posner, 1980); single features such as the color, shape, or direction of motion (e.g., Bichot, Rossi, & Desimone, 2005; Martinez-Trujillo & Treue, 2004; Saenz, Buraĉas, & Boynton, 2002); a conjunction of features (e.g., Buracas & Albright, 2009; Nordfang & Wolfe, 2014; Weidner & Müller, 2013); or categories of more complex objects, such as faces (e.g., New, Cosmides, & Tooby, 2007; Serences, Schwarzbach, Courtney, Golay, & Yantis, 2004; Theeuwes & Van der Stigchel, 2006) can all be attentionally selected. At a broader level, a distinction is typically drawn between feature-based and spatial attention (e.g., Carrasco, 2011; Eimer, 2014).

Attentional mechanisms operating on different types of information are important for optimizing the visual system, because they allow us to flexibly rely on whatever kind of relevant information is available. When we are meeting a friend in a crowded place, we may monitor the area surrounding the fountain where he said he would wait (spatial attention) and look out for blue objects, because we expect him to wear a blue jacket (feature-based attention). Or we may have no idea where we left that scarf, so we will scan the room for yellow objects.

In light of the limited capacity of VWM, a similar flexibility with respect to the type of maintained information that can be attentionally selected would be highly advantageous. Typically, however, the investigation of internal attention has employed retrocues that provide spatial information as to the behavioral relevance of objects previously presented at specific locations. In fact, studies have almost exclusively been based on spatial retrocues (e.g., Astle et al., 2012; Matsukura, Luck, & Vecera, 2007; Nobre et al., 2004; Sligte, Scholte, & Lamme, 2008) or ones that cued entire object categories (Lepsien & Nobre, 2007; Lepsien, Thornton, & Nobre, 2011). Only a few studies have used retrocues relying on other types of information, and the question of whether the distinction between feature-based and spatial attention also applies to internal attention has so far been largely neglected.

The first to systematically investigate different retrocue types were Berryhill, Richmond, Shay, and Olson (2012). They presented participants with a retrocue consisting of either a number that mapped onto the location of one item, a dash appearing at the location of one item, or an arrow pointing to one location. A retrocueing benefit, as compared to a condition with an uninformative neutral cue, was only found for arrow retrocues, from which the authors concluded that this cueing benefit does not generalize across various cue types, as has been demonstrated for the perceptual domain. These null effects for other types of retrocues, however, might have been due to experimental details that left participants unable or unwilling to make use of the cue. For example, the interval between the retrocue and the probe item was unusually short (400 ms), and especially a symbolic retrocue such as a number might require more time to be processed. Moreover, the retrocue types varied randomly on a trial-by-trial basis, and such a frequent switch between retrocue types might require additional effort. Thus, although the findings of Berryhill et al. demonstrated that the arrow retrocue is particularly robust, they do not necessarily preclude that other retrocue types could also yield a benefit for cued items. Indeed, Pertzov, Bays, Joseph, and Husain (2013) found that retrocueing an object’s color was just as advantageous as retrocueing an object’s spatial location, in a task in which memory for a third feature (orientation) was assessed. Similarly, Li and Saiki (2014) observed equivalent benefits for color and location retrocues when color–location conjunctions were to be memorized. However, as was pointed out by Pertzov et al., it is possible that participants still adopted a spatial strategy when presented with a feature-based retrocue. 
They might have used the information provided by the cue, such as the color of the item, to retrieve information about the item’s location and then to deploy their attention to that location. If this were the case, feature-based and spatial retrocues would rely on the same space-based mechanism of attention. To examine whether these two types of retrocues rely on the same or on different attentional mechanisms, one can draw on certain differences between feature-based and spatial attention that have been established for the external domain.

Allocating attention to noncontiguous locations

For external attention, there is evidence that feature-based and spatial attention operate differently and rely on slightly different cortical regions (Giesbrecht, Woldorff, Song, & Mangun, 2003; Greenberg, Esterman, Wilson, Serences, & Yantis, 2010; Schenkluhn, Ruff, Heinen, & Chambers, 2008; Slagter et al., 2007; Vandenberghe, Gitelman, Parrish, & Mesulam, 2001). Of particular interest in the present context are differences between feature-based and spatial mechanisms in the allocation of attention to noncontiguous locations. Feature-based attention has been shown to modulate the activity of neurons throughout visual cortex, enhancing the processing of stimuli with behaviorally relevant features across the entire visual field, within and outside the spatial locus of attention (Maunsell & Treue, 2006; Saenz et al., 2002; Saenz, Buraĉas, & Boynton, 2003; Treue, 2003). Thus, feature-based attention can be allocated in parallel to multiple locations. Whether the same holds true for spatial attention is still under debate (Cave, Bush, & Taylor, 2010a, 2010b; Eimer, 2014; Eimer & Grubert, 2014; Jans, Peters, & De Weerd, 2010a, 2010b).

For internal attention, it has not yet been investigated whether the representations of objects presented at noncontiguous locations can be selected and protected, just like those of objects presented at contiguous locations. This is not just an interesting question in itself, but may also shed light on whether the effects of feature-based and spatial retrocues do indeed rely on different attentional mechanisms. The pronounced similarities between internal and external attention can be taken to suggest that internal feature-based and spatial attention might differ with respect to enhancing representations within the spatial layout of VWM, just as external feature-based and spatial attention differ with respect to enhancing perceptual processing across the visual field.

Rationale of the experiment

In two experiments, we investigated whether attention can selectively influence the contents of VWM operating on features just as well as on spatial locations. During the maintenance interval of a visual working memory task, a retrocue was presented that indicated two of the memorized items as being relevant. The efficacy of different types of spatial and feature-based retrocues was tested (see Fig. 1). In addition to a typical spatial retrocue (an octagram with blackened corners pointing toward two locations), we also presented a more symbolic spatial retrocue (numbers mapping onto two locations). Feature-based retrocues were a color retrocue (a blob the color of two of the memorized items) in the first experiment, and a shape retrocue (an outline of the shape of two of the memorized items) in the second experiment. To determine whether these feature-based and spatial retrocues relied on different attentional mechanisms and were not, in the case of feature-based retrocues, simply (re)coded into spatial information, we examined whether they differed with respect to access to the representations of items presented at contiguous and noncontiguous locations. To this end, either two neighboring or two nonneighboring items were cued. It should be noted that an ongoing debate concerns the number of items that can be retroactively cued with resulting benefits: Whereas some studies have only observed benefits for cueing one item (e.g., Makovski & Jiang, 2007), others have also found benefits for cueing two items (e.g., Poch, Campo, & Barnes, 2014). This debate and the implications of the present study with respect to this issue are revisited more thoroughly in the Discussion section. To allow for the presentation of items spaced closely enough to be considered as neighboring or nonneighboring without exceeding mean VWM capacity, the task was lateralized. In each trial, participants were to memorize four items presented in one hemifield.

Fig. 1

Trial procedure and retrocue types for Experiments 1 (top) and 2 (bottom). A trial started with a precue presented above fixation for 200 ms, indicating the relevant hemifield for that trial. After an interval of 800 ms, a memory array consisting of eight items was presented for 200 ms. Participants were to remember the four orientations (Exp. 1) or colors (Exp. 2) of the items in the indicated hemifield. After another interval of 800 ms, the retrocue was presented. This retrocue was either informative, indicating two of the previously presented memory items as being task-relevant, or neutral (“X”), providing no information as to the task relevance of specific items. Three different informative retrocue types were presented in each experiment. The spatial retrocues were octagrams with two blackened corners pointing to two locations of previously presented memory items. Symbolic retrocues consisted of two numbers mapping onto two of the four spatial locations in each hemifield (1 to 4, from top to bottom). These two types of spatial retrocues were identical in both experiments. The feature-based retrocue was a blob of the color (Exp. 1) or an outline of the shape (Exp. 2) of two of the previously presented memory items. After an interval of 800 ms, a probe item was presented at one of the memory item locations. Participants were to indicate whether this probe item was of the same orientation (Exp. 1) or color (Exp. 2) as the memory item that had previously been presented at that location. In trials with an informative retrocue, one of the two items indicated by the retrocue was probed. In trials with a neutral retrocue, one of the initial four memory items was tested.

On the basis of the high similarity between external and internal attention, as well as the few previous studies reporting benefits for retrocues based on nonspatial information that we discussed above, we predicted overall retrocueing benefits for all retrocue types. Crucially, however, we expected to observe differences between the feature-based and spatial retrocues in the analysis of neighboring and nonneighboring cued items. Specifically, for feature-based retrocues we expected to observe retrocueing benefits irrespective of the cued items’ spatial configuration—that is, independent of whether the items were neighboring or nonneighboring. For spatial retrocues, in contrast, we expected to observe large benefits for cueing neighboring items, but no or strongly attenuated benefits for cueing nonneighboring items. This pattern of results would corroborate the notion of internal feature-based and spatial attention mechanisms that operate in a manner similar to that of external feature-based and spatial attention.

Method

Participants

In all, 40 students from Philipps University Marburg participated in the experiments (Exp. 1: 16 female, seven male, mean age 22 years; Exp. 2: 13 female, four male, mean age 22 years), for which they received course credit or monetary compensation. All participants provided informed written consent, were naive to the purpose of the experiment, and had normal or corrected-to-normal visual acuity and color vision. Visual acuity and color vision were tested with the OCULUS Binoptometer 3 (OCULUS Optikgeräte GmbH, Wetzlar, Germany).

Apparatus

Participants were seated in a comfortable chair in a dimly lit and sound-attenuated room. Stimuli were presented on a 22-in. screen (1,680 × 1,050 pixels) placed at a distance of approximately 104 cm from the participants’ eyes. Stimulus presentation and response collection were controlled by a Windows PC using E-Prime 2.0 software (Psychology Software Tools, Inc.). Participants responded by pressing buttons on the back of a gamepad (Microsoft SideWinder USB).

Stimuli and procedure

All stimuli were presented against a gray background. There were eight fixed locations for the items in the memory array—four in each hemifield, and two in each quadrant. The memory items were arranged on an imaginary circle with a radius of approximately 4.68° of visual angle. They subtended an area of 1.10° × 0.28° of visual angle. The distance from the center of one item to the next was 3.31° of visual angle for neighboring items within one hemifield. The distance between two items that were next to each other but on opposite sides of the vertical midline was 4.68° of visual angle.

In Experiment 1, the orientations of the memory items were randomly chosen from a set of six orientations (15°, 45°, 75°, 105°, 135°, and 165°). The orientation of the probe item was either identical to the orientation of the memory item that had previously been presented at that location (on match trials) or was randomly chosen from the remaining set of orientations (on nonmatch trials). The colors of the memory items were chosen from a set of six colors (blue, green, lilac, ochre, orange, and pink). Two pairs of memory items in each hemifield were of the same color. The probe item was always presented in gray, and the colors of the memory and probe items were isoluminant.

In Experiment 2, the colors of the memory items were randomly chosen from a set of seven isoluminant colors (blue, green, ochre, orange, pink, red, and violet). A color could appear no more than once in each hemifield. The color of the probe item was either identical to the color of the memory item that had previously been presented at that location (on match trials) or was randomly chosen from the remaining set of colors (on nonmatch trials). As in Experiment 1, the memory and probe items were isoluminant. The shapes of the memory items were chosen from a set of three shapes (circles, squares, and triangles). Two pairs of memory items in each hemifield were of the same shape. The probe item was always of the same shape as the memory item that had previously been presented at the indicated location.

The precue subtended an area of 1.10° × 0.28° of visual angle. The neutral, spatial, color, and shape retrocues were 1.1° × 1.1° of visual angle in size, and the symbolic retrocue was 1.1° × 2.68° of visual angle in size. The fixation dot subtended an area of 0.02° of visual angle.

The procedure is illustrated in Fig. 1. A trial started with the presentation of a precue (an arrow) just above the fixation dot for 200 ms, which pointed toward the left or right, indicating the relevant hemifield for the trial. After an interval of 800 ms, a memory array was presented. This memory array consisted of eight memory items, and participants were asked to memorize the four items in the relevant hemifield. Participants had to remember the orientations of the items in Experiment 1, and the colors of the items in Experiment 2. After an interval of 800 ms, either an informative or a neutral retrocue was presented. An informative retrocue validly indicated two of the memory items. In Experiment 1, this was either a spatial retrocue (an octagram with two blackened corners pointing toward two locations), a symbolic retrocue (two numbers mapping onto two locations, with the locations being numbered from 1 to 4 from top to bottom in each hemifield) or a color retrocue (a blob the same color as two of the memory items). In Experiment 2, the spatial and symbolic retrocues were the same as in Experiment 1, and the feature-based retrocue was a shape retrocue (an outline the same shape as two of the memory items) instead of a color retrocue. In both experiments, there were also neutral trials, in which a noninformative neutral retrocue (an X) was presented. After another interval of 800 ms, a probe item was presented at one of the memory item locations, and participants were to indicate whether this probe item had the same orientation (Exp. 1) or color (Exp. 2) as the memory item that had previously been presented at that location, or a different one. In 50 % of all trials, a change in orientation or color occurred. In trials with an informative retrocue, the probe item was always presented at one of the locations of the two cued memory items. In neutral trials, the probe item was presented at one of the four memory item locations. 
The probe item was present until response, but participants were encouraged to respond quickly. They pressed a button with their left or right index finger, and the response assignment was balanced across participants.

Design

Testing took place in three (Exp. 1) or two (Exp. 2) sessions on separate days, with no more than two days between sessions. The first two sessions of the three-session experiment, or the first session of the two-session experiment, were shorter training sessions, and these data were not entered into the analyses. The main experiments consisted of 1,152 trials, with 384 trials for each retrocue type (spatial, symbolic, and color/shape), organized in blocks of 32 trials each. Retrocue type was varied blockwise, with a change in retrocue type every three blocks. The order in which the retrocue types were presented was balanced across participants. Retrocue information (50 % informative, 50 % neutral) was varied trialwise. Thus, neutral retrocue trials were identical across blocks but were interleaved with different types of informative retrocues. In half of all trials with informative retrocues, the two cued items were items that had been presented at neighboring locations (at Locations 1 and 2, 2 and 3, or 3 and 4, with locations numbered from top to bottom in each hemifield), and in the other half, the cued items had been presented at nonneighboring locations (at Locations 1 and 3, 1 and 4, or 2 and 4).
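The split into neighboring and nonneighboring cue pairs, and the trial counts given above, can be checked by simple enumeration. The following Python sketch is purely illustrative and is not the authors' experiment code; all names in it are hypothetical, and it assumes only the numbers stated in this section.

```python
from itertools import combinations

# Locations in one hemifield, numbered 1 to 4 from top to bottom (as in the text).
LOCATIONS = (1, 2, 3, 4)


def is_neighboring(pair):
    """A pair counts as neighboring when its location numbers differ by one."""
    first, second = pair
    return abs(first - second) == 1


# All possible pairs of two cued locations within a hemifield.
pairs = list(combinations(LOCATIONS, 2))
neighboring = [p for p in pairs if is_neighboring(p)]
nonneighboring = [p for p in pairs if not is_neighboring(p)]

# Trial counts taken from the Design section.
TOTAL_TRIALS = 1152
RETROCUE_TYPES = 3
BLOCK_SIZE = 32

trials_per_type = TOTAL_TRIALS // RETROCUE_TYPES          # trials per retrocue type
blocks_per_type = trials_per_type // BLOCK_SIZE           # blocks per retrocue type
informative_per_type = trials_per_type // 2               # 50 % informative trials
neighboring_per_type = informative_per_type // 2          # half cue neighboring items

print(neighboring)      # [(1, 2), (2, 3), (3, 4)]
print(nonneighboring)   # [(1, 3), (1, 4), (2, 4)]
print(trials_per_type, blocks_per_type, informative_per_type, neighboring_per_type)
```

Under these numbers, each retrocue type contributes 192 informative trials, of which 96 cue neighboring and 96 cue nonneighboring items. How these trials were distributed over the three possible pairs in each category is not stated in the text and is therefore not modeled here.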

Data analyses

Trials with excessively long reaction times (>2.5 SDs from the mean reaction time, calculated separately for each participant) were excluded from further analysis (on average, 2.8 % of all trials in both experiments). For each experimental condition, accuracy in percent, mean reaction time, and an index of sensitivity (d-prime [d′]) were calculated. For reaction times, only correct responses were included. The d′ scores were calculated as d′ = z(hit rate) – z(false alarm rate).
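The two preprocessing steps described above can be sketched as follows. This is a minimal illustration in Python, not the authors' analysis code. Several details are assumptions that the text does not specify: that only reaction times above the cutoff are removed ("excessively long"), that the sample standard deviation is used, and that hit and false alarm rates lie strictly between 0 and 1 (the inverse normal transform is undefined at exactly 0 or 1, and no correction for such cases is described). The function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def trim_long_rts(rts, criterion=2.5):
    """Drop reaction times more than `criterion` SDs above one participant's mean.

    Assumption: only the upper tail is trimmed, and the sample SD is used.
    """
    cutoff = mean(rts) + criterion * stdev(rts)
    return [rt for rt in rts if rt <= cutoff]


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false alarm rate).

    z is the inverse of the standard normal cumulative distribution function.
    Assumption: both rates are strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical reaction times (ms) for one participant, with one very slow trial.
example_rts = [500] * 20 + [5000]
print(len(trim_long_rts(example_rts)))

# Hypothetical hit and false alarm rates for one condition.
print(round(d_prime(0.8, 0.2), 2))
```

With equal hit and false alarm rates the function returns 0, which corresponds to chance-level sensitivity.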

Repeated measures analyses of variance (ANOVAs) with the factors Retrocue Type (spatial, symbolic, and color) and Retrocue Information (informative and neutral) were computed, followed by pairwise comparisons when appropriate. Only the significant comparisons are reported. In addition, one-tailed t tests comparing performance for informative retrocues indicating neighboring and nonneighboring items against performance in the respective neutral conditions were computed, separately for each retrocue type.

Results

Experiment 1

The results of Experiment 1 are illustrated in Figs. 2 and 3, top panels. A significant main effect of retrocue type was observed for all measures [accuracy, F(2, 44) = 5.76, p = .006, ηp² = .21; reaction time, F(2, 44) = 5.51, p = .007, ηp² = .20; d′, F(2, 44) = 3.90, p = .027, ηp² = .15]. Performance was best for color retrocue blocks, followed by spatial and symbolic retrocues. Pairwise comparisons revealed that performance for color retrocues was significantly better than performance for spatial retrocues (reaction time, 17.71 ± 7 ms, p = .035) and for symbolic retrocues (accuracy, 2.10 % ± 0.6 %, p = .013; reaction time, –21.81 ± 8 ms, p = .024; d′, 0.15 ± 0.1, p = .035). Importantly, performance was significantly better in informative retrocue trials than in neutral retrocue trials across retrocue type conditions, as was shown by a significant main effect of retrocue information [accuracy, F(1, 22) = 21.46, p < .001, ηp² = .49; reaction time, F(1, 22) = 183.56, p < .001, ηp² = .89; d′, F(1, 22) = 16.14, p = .001, ηp² = .42]. Thus, there were significant benefits for informative as compared to neutral retrocues. Significant interactions [accuracy, F(2, 44) = 12.01, p < .001, ηp² = .35; reaction time, F(2, 44) = 18.31, p < .001, ηp² = .45; d′, F(2, 44) = 8.24, p = .001, ηp² = .27] revealed differences in the magnitudes of these benefits for informative as compared to neutral trials between retrocue type conditions. As is shown in Fig. 2 (top), the difference between informative and neutral retrocue trials was largest for color retrocues, followed by spatial and symbolic retrocues.
Pairwise comparisons revealed significant differences between the retrocueing benefits for color retrocues and symbolic retrocues (accuracy, 4.85 % ± 1.3 %, p = .003; reaction time, –60.0 ± 12 ms, p < .001; d′, 0.37 ± 0.1, p = .029) and for color retrocues and spatial retrocues (accuracy, 4.02 % ± 0.9 %, p = .001; d′, 0.44 ± 0.1, p = .003), as well as a difference between spatial retrocues and symbolic retrocues in terms of reaction times (–43.82 ± 10 ms, p = .001).
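The partial eta squared values reported with each F test can be recovered from the F value and its degrees of freedom through the standard identity: partial eta squared = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies this identity to two of the values reported for Experiment 1. It is offered only as a consistency check and is not part of the authors' analysis; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of retrocue type on accuracy, Experiment 1: F(2, 44) = 5.76
print(round(partial_eta_squared(5.76, 2, 44), 2))   # 0.21

# Main effect of retrocue information on accuracy, Experiment 1: F(1, 22) = 21.46
print(round(partial_eta_squared(21.46, 1, 22), 2))  # 0.49
```

Both results match the effect sizes reported in the text, which also makes the identity a convenient way to check that the degrees of freedom given for a test agree with its reported effect size.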

Fig. 2

Overall retrocueing benefits in Experiments 1 (top) and 2 (bottom). For each experiment, accuracy in percent (left column), mean reaction time (middle column), and d′ (right column) for trials with informative (dark gray) and neutral (light gray) retrocues are shown separately for the three different retrocue types (spatial, symbolic, and color/shape). Error bars show the standard errors of the means, and asterisks mark significant retrocueing benefits—that is, significant differences between informative and neutral retrocues. * p < .05; ** p < .01; *** p < .001

Fig. 3

Retrocueing benefits for neighboring and nonneighboring items in Experiments 1 (top) and 2 (bottom). For each experiment, accuracy in percent (left column), mean reaction time (middle column), and d′ (right column) for retrocues indicating neighboring (dark gray) or nonneighboring (medium gray) items and for neutral retrocues (light gray) are shown separately for the three different retrocue types (spatial, symbolic, and color/shape). Error bars show the standard errors of the means, and asterisks mark significant retrocueing benefits—that is, significant differences between informative and neutral retrocues. * p < .05; ** p < .01; *** p < .001

In a second step, we examined the retrocueing benefits for neighboring as well as for nonneighboring cued items (Fig. 3). We found significant retrocueing benefits for neighboring cued items for all retrocue types and all measures [spatial: accuracy, t(22) = 4.98, p < .001; reaction time, t(22) = –8.64, p < .001; d′, t(22) = 2.7, p = .013; symbolic: accuracy, t(22) = 2.84, p = .01; reaction time, t(22) = –3.64, p = .001; d′, t(22) = 4.05, p = .001; color: accuracy, t(22) = 9.98, p < .001; reaction time, t(22) = –11.01, p < .001; d′, t(22) = 6.72, p < .001]. For nonneighboring cued items, in contrast, significant retrocueing benefits for all measures were only observed with color retrocues [accuracy, t(22) = 3.16, p = .005; reaction time, t(22) = –11.04, p < .001; d′, t(22) = 6.72, p < .001]. In terms of reaction time, there was also a retrocueing benefit for spatial retrocues [t(22) = –5.05, p < .001].

Experiment 2

The results of Experiment 2 are illustrated in Figs. 2 and 3, bottom panels. The same analyses were conducted as for Experiment 1. Repeated measures ANOVAs with the factors Retrocue Type (spatial, symbolic, and shape) and Retrocue Information (informative and neutral retrocues) revealed significant main effects of retrocue type for accuracy [F(2, 32) = 3.94, p = .029, ηp² = .20] and d′ [F(2, 32) = 3.3, p = .048, ηp² = .17], with better performance for shape than for spatial and symbolic retrocues. Pairwise comparisons revealed significant differences between shape and symbolic retrocues (accuracy, 1.80 % ± 0.6 %, p = .026; d′, 0.12 ± 0.04, p = .039). In addition, across retrocue type conditions, performance was better in informative retrocue trials than in neutral retrocue trials [accuracy, F(1, 16) = 38.07, p < .001, ηp² = .70; reaction time, F(1, 16) = 30.55, p < .001, ηp² = .66; d′, F(1, 16) = 32.04, p < .001, ηp² = .67]. The interaction between retrocue type and retrocue information only reached significance for reaction times [F(2, 32) = 8.07, p = .001, ηp² = .34]. The benefit of informative as compared to neutral retrocues was larger for shape retrocues than for spatial and symbolic retrocues. The difference between the benefits for shape and symbolic retrocues was significant (50.70 ± 12 ms, p = .002).

In addition, we observed significant retrocueing benefits for neighboring cued items for all retrocue types and all measures [spatial: accuracy, t(16) = 2.86, p = .006; reaction time, t(16) = –5.98, p < .001; d′, t(16) = 3.5, p = .002; symbolic: accuracy, t(16) = 3.49, p = .002; reaction time, t(16) = –4.47, p < .001; d′, t(16) = 2.49, p = .012; shape: accuracy, t(16) = 4.62, p < .001; reaction time, t(16) = –7.45, p < .001; d′, t(16) = 3.54, p < .002]; see Fig. 3. For nonneighboring cued items, as in Experiment 1, significant retrocueing benefits for all measures were only observed for feature-based retrocues—that is, for the shape retrocues [accuracy, t(16) = 3.54, p = .002; reaction time, t(16) = –6.18, p < .001; d′, t(16) = 3.10, p = .004]. In terms of reaction times, there was also a significant retrocueing benefit for spatial retrocues [t(16) = –2.38, p = .015].

Discussion

The present experiments demonstrate that attention can select representations maintained in VWM not only on the basis of spatial location, but also on the basis of features of the represented objects. Retrocueing benefits for informative spatial and symbolically spatial retrocues, relative to a neutral retrocue, were observed, as well as for two different types of feature-based retrocues. Not only was retrocueing an object’s feature just as advantageous as retrocueing an object’s location, but when there were differences in the magnitudes of the observed benefits, the benefit was even larger for feature-based than for spatial retrocues. This was the case when color was the feature used for cueing (Exp. 1) and, to a lesser degree, also observed when object shape was the cued feature (Exp. 2). It should be noted that the number retrocues, for which a previous study had not observed benefits (Berryhill et al., 2012), yielded significant but relatively small benefits. This may be the result of the symbolic nature of number retrocues, which rely on an experimenter-defined mapping of numbers onto locations. Therefore, these retrocues might require more time or effort to be decoded than do more intuitive and overlearned cues, such as the blackened corners of the spatial retrocues, which resemble arrowheads. Some of the experimental parameters, such as the addition of a training session, a blockwise variation of retrocue type, and a longer interval between retrocue and probe item, might thus explain the seemingly divergent findings with respect to the efficacy of number retrocues. Interestingly, effects of number cues have also been reported to be smaller than the effects of other cues, especially arrow cues, for external attention (e.g., Olk, Tsankova, Petca, & Wilhelm, 2014; Ristic & Kingstone, 2006).

The smaller overall benefits for the two spatial retrocues than for the feature-based retrocues can be accounted for by differences in the benefits for neighboring and nonneighboring cued items. Whereas there were benefits for both neighboring and nonneighboring cued items for the two feature-based retrocues, a benefit across all performance measures for the two spatial retrocues was only found for neighboring cued items; for nonneighboring cued items, the typical spatial retrocue yielded a benefit in reaction times only. Thus, the attentional mechanisms underlying the effects of spatial and feature-based retrocues differ with respect to their access to representations of objects presented at contiguous and noncontiguous locations. These results support the idea that these attentional mechanisms are qualitatively different. Specifically, they suggest that, just as in the perceptual domain, there is a distinction between feature-based and spatial attentional selection at the mnemonic level. Whereas internal feature-based attention enhances representations with behaviorally relevant features across the spatial layout of VWM, internal spatial selection of noncontiguous representations appears to be harder to realize, if not impossible.

The present findings support the notion of similar external and internal mechanisms of feature-based and spatial attentional selection, and add to a growing body of evidence demonstrating a high degree of similarity between external and internal attention. Similarities have been observed with respect to behavioral consequences and neural networks (e.g., Griffin & Nobre, 2003; Nee & Jonides, 2009; Tamber-Rosenau et al., 2011). In light of this evidence, one might question whether making a distinction between external and internal attention is appropriate at all. However, there are also notable differences between the two domains of attentional orienting. For example, neuroimaging studies have found an additional involvement of certain medial and lateral prefrontal areas and stronger activations in parietal regions when the focus of attention is controlled within VWM (Nee & Jonides, 2009; Nobre et al., 2004; Tamber-Rosenau et al., 2011). Moreover, internal attention has also been shown to exhibit distinct behavioral characteristics: Unlike external shifts of attention, internal shifts of attention appear not to be influenced by the initial physical distance of the encoded objects, and beyond a certain threshold time, additional time for internal shifts does not yield additional benefits (Tanoue & Berryhill, 2012).

In addition, as Chun et al. (2011) pointed out, a distinction between internal and external attention is useful for organizing and structuring research and findings, and it does not necessarily imply that the mechanisms are truly separate. Importantly, Chun et al.’s taxonomy places working memory at the interface of internal and external attention. The content that attention operates on in VWM is clearly internal: These are representations of information that is no longer physically present in the outside world. However, there is a substantial overlap between VWM and (external) attention (Awh & Jonides, 2001; Gazzaley & Nobre, 2012), and it has been suggested that the two mechanisms might share an underlying resource (Anderson, Vogel, & Awh, 2013; Kiyonaga & Egner, 2013). Their linkage is, for instance, indicated by evidence showing that VWM’s content influences the deployment of external attention (e.g., Feldmann-Wüstefeld & Schubö, 2015; Olivers, Peters, Houtkamp, & Roelfsema, 2011; Soto, Heinke, Humphreys, & Blanco, 2005).

Thus, internal and external attention exhibit both commonalities and differences, and investigating the relationship between the two domains of attentional orienting remains important for clarifying the extent to which these attentional mechanisms are distinct. The present experiments contribute to our understanding of the overlap of perceptual and mnemonic processing by showing that internal feature-based and spatial attentional selection mechanisms resemble the ways that features and spatial locations are selected in the external world.

The units of VWM: Objects, features, or both?

We perceive and remember our visual surroundings in terms of objects and not as an assemblage of features, and the unit for storage in VWM is predominantly considered to be integrated objects rather than individual features (Cowan, 2001; Luck & Vogel, 1997; Luria & Vogel, 2011). The present findings are consistent with such an object-based account of VWM. Participants were instructed to remember one feature of the objects (orientation in Exp. 1, color in Exp. 2), and not the other features that were used to cue specific items. Nevertheless, these cued features could be used to weigh the items with respect to their behavioral relevance. Thus, by accessing one feature of an object, memory for another feature of that object could be improved. It is conceivable that in the present experiments, the features used to cue specific items were automatically encoded and maintained along with the to-be-memorized features. In this context, it should also be pointed out that performance in a change detection task such as the one used here always depends on one other feature that participants are almost never explicitly instructed to memorize and that is not tested, but that appears to be available nonetheless: location. Indeed, there is evidence that relative location is encoded along with information about object identity (Olson & Marshuetz, 2005). Alvarez and Cavanagh (2004) suggested that an obligatory set of features, including but not limited to location information, might form an object representation. Moreover, it has been shown that even task-irrelevant features are automatically encoded along with the relevant features of objects, and that only subsequent maintenance is under voluntary control (Marshall & Bays, 2012; Xu, 2010). Given that both location and the respective feature used for retrocueing were relevant for the task at hand, it seems reasonable to assume that, after an initial automatic encoding, these features were maintained along with the tested feature in an object-based manner. Object-based storage in VWM could also explain how a retrocue relying on one feature can yield a benefit in memory for another feature of the particular object, since it has been shown that attending to one feature of an object enhances the neural representation not only of that feature, but also of other features of the object (O’Craven, Downing, & Kanwisher, 1999).

Recently, an alternative framework to object-based models has been proposed, according to which the representational units of VWM are not objects, but features (Fougnie & Alvarez, 2011; see also Wheeler & Treisman, 1993). Fougnie and Alvarez observed that reports for different features of an object were largely independent, meaning that even when one feature was unknown, another feature of the same object could be successfully reported. They postulated that VWM is organized in self-sustaining representations that may fail independently for each feature, and that the independence of the maintained features is determined by the degree of overlap in their neural coding during perception. The memory representations for color and orientation, and also for shape and color, would consequently be largely independent, which is hard to reconcile with the present findings. Similarly, Rajsic and Wilson (2014) have suggested a location-based representational organization, with nonspatial features nested within locations. Such a model, according to which representations are indexed by location, would seem to imply that a spatial code is necessary for the attentional selection of information in VWM. Thus, it cannot account for our findings indicating that a nonspatial feature of an object can be accessed just as well via another nonspatial feature.

Although our findings support an object-based organization of VWM, they might have been influenced by task-specific demands. Consistent with the general notion of a high flexibility of VWM, and offering a possible explanation for the divergent findings with respect to its basic units, a recent study by Vergauwe and Cowan (2015) suggests that the representational units in VWM might also be flexible to some degree. Vergauwe and Cowan examined whether performance in a VWM task depended on the number of objects, the number of features, or both. Crucially, they explicitly either encouraged or discouraged the use of binding information through the instructions, and presented test probes either as integrated objects or as independent features. They found that the testing situation affected the unit for retrieval from VWM, in that it could result in a stronger emphasis on either the features or the objects. When the representation of objects was stressed by encouraging binding and presenting integrated object probes, retrieving multiple features did not take longer than retrieving a single feature. The authors concluded that the unit of VWM may not be fixed, and that relatively small changes in the experimental details can influence which unit is favored. The present experiments, one might argue, emphasized the coding of objects, in that the binding of a location and the to-be-memorized feature was required to perform the memory task itself, as well as to make use of the spatial retrocues. In order to incorporate the feature cue, this feature needed to be integrated with both the to-be-memorized feature and the location.

How many representations in VWM can be selected?

Although a number of state-based models of VWM have posited different activation states of representations that are established by the deployment of attention (LaRocque, Lewis-Peacock, & Postle, 2014), these models differ in their conceptualizations of how many items or elements can be selected. It has been proposed that only one item can be in the focus of attention (Oberauer, 2002; Olivers et al., 2011), thereby ensuring heightened processing efficiency for the selected information (Oberauer & Bialkova, 2009). Alternatively, the focus of attention might be able to encompass multiple items, up to its approximate limit of four (Cowan et al., 2005), allowing for a higher degree of flexibility and adjustability of the focus according to different task demands. Empirical evidence has been obtained for both a narrow focus of attention, of only one item (Garavan, 1998; Makovski & Jiang, 2007; Oberauer & Bialkova, 2009), and a broader focus of attention that can grasp multiple items (Cowan, 2011; Heuer & Schubö 2016, manuscript submitted for publication; Poch et al., 2014).

In the present experiments, participants had to select two representations in VWM, which yielded a benefit in memory performance relative to when no subset of representations was selected. Even though the retrocueing benefit is now well established, the observation of such a benefit for multiple cued items is not trivial, given that it is still being debated how many elements can be selected from VWM at the same time. The overall benefits observed for informative retrocues support the notion that multiple items can be in the focus of attention at the same time. An even stronger indication that both cued items were indeed attentionally selected is provided by the analyses of performance for neighboring and nonneighboring cued items. Several alternative explanations for the observation of benefits for cueing multiple items do not imply a multiple-item focus of attention, as we discuss below. These alternative explanations, however, are rendered unlikely by the effects of the different spatial configurations of the two cued items.

First, it is possible that in fact only one item was selected, although two items were cued. This might still have resulted in better performance than on trials with a neutral retrocue, seeing that on a certain proportion of trials the selected item would have been tested. But if only one of the two cued items was in the focus of attention, it should not have made any difference for the observed benefits whether the cued items had been presented at contiguous or noncontiguous locations. Such a difference, however, is exactly what we found for the two spatial retrocues: Whereas there were significant benefits for items at neighboring locations, no benefits were observed for items at nonneighboring locations.

On a related note, one might argue that a single-item focus was rapidly shifted from one cued item to the other. This might have resulted in a reduction or obliteration of benefits for items that had been presented farther apart (i.e., nonneighboring items), because the time between the presentation of the retrocue and the probe item might not have sufficed to shift attention to the second item (see Footnote 1). Importantly, however, this should have been the case not only for spatial retrocues, but also for feature retrocues, for which benefits for nonneighboring cued items were observed. Moreover, it has been shown that the distance between objects at encoding does not affect the time it takes to shift internal attention between the respective representations (Tanoue & Berryhill, 2012).

A third alternative explanation is that the two cued items were chunked, and thus effectively processed as one element (see also Oberauer & Bialkova, 2009). However, there is no obvious reason why chunking nonneighboring items would have been possible with feature-based retrocues, but not with spatial retrocues. Even though the feature-based retrocues might have highlighted commonalities between the two cued items (i.e., their common color or shape), thereby enhancing the process of chunking, this does not account for the complete absence of benefits for nonneighboring cued items in trials with spatial retrocues. Instead, the results indicate that the two cued items were treated as two elements that were selected into a multiple-item focus of attention.

We would like to point out that the contradictory conclusions with respect to the number of items that can be in the focus of attention within VWM have been reached with very different paradigms, which may account for the divergent results. For example, a change detection task involving exogenous retrocues and a number of memory items beyond working memory capacity (Makovski & Jiang, 2007) and an arithmetic task requiring operations involving serially and centrally presented digits (Oberauer & Bialkova, 2009) might pose very different demands on attentional mechanisms. On the basis of the present experiments, we cannot rule out that in some situations only one element from working memory might be attentionally accessed. But our findings strongly support the notion that under specific circumstances the internal focus of attention can encompass multiple representations.

Conclusions

Attentional selection in VWM can operate on different types of information, which can be used to update the contents of VWM according to the behavioral relevance of specific objects. This demonstrates a high flexibility of the working memory system with respect to making use of whatever information is available, ensuring that limited storage capacity is used efficiently. The mechanisms of internal feature-based and spatial attention resemble the respective mechanisms in the external domain, in that only feature-based attention can reliably select and enhance noncontiguous representations throughout the spatial layout of VWM.