Introduction

Thirty years ago – when autism was a condition still primarily characterised by deficits such as abnormal social skills, delayed and impaired language, repetitive behaviors and restricted interests1,2 – an ‘islet’ of visual skills that matched or exceeded those of typical populations was demarcated3. What has emerged since then is a set of visual tasks (e.g. embedded figures4, block design5, visual search6,7, resistance to spatial crowding8) on which individuals with Autism Spectrum Disorder (ASD) outperform controls; expressions of what we term the ‘ASD advantage’ (for reviews, see refs. 9, 10). A better understanding of the ASD advantage would help constrain neurophysiological models of ASD etiology.

A particularly robust expression of the ASD advantage is found in visual search (for a review, see ref. 11). Visual search is a classic paradigm in visual psychophysics and an ASD advantage has been found in school-age children, adolescents and adults6,7,12,13. In this paradigm, the goal is to find a target object hidden among irrelevant distractor objects, with the total number of items (the set size) varying across trials. This laboratory task captures a ubiquitous challenge of everyday life – localizing objects in cluttered scenes. Some visual search tasks are trivially easy, for instance finding a red circle amidst a field of green circle distractors; this is so-called single-feature search, where the target pops out by virtue of its unique colour. Some are more difficult, for instance, finding that same red circle, but now amidst blue circles and red squares; this is so-called feature-conjunction search, where more effortful, deliberate search is required as set size increases14.

In a recent study, we found that even 2-year-old toddlers with ASD – arguably the youngest age at which a reliable diagnosis can be made – were more successful than age-matched, typically developing (TD) controls, by up to a factor of two, at visual search15. In that eye-tracking study, a successful search trial meant the toddler fixated the target within the 4-second search interval (see Fig. 1 and Method). Intriguingly at the time, this advantage accrued without any potentially explanatory differences in gaze behavior. As reported in ref. 15, groups did not have significant differences in the number of fixations they made, the percentage of trials excluded for never having fixated an item, the total time spent dwelling on items, the total time spent searching, or even the amount of time it took to get to the target on successful trials. Previously, it has been speculated that this sort of ASD advantage stems from ‘enhanced discrimination’12,16, whereby the target is perceptually amplified, raising its salience and thereby aiding search. Here we revisit that study and find that pupillometry reveals a more basic factor at work.

Figure 1

Visual search displays.

In feature-conjunction trials, the target, a red apple, was presented amongst both same-colour and same-shape distractors. The total number of items defined the set size (9 and 5 are shown in panels a and b, respectively). In single-feature trials the target was presented amongst only a single distractor type and is therefore effortlessly found by virtue of its unique colour or shape (panel c). The search interval lasted 4 s, during which toddlers' gaze position and pupil dilation were monitored.

Task-evoked pupil responses, as opposed to the pupillary light reflex, are small (0.1–0.5 mm), but reliable and have long been taken as a sensitive, real-time, involuntary measure of attentional engagement and cognitive effort17,18. For instance, the pupil dilates more when attempting to read an incongruent word in the Stroop task19. This relationship has been found even in infants, for instance, 6- and 12-month-olds show greater dilation when viewing non-rational actions20 and 8-month-olds show greater dilation when viewing impossible events21. However, while differences in pupil dilation between ASD and TD groups have been found22, pupillometry has not yet been used to investigate the ASD advantage.

Activity of the locus coeruleus–norepinephrine (LC-NE) system, which involves, among others, thalamic nuclei, the hippocampus, the basolateral amygdala and the prefrontal cortex, is reflected in pupil dilation; pupil responses are a biomarker of its activity23,24,25. Two components of this activity have been isolated – tonic and phasic – that are positively reflected in obligatory, time-linked tonic and phasic pupil responses26. Tonic activity marks a relatively slow-changing modulation of general arousal, with low levels associated with drowsiness and high levels with distractibility. In contrast, phasic activity marks a rapid modulation of a focussed attentional state that facilitates performance on fixed, well-defined tasks, like visual search. Activity of the LC-NE system is causally related to performance on perceptual tasks via this control it exerts over attention27,28. This causal linkage between LC-NE system activity and task performance, by virtue of modulating attentional states that may or may not be advantageous to task demands, was best established through direct manipulation of the LC. In monkeys, local microinfusion of clonidine to increase LC phasic activity increases performance on a visual task, while a suppressive agent (pilocarpine) has the opposite effect29. In humans, administration of modafinil to increase phasic activity induces a concomitant pupil response, yields task-related activity in cognitive control areas (shown by fMRI) and improves performance on a visual task30. For our purposes, phasic pupil responses provide a real-time measure of attentional focus, thereby indicating the extent to which ASD and TD participants are disposed to succeed on visual search. We hypothesise that greater phasic activity in the LC-NE system of the ASD group engenders greater attentional focus, thereby contributing to the ASD advantage.

Results

Here we compared task-evoked pupil responses for the ASD and TD groups in order to determine whether 1) the ASD group shows greater phasic pupil response, 2) greater pupil response is indeed positively correlated with higher success rates on visual search and 3) the factor of phasic pupil dilation (our measure of LC-NE activity and thereby attentional state) alone is adequate to account for our search performance data.

The ASD group has greater tonic pupil dilation

Before comparing phasic pupil responses, it is useful to analyze baseline pupil levels, which reflect tonic levels of the LC-NE system. Tonic pupil levels have been shown to be as much as 1–1.2 mm greater in 2–6-year-olds with ASD31. The significant correlation with Autism Diagnostic Observation Schedule (ADOS-G, specifically) scores has even led to the suggestion of employing tonic pupil size as a means for early diagnosis32. In our study, by comparing baseline levels during the quiet period just prior to the onset of the search interval (see Method) for all trials in the mixed blocks, we also found a significantly higher tonic pupil level in the ASD group as compared to the TD group, albeit a more modest one. A two-tailed t-test comparing tonic levels was significant (pupil dilation difference between groups of 0.11 mm, t = 2.87, P = 0.004, Cohen's d = 0.25; N trials [TD, ASD] = 229, 289). It has been argued that too low or too high tonic arousal can diminish the potential for phasic LC-NE responsiveness and thereby hamper focussed attention and performance on tasks that benefit from focussed attention, following an inverted U-shaped function33. The fact that both the ASD and TD groups have tonic pupil levels in a similar range and that both show significant phasic pupil modulations relative to this tonic level (see below), indicates that participants were operating within a similar, useful range of this general arousal function. To isolate task-evoked, phasic pupil responses, all subsequent analyses subtracted out the baseline pupil level (see Method and ref. 21).

The ASD group has greater phasic pupil response during visual search

For these analyses, data was grouped based on task difficulty (as determined by the performance data reported in ref. 15) into three conditions: ‘easy’ single-feature trials, ‘easy’ low set size 5 feature-conjunction trials and ‘difficult’ high set size 9 & 13 feature-conjunction trials, combined (there were too few set size 13 trials to be analyzed separately). Baseline-corrected pupil responses over the search interval were greater for the ASD group in all three cases (Fig. 2a, b and c, respectively; see also histograms showing pupil responses binned over all trials in Fig. 2d). Two-tailed t-tests comparing pupil dilations at the end of the search intervals found these differences to be statistically significant (single-feature: t = 3.06, P = 0.003, Cohen's d = 0.47, N trials [TD, ASD] = 76, 97; set size 5 feature-conjunction: t = 3.03, P = 0.003, d = 0.51, N = 63, 81; set size 9/13 feature-conjunction: t = 2.76, P = 0.006, d = 0.40, N = 90, 108). These results indicate that while both groups showed phasic responses, the ASD group exerted greater focussed attention during the search interval.
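As a quick consistency check on the statistics reported above, the following Python sketch relates Cohen's d to the t statistic for two independent groups. It assumes a pooled-variance Student's t-test and a pooled-SD definition of d; the text does not state which variants were used, and the function names are illustrative only.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def t_from_d(d, n1, n2):
    """t statistic implied by d under a pooled-variance two-sample t-test."""
    return d * sqrt(n1 * n2 / (n1 + n2))


# Reported (d, N_TD, N_ASD, t) for the three difficulty conditions.
REPORTED = [
    (0.47, 76, 97, 3.06),   # single-feature
    (0.51, 63, 81, 3.03),   # set size 5 feature-conjunction
    (0.40, 90, 108, 2.76),  # set size 9/13 feature-conjunction
]

if __name__ == "__main__":
    for d, n_td, n_asd, t_reported in REPORTED:
        print(f"d={d:.2f}: implied t={t_from_d(d, n_td, n_asd):.2f}, reported t={t_reported:.2f}")
```

Under these assumptions, the implied t values fall within roughly 0.05 of the reported ones, which is about what rounding d to two decimals would allow.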

Figure 2

Baseline-corrected pupil response during visual search.

Results are shown for single-feature, set size 5 feature-conjunction and set size 9/13 feature-conjunction conditions, averaged over trials and participants in each group (N = 17 for ASD group; N = 17 for age-matched TD controls; panels (a), (b) and (c), respectively). Error bars indicate standard errors of the mean. Two-tailed t-tests at the end of the search interval showed significantly greater dilation in the ASD group in all three cases. Histograms of dilation at the end of the search interval, for all trials, are shown for ASD and TD groups (panel d, top and bottom, respectively). Normal fits are shown. The ASD group shows greater phasic pupil response, indicative of greater phasic LC-NE activity and attentional focus.

Larger phasic pupil response is associated with better search performance and is sufficient to account for search data

We examined the relationship between phasic pupil dilation and performance in the same three conditions discussed above. It is worth noting here that we found no evidence in the TD group for an effect of gender on overall visual search performance (χ2 = 0.85, P = 0.357, two-tailed), nor on phasic pupil response (t(227) = −1.1914, P = 0.235, two-tailed). For this analysis, we used a generalised linear model (binomial logistic regression) using the trial-by-trial success/failure outcome as the dependent variable and group, phasic pupil dilation at the end of search interval, their interaction and an intercept factor as predictors. This allows us to determine which factors best account for the data. For the single-feature search condition, only the intercept factor was significant (Wald Statistic = 4.83, P < 0.001); in other words, performance was not significantly influenced by pupil dilation, group, or their interaction. The same was true for the easy, set size 5 feature-conjunction trials (W = 3.71, P < 0.001). However, in the difficult, set size 9/13 feature-conjunction trials, only the pupil dilation factor was significant (W = 2.78, P = 0.005). This indicates that while attentional focus varies from trial to trial throughout a block, it seems to be beneficial only on difficult trials. This is consistent with the psychophysical conceptualization of these tasks, as the single-feature conditions are classic examples of a pre-attentive task where finding the target is fast and effortless. The set size 5 condition, while a feature-conjunction search, is nearly as easy because of the low number of distractors. In contrast, in the relatively difficult set size 9 and 13 condition, effort matters and greater attentional focus increases the likelihood of success. As well, this pattern of results means we are not just seeing arousal at having been successful: if this response were a consequence of success, it should be present wherever there is success, not just in the difficult conditions. Consistent with that, the onset of the phasic response precedes the average time at which the target was found.

This analysis indicates that the phasic pupil response factor alone – our measure of the LC-NE system's modulation of attentional focus – is adequate to account for the visual search data; the most parsimonious model is one where performance is governed by attentional focus34. Isolating this factor, this relationship can be formalised in a single function for the difficult set size 9/13 condition (Fig. 3; as discussed above, the single-feature and low, set size 5 feature-conjunction conditions were so easy that attentional focus did not affect performance). Superimposed on this function are data points from quartiles based on pupil diameter at the end of the search interval; each data point reflects mean diameter and success rate for a quartile. To examine data at the tails, we plotted data from two outlying quantiles at either end as well. The model well represents the data and quantifies the relationship between attentional focus, as reflected in phasic pupil dilation and success rate in visual search (i.e., a 2 mm span of dilation is associated with a 66% span in performance).
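To make the form of such a single-factor model concrete, here is a minimal Python sketch of a binomial logistic model with one predictor (phasic pupil dilation), fitted by Newton-Raphson. It is not the analysis code used in the study: the function names and the toy data are hypothetical, and the fitted coefficients of the actual model are not reported in the text.

```python
from math import exp


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(x, y, n_iter=25):
    """Fit P(success) = sigmoid(b0 + b1 * x) by Newton-Raphson.

    x: predictor per trial (e.g. baseline-corrected dilation in mm)
    y: outcome per trial (1 = target fixated, 0 = not)
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * xi     # gradient w.r.t. slope
            h00 += w                # entries of X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: beta += (X^T W X)^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def success_span(b0, b1, x_low, x_high):
    """Difference in predicted success rate between two dilation values."""
    return sigmoid(b0 + b1 * x_high) - sigmoid(b0 + b1 * x_low)


# Hypothetical toy data: success rates of 25%, 50% and 75% at three dilations.
TOY_X = [-1, -1, -1, -1, 0, 0, 0, 0, 1, 1, 1, 1]
TOY_Y = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1]

if __name__ == "__main__":
    b0, b1 = fit_logistic(TOY_X, TOY_Y)
    print(b0, b1, success_span(b0, b1, -1.0, 1.0))
```

A helper like `success_span` expresses the kind of summary quoted above (a span of dilation mapped to a span of success rate), given whatever coefficients a fit returns.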

Figure 3

Relating attentional focus, as measured by phasic pupil response, to visual search performance.

Results of a binomial logistic model of search data (over both groups, for set size 9/13) using only phasic pupil response as a factor. Summary data is shown to visualise the quality of the fit. These data points reflect mean pupil dilation and performance taken over sets of trials quartiled based on pupil diameter (percentile boundaries of 0.25, 0.50, 0.75; black symbols; error bars reflect standard errors of the mean; about 50 trials per data point). Two outlying quantiles are shown to visualise the tails (percentiles of 0.025 and 0.975; light gray symbols; error bars reflect standard errors; 5 trials per data point).
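The quartile summary points described in this caption could be computed along the following lines. This Python sketch assumes equal-count bins formed by ranking trials on pupil diameter, which approximates percentile boundaries of 0.25, 0.50 and 0.75; the function name and the example values are hypothetical.

```python
from statistics import mean


def quartile_summary(dilations, successes):
    """Return (mean dilation, success rate) for each quartile of trials.

    Trials are ranked by dilation and cut into four equal-count bins.
    Requires at least four trials so that no bin is empty.
    """
    pairs = sorted(zip(dilations, successes))
    n = len(pairs)
    bounds = [0, n // 4, n // 2, (3 * n) // 4, n]
    summary = []
    for lo, hi in zip(bounds, bounds[1:]):
        chunk = pairs[lo:hi]
        summary.append((mean(d for d, _ in chunk), mean(s for _, s in chunk)))
    return summary


if __name__ == "__main__":
    # Hypothetical trials: dilation in mm, success coded as 1/0.
    example_dilations = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
    example_successes = [0, 0, 0, 1, 1, 0, 1, 1]
    for mean_dilation, rate in quartile_summary(example_dilations, example_successes):
        print(f"{mean_dilation:.2f} mm -> {rate:.2f}")
```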

Discussion

Pupillometry indicates that in toddlers with ASD, the LC-NE system is more frequently in a phasic state than in age-matched, TD controls. Given the causal link that has been established between the LC-NE system and attentional focus and task performance35, this implies that toddlers with ASD have an advantageous predisposition toward greater attentional focus. Part of why toddlers with ASD outperform controls is not because they search differently, per se, but simply because they search harder. The fact that LC-NE system response differs as early as 2 years of age suggests that a difference in attentional regulation is a primary component of ASD etiology and likely one of the early starting points in the cascade of symptom development36.

To lend further support, there should be some independent reason to expect that the LC-NE system in individuals with ASD is more prone to phasic activity. Recently, there has been work implicating the LC-NE system in ASD etiology in humans37 and animal models38. Further, there has been a suggestion that in ASD, the LC may indeed be in a persistent hyperphasic state26, thereby increasing performance on tasks that benefit from focussed attention and reduced distractibility, but decreasing performance on tasks that require shifts of attention39,40. Suggestively, administration of venlafaxine, which suppresses LC activity, is an effective treatment for some of the attention-related symptoms of ASD41.

The present results connect three ‘dots’ in the literature: 1) that the LC-NE system is a modulator of attentional state and consequently task performance, 2) that the LC-NE system has been implicated in ASD etiology and 3) that the clinical profile of ASD includes intense attentional focus as well as resistance to disengagement from certain classes of stimuli42 that may be related to restricted behaviors and interests43. In a recent review of the literature of visual search and ASD11, we do find evidence for perceptual differences at play. Some studies appeal to ‘enhanced’ processing in early visual areas and/or biases in visual processing such as those proposed by Weak Central Coherence44,45 – which posits deficits in global processing that can be advantageous when task demands favor local processing (as in the embedded figures task; but see ref. 46). In light of the present results though, we suggest a framework for understanding ASD etiology and phenomenology that delays an appeal to differences in visual processing, per se, until more general factors, i.e. differences in focussed attention, have been taken into account. At the moment, applying this framework beyond the ASD advantage in visual search is speculative, but provides testable hypotheses for future research. In conclusion, our findings reveal that a significant factor in the enhanced visual skills of individuals with ASD does not stem from differences in perception, memory, or cognition – it arises instead from an inherent predisposition to focussed attention.

Method

Participants

Data discussed in this paper was collected during our previous study (see ref. 15). In that study, 17 toddlers with Autism Spectrum Disorder (ASD group mean age: 29.6 ± 4.8 months, range: 21–35 months, 3 females) and 17 age-matched, typically developing toddlers (TD group mean age: 29.5 ± 2.6 months, range: 25–34 months, 10 females) participated. All participants in the ASD group were recruited from an early intervention provider specializing in ASD services in the Greater Boston area (the two groups did not differ in maternal education or household income). Diagnosis was confirmed with the Autism Diagnostic Observation Schedule (ADOS; a standard test for assessing and diagnosing autism) by co-author Alice S. Carter, a clinical psychologist experienced in diagnosing children with ASD. 15 of the 17 ASD group participants met the stricter criteria for autism. Parents of each participant completed the Parent Interview for Autism - Clinical Version (PIA-CV) questionnaire47, to verify that none of the toddlers in the TD group were on the autism spectrum.

In terms of mental age, our ASD group scored significantly lower than the TD group on the Visual Reception, Receptive Language and Expressive Language subscales of the Mullen Scales of Early Learning48. The toddlers in the ASD group performed at the 20.9, 18.1, 16.5 months mental age level, while controls were at 35.3, 33.8, 31.1 months, respectively. As well, we calculated Mullen Early Learning Composite (ELC) Standard Scores for all participants (the ELC score is based on a combination of the Visual Reception, Receptive Language and Expressive Language sub-scores from the Mullen; ELC scores have a mean of 100 and an SD of 15). The TD group had a mean score of 116 (‘Average’/‘Above Average’) and the ASD group a mean score of 65 (‘Very Low’). (Please see ref. 15 for more details about participant characteristics.)

Human subjects protections

All experiments were performed in accordance with relevant guidelines and regulations. All child participants were volunteered by their parents or legal guardians. Informed consent for participation was obtained from the parents or legal guardians. All materials and experimental procedures were reviewed and approved by the Institutional Review Board (IRB; Federal-Wide Assurance 00004634) of the University of Massachusetts Boston.

Stimuli, apparatus and procedure

We tailored our visual search paradigm for this age group and clinical population in two ways: implicit cues meant that no verbal instructions were needed, and eye tracking meant that no verbal or manual responses were needed. For instance, a reward animation where the target rotated back and forth for 2 s at the end of the search interval highlighted the target's special status and encouraged toddlers to search for it. We used a Tobii T120 eye tracker (Tobii Technology, Stockholm, Sweden) to measure eye movement patterns during search. Participants sat on their caregiver's lap to view the displays, approximately 70 cm away from the monitor. Caregivers wore occluding spectacles during testing. The red and blue apple-shaped items subtended 5 × 5 degrees of visual angle and the red oblong items subtended 7.3 × 1.25 deg. All items appeared within a 20 deg diameter virtual circle centered on fixation (Fig. 1). The positions of targets and distractors were randomised from trial to trial. On feature-conjunction search trials, only the conjunction of colour and shape (‘red’ and ‘apple-shaped’) distinguished the target from the same-colour (red oblong) or same-shape (blue apple) distractors. On single-feature search trials, the target was distinguished by virtue of its shape or colour from a homogeneous set of red oblong or blue apple (depending on trial) distractors, respectively.
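For readers who want to translate the stimulus sizes above into on-screen dimensions, the standard visual-angle relation, size = 2 · distance · tan(angle / 2), can be sketched in Python as follows. The 70 cm viewing distance is the approximate value given in the text; the function names are illustrative.

```python
from math import atan, degrees, radians, tan


def size_from_angle(angle_deg, distance_cm):
    """Physical extent (cm) that subtends angle_deg at the given viewing distance."""
    return 2.0 * distance_cm * tan(radians(angle_deg) / 2.0)


def angle_from_size(size_cm, distance_cm):
    """Visual angle (deg) subtended by an extent of size_cm at the given distance."""
    return degrees(2.0 * atan(size_cm / (2.0 * distance_cm)))


if __name__ == "__main__":
    viewing_distance_cm = 70.0  # approximate distance stated in the text
    for label, angle in [("apple item", 5.0), ("display circle", 20.0)]:
        print(f"{label}: {size_from_angle(angle, viewing_distance_cm):.1f} cm")
```

Under this relation, a 5 deg item at 70 cm is a little over 6 cm across, and the 20 deg display circle is just under 25 cm in diameter.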

Each participant ran an average of two blocks. Each block consisted of 17 trials and lasted approximately 4 minutes. The first four trials were familiarization trials where the three experimental stimuli (the red apple target, blue apple colour distractor and the red, oblong shape distractor) were presented for 3 s, in a different spatial arrangement on each trial. The next two trials were set size 5 single-feature trials. The subsequent 11 test trials consisted of two single-feature trials (set size 9) and nine feature-conjunction trials (set sizes 5, 9 and 13), mixed within the block. Except for familiarization, all trials began with the red apple target flying in from the upper portion of the screen, stopping at the center of the screen for 1 s and then fading out to reveal a fixation cross that was presented for approximately 1.5 s. It was from this quiet period with only the fixation point present, prior to the presentation of the search display, that baseline pupil diameters were measured (see Pupillometry below). The search display itself was presented for 4 s (the ‘search interval’). A ‘successful’ search trial meant that the participant fixated the target within the search interval. After this, the trial ended with an animation of the target spinning back and forth for approximately 2 s. The main behavioral results showed that the ASD group was successful 80%, 72% and 62% of the time for set sizes 5, 9 and 13 in feature-conjunction search, respectively, while the TD group achieved 70%, 49% and 26% performance. For single-feature search, performance was 88% and 80%, for the ASD and TD groups, respectively. For further details, see ref. 15.

Pupillometry

The Tobii T120 eye tracker collected pupil diameter at 60 Hz from each eye. The Tobii T series of eye trackers adjusts pupil measurements according to the distance between the eye and the sensor and the position of the eyes, to estimate the actual, external physical pupil size. Subsequent pupil data analyses were done using MATLAB scripts based on those of Sylvain Sirois21. Missing values from an eye were replaced with the corresponding value from the other eye; failing that, a linear interpolation was used. If a 12-second trace accumulated four or more seconds of missing data it was rejected; 36% and 42% of trials were rejected in this way for the TD and ASD groups, respectively. The values from the two eyes were then averaged and smoothed. Except for the analysis of tonic pupil dilation, traces were then baseline-corrected. For this, for each trace, an average of pupil diameter was taken over a 500 ms window before stimulus onset, during the 1.5 s quiet period between trials. This is a relatively short period, to keep the pace of the experiment brisk, but this means baseline levels will be higher than if the pupil were given more time to approach a resting state; previous work has found no interaction between the rate of this drop and ASD/TD group31. The average value in this window was defined as a zero point and phasic pupil dilations were then measured as deviations from this baseline level. This factors out individual differences in tonic pupil diameter, while also correcting for differences due to fluctuations in ambient light levels that may have accompanied a particular session. These, though, were largely consistent across sessions and should not affect phasic pupil dilation49. As well, the interaction between the luminance of display elements and gaze behavior can affect pupil dilation. However, as stated earlier, groups did not have significant differences in the number of fixations made to the search items, the total time spent dwelling on these items, nor the overall time spent looking at the screen15. Pupil responses over the search interval for individual trials were then averaged to produce curves for conditions and groups.
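The preprocessing steps described in this section can be sketched in Python as follows. This is an illustration only, not the MATLAB scripts used in the study. It makes several assumptions that the text does not specify: missing samples are coded as None, a sample counts toward the rejection threshold only when it is missing in both eyes, gaps at the edges of a trace are filled with the nearest valid value, and the smoothing step is omitted. All names are hypothetical.

```python
from statistics import mean

SAMPLE_RATE_HZ = 60        # sampling rate stated in the text
MAX_MISSING_S = 4.0        # reject traces with this much missing data or more
BASELINE_WINDOW_S = 0.5    # 500 ms window before stimulus onset


def merge_eyes(left, right):
    """Average the two eyes; if one eye is missing, use the other (may stay None)."""
    merged = []
    for l, r in zip(left, right):
        if l is not None and r is not None:
            merged.append((l + r) / 2.0)
        elif l is not None:
            merged.append(l)
        else:
            merged.append(r)
    return merged


def interpolate(trace):
    """Linearly interpolate interior gaps; hold the nearest valid value at the edges."""
    valid = [i for i, v in enumerate(trace) if v is not None]
    if not valid:
        raise ValueError("trace has no valid samples")
    out = list(trace)
    for i, v in enumerate(trace):
        if v is not None:
            continue
        prev = max((j for j in valid if j < i), default=None)
        nxt = min((j for j in valid if j > i), default=None)
        if prev is None:
            out[i] = trace[nxt]
        elif nxt is None:
            out[i] = trace[prev]
        else:
            w = (i - prev) / (nxt - prev)
            out[i] = trace[prev] + w * (trace[nxt] - trace[prev])
    return out


def preprocess(left, right, onset_index):
    """Return a baseline-corrected trace, or None if the trace is rejected."""
    n_base = int(BASELINE_WINDOW_S * SAMPLE_RATE_HZ)
    if onset_index < n_base:
        raise ValueError("not enough samples before onset for the baseline window")
    merged = merge_eyes(left, right)
    n_missing = sum(v is None for v in merged)
    if n_missing >= MAX_MISSING_S * SAMPLE_RATE_HZ:
        return None
    filled = interpolate(merged)
    baseline = mean(filled[onset_index - n_base:onset_index])
    return [v - baseline for v in filled]
```

Calling `preprocess(left, right, onset_index)` on one trial's left-eye and right-eye samples yields dilation relative to the pre-stimulus baseline, so a value of 0.2 means the pupil is 0.2 mm larger than during the baseline window.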