Abstract
Decision-making is an essential cognitive process by which we interact with the external world. However, attempts to understand the neural mechanisms of decision-making are limited by the current available animal models and the technologies that can be applied to them. Here, we build on the renewed interest in using tree shrews (Tupaia belangeri) in vision research and provide strong support for them as a model for studying visual perceptual decision-making. Tree shrews learned very quickly to perform a two-alternative forced choice contrast discrimination task, and they exhibited differences in response time distributions depending on the reward and punishment structure of the task. Specifically, they made occasional fast guesses when incorrect responses are punished by a constant increase in the interval between trials. This behavior was suppressed when faster incorrect responses were discouraged by longer intertrial intervals. By fitting the behavioral data with two variants of racing diffusion decision models, we found that the between-trial delay affected decision-making by modulating the drift rate of a time accumulator. Our results thus provide support for the existence of an internal process that is independent of the evidence accumulation in decision-making and lay a foundation for future mechanistic studies of perceptual decision-making using tree shrews.
Significance Statement
Despite decades of work in the field of decision-making, we still have no clear brain-wide model of how perceptual decisions are formed and executed. A major reason for this lack of understanding is the limited animal models in decision-making studies. Here, we have successfully established a rigorous perceptual decision-making paradigm in tree shrews, and evaluated their choice and response-time behaviors with both summary statistics and trial-level computational modeling. Our results suggest that an endogenously-driven decision process, in addition to standard stimulus-dependent evidence accumulation, is necessary for interpreting the observed behavior. Our study thus underscores the importance of characterizing additional factors that affect decisions and encourages future investigations using tree shrews to reveal the neural mechanisms underlying these cognitive processes.
Introduction
Decision-making is a vital cognitive process, playing an important role in many brain functions such as categorization, learning, memory, and reasoning. Among different forms of decision-making, perceptual decision-making, where decisions are based on sensory stimuli, is a simple yet informative task that is particularly amenable to experimental studies. Visual stimuli are often used because the visual system is arguably the best studied sensory system, thus advantageous for understanding perceptual decision-making from sensation to action.
Considering decision-making is a dynamic process with complex combinations of distinct underlying variables, researchers have frequently applied sequential sampling models (SSMs) to interpret and decompose decision behaviors. These models assume that the evidence (i.e., a variable depending on the sensory stimulus strength) is accumulated through time, and a corresponding choice is made when the accumulated evidence passes a threshold. By defining these stochastic accumulation processes, SSMs can simulate decisions and response times (RTs) with the stimulus as the input. The discovery of “ramping neurons” during decisions in many brain regions provides neural evidence for these models (Horwitz and Newsome, 1999; Roitman and Shadlen, 2002; Ding and Gold, 2010; Mante et al., 2013). Despite the models’ effectiveness in a wide range of applications, variants of the SSM make different predictions regarding what decision variables (bias, threshold, time perception, etc.) are involved and how they interact with each other (Ratcliff, 1978; Usher and McClelland, 2001; Brown and Heathcote, 2005; Cisek et al., 2009). More importantly, the neural mechanisms of these variables and their interactions remain largely unknown, which typically require studies in animal models.
Monkeys and rodents (mostly rats and mice) are commonly used in decision-making studies, with respective advantages and drawbacks. Monkeys are closely related to humans, but they are expensive and limited in availability, thus difficult to study or control individual differences. Furthermore, most modern “circuit-busting” opto-genetic and chemo-genetic techniques are not yet routinely used in primates. On the other hand, recent use of rodents, especially mice, has significantly advanced our understanding of decision-making (Odoemene et al., 2018; International Brain Laboratory et al., 2021; Ashwood et al., 2022). However, mice and rats are nocturnal animals with poor eyesight, making them less than ideal for visual tasks. In addition, rodents often learn visual tasks slowly (Aoki et al., 2017; Urai et al., 2021), costing both time and effort to obtain high quality data. Here, we use a different animal model, tree shrews (Tupaia belangeri; Fig. 1A) for visual decision studies. Under the order of Scandentia, tree shrews are evolutionarily closer to primates than rodents are (Yao, 2017). They are diurnal, have an excellent acuity, and display visual system complexity similar to primates (Petry and Bickford, 2019). Earlier studies have shown that they could be reliably trained to perform visual (color, orientation, spatial frequency, temporal frequency, etc.) discrimination tasks (Casagrande and Diamond, 1974; Petry et al., 1984; Petry and Kelly, 1991; Callahan and Petry, 2000; Mustafar et al., 2018). In addition, tree shrews are of lower cost, smaller, and have a faster reproduction cycle than monkeys, making them more accessible. Finally, modern viral, genetic, and imaging techniques are being applied in tree shrews with a better success than in primates (Lee et al., 2016; Li et al., 2017; Sedigh-Sarvestani et al., 2021; Savier et al., 2021). Taken together, tree shrews have the potential to advance the understanding of neural mechanisms underlying perceptual decision-making. In this study, we seek to establish a rigorous perceptual decision-making paradigm for tree shrews, and to characterize the decision-making features, including both response accuracy and response time, in this animal model quantitatively with both summary statistics and trial-level computational modeling.
Materials and Methods
Contrast discrimination task
We trained in total of nine (male = 7, female = 2) freely moving tree shrews to perform a two-alternative forced choice (2AFC) contrast discrimination task (Fig. 1C). At the beginning of each trial, a visual stimulus of two orthogonal overlapping α-transparent gabors appeared at the screen center to indicate that the tree shrew could lick the center port to initiate the trial. After initiation, the center stimulus disappeared, and two side gabor patches were presented immediately on the left and right of the screen. Tree shrews needed to choose the side with a higher contrast by licking the corresponding lick port. This self-initiation design helped to ensure that the animals were focused from the beginning of each trial and allowed us to record accurate RTs, which were calculated as the duration between the stimulus (two side gabors) appearance and the side-port lick detection. Once a choice lick was detected, the stimulus would disappear from the screen. We adopted a free-response structure that if no choice was detected, the stimulus would be on for an infinite amount of time.
Intertrial intervals (ITIs) were randomly drawn from a truncated normal distribution with a mean of 0.6, a SD of 1, a lower bound of 0.5, and an upper bound of 0.7 (unit: seconds). For correct responses, liquid reward (50% grape juice) was given right after the animals reported their choices. The reward volume was determined by the duration of the valve opening, which was randomly drawn from a truncated normal distribution with a mean of 0.1, a SD of 0.06, a lower bound of 0.2, and an upper bound of 0.4 (unit: seconds). The speed of liquid flow was 150 μl/s. The average reward volume in one correct trial was 33 μl (0.22 s). The random ITI and random reward duration helped the animals to stay engaged in the task.
For incorrect responses, two protocols were used to generate a delay as a punishment. (1) A fixed delay of 4 s was used in the first group of tree shrews for all incorrect responses. If the animal licked the center port during the delay (i.e., blank screen licks; detected in 0.8 s periods), a penalty of 0.8 s was then added to the delay, with a maximum of 8 s for the total delay. (2) An exponential decay function (Equation 1) was applied in the second group of animals to generate a between-trial delay based on the trial-level RT:
Animal training and data collection
Tree shrews were first acclimated to the behavior box for 1–2 d. For most animals (seven out of nine), water restriction started at this stage of training (stage 1). For the other two animals, water restriction started a couple of days before acclimation. Two approaches of water restriction were used: (1) we gradually reduced their water intake from baseline (20–40 ml/d) to 5–10 ml/d by limiting the availability of drinking water; (2) we used citric acid (CA; Urai et al., 2021) water in their home cage to reduce water intake and gradually increased its concentration from 2% to 4%. The progress of water restriction depended on the animals’ weight loss, water-intake baseline, and tolerance, to make sure that they were motivated to stay focused on the task for at least 25 min/d, and at the same time, not experiencing any health issue (weight ≥90% × baseline). Depending on the animals’ acclimation and learning speed, the water restriction progress (2–7 d) could extend to stage 2 and even 3 before reaching a stable restriction level.
During stage 1, a single gabor stimulus would be shown right above the center lick port. After the gabor appeared, the animals could lick the center port at any time to trigger a liquid reward (grape juice diluted with water in a 1:1 ratio). Each tree shrew was left in the behavior box to learn to use the center port for no more than 20 min every day for acclimation, but this stage usually took only 1 d (20–40 trials per day). Having learnt to get liquid reward from the center port, the animals progressed to the next stage. At stage 2, the contrast discrimination task was set up with contrast pairs of 1.0 (full contrast) versus 0.0 (zero contrast), i.e., a single side stimulus was shown. The goal of stage 2 was to train the animals to use the left and right lick ports. Liquid reward from the center port was gradually reduced to zero within ∼50 trials. Animals usually perform 100–300 trials per day at this stage. Once they learned and had a stable correct rate of >75%, they progressed to stage 3. Note that most animals learned very fast and graduated both stages 1 and 2 within 2 d.
At stage 3, we first gave the animals an easy condition by using contrast pairs of 1.0 versus 0.1, and gradually mixed in other pairs of smaller contrast differences, finally achieving the stimulus set we use in the formal data collection. During this stage of training, we also adjusted the ratio of easy (e.g., comparing the highest and lowest contrast) and difficult (same or similar contrast) trials for each animal. By including sufficient easy trials and limiting the number of equal-contrast trials, we were able to keep the animals motivated to keep doing the task. For equal contrast trials, the correct answer was randomly assigned to left or right, so that the animals still had 50% chance to get a reward in these trials. At this stage, the animals performed 500–600 trials per day. Some animals could finish it within 30 min, while some of the others needed as long as 1 h, especially when they produced large numbers of incorrect choices (giving rise to more penalty time) or they started to lose patience and focus (giving rise to more idling time). To control the frustration level, we would stop the training when the duration was over 1 h. At this time, some animals (50%) also developed biased behavior by making most choices to the same side. We discouraged this behavior by automatically adjusting the probability of left/right trials depending on their real-time performance. For example, we calculated the proportion of choosing rightward in the previous 10 trials, denoted as Pr. The probability of the next trial being rightward was 1 – Pr. This real-time bias correction quickly discouraged the biased behavior in the tree shrews.
After the animals achieved a stable (three to five consecutive days) overall accuracy ≥60% (at this time, the accuracy is expected to be lower because of the existence of equal contrast trials and other difficult trials), we collected data for consecutive days (500–600 trials per day) to reach at least 100 repeats for each condition of contrast discrimination. The data were first culled by applying a three SD outlier removal on the Box–Cox transformed response time distribution in preprocessing. The remaining trials were used in further analysis.
All animal procedures were performed in accordance with the University of Virginia animal care committee’s regulations.
Stimulus and apparatus
The experiment program was written in Python and the stimuli were generated and presented with the State Machine Interface Library for Experiments (SMILE; https://github.com/compmem/smile). The Gabor patch size was 28∘, and the spatial frequency was 0.2 cpd. The stimulus screen had a 1280 × 1024 resolution and 60-Hz refresh rate, and was γ-corrected. It was set at a distance of 15 cm from the animal. There were six levels of stimulus contrasts ranging from 0.08 to 0.99, which were evenly-spaced. All combinations of left and right contrasts are presented in a randomized order.
The lick-detector circuit (adapted from Marbach and Zador, 2017), and reward-valve control circuit (adapted from https://bc-robotics.com/tutorials/controlling-a-solenoid-valve-with-arduino/) were controlled with an NI USB-6001 multifunction I/O device (https://www.ni.com/en-us/support/model.usb-6001.html). The Plexiglas behavior box was L: 40 cm × W: 22 cm × H: 20 cm with a transparent window on the front side to allow the animals to watch the screen.
Data analysis and models
To test the relationship between RT and contrast difference, we fitted a mixed effect linear regression model with RT as the dependent variable, the absolute contrast difference between left and right stimuli as the independent variable, and individual animal as the group variable, using the statsmodels library in Python.
We fitted the behavioral data with two sequential sampling decision-making models, the timed racing diffusion model (TRDM) and the racing diffusion model (RDM), and compared their performance using a Bayesian approach. TRDM contains three independent accumulation processes, namely two evidence accumulators and one time accumulator (or “timer”), whereas RDM only has the two evidence accumulators. The probability density function (PDF) [f(t)] and cumulative distribution function [F(t)] for each evidence or time accumulation process are defined by the inverse Gaussian (Wald) distribution in Equation 3:
The mean drift rate (ρ) of each evidence accumulator was calculated using the following equation (Eq. 4), taking into consideration both the stimulus difference and their total strength:
The accumulators race against each other. If one of the evidence accumulators reaches the threshold first, a corresponding choice is made. If the time accumulator reaches the threshold first, one of the options will be chosen randomly with a partial dependence on which evidence is greater at that time point. This is done through a process controlled by a parameter γ, ranging from 0 to 1, with 1 being fully dependent on the evidence accumulated up until that point, and 0 being completely random regardless of the accumulated evidence. Other parameters of the model include ρt, ω, and t0, as described in Table 1.
Extended Data Table 1-1
Statistical table. Download Table 1-1, DOC file.
Extended Data Table 1-2
TRDM best fitting parameters of each animal. Download Table 1-2, DOC file.
Extended Data Table 1-3
RDM best fitting parameters of each animal. Download Table 1-3, DOC file.
To apply Bayesian inference, we first defined the “priors,” the belief of the true parameter values before data observation, by assigning a probability distribution for each of the parameters based on previous experience (Table 1; Kirkpatrick et al., 2021). We then used the observed data to update the prior distributions, to achieve a more constrained posterior distribution of what parameters could have generated the observed data for each model. Posterior samples were generated with the differential evolution Markov chain Monte Carlo (DE-MCMC; Ter Braak, 2006; Turner and Sederberg, 2012; Turner et al., 2013) algorithm, which was shown to be computationally efficient. This was implemented by the RunDEMC library (https://github.com/compmem/RunDEMC). We set 10k (k is the number of parameters) parallel chains for 200 iterations in the burn-in stage and 500 iterations to sample the posterior.
Specifically, we apply a standard Metropolis–Hastings algorithm to accept or reject proposed samples from the posterior. Here, a new parameter proposal is evaluated by comparing its posterior probability with that of the current proposal, with the probability of accepting a new proposal:
To calculate the likelihood
Finally, to compare the performance of the two models, we first calculated the Bayesian information criterion (BIC) values (Eq. 7) of each model fitting result:
Code accessibility
Python code for preprocessing and running TRDM/RDM models are included in the Extended Data 1.
Extended Data 1
Code for analysis and modeling. fit_rdm.py fit_trdm.py single_animal_preprocessing.ipynb waldrace.py Download Extended Data 1, ZIP file.
Results
Tree shrews quickly learned to perform a contrast discrimination 2AFC task
We trained a total of nine (male = 7, female = 2) tree shrews to perform a 2AFC contrast discrimination task (Fig. 1). The 2AFC design was chosen over other classic paradigms such as “Go/no-Go” tasks because it eliminates the asymmetry between responses for different options. Also, we designed the trials to be self-initiated and self-paced by the animals, to obtain precise response time (RT) data for comprehensive behavioral analysis. During training, freely moving tree shrews were first acclimated in the behavioral box with a single gabor stimulus appearing at the center or either side of the screen (Fig. 1B). After the animals learned the association between the stimulus and liquid reward, often within 1–2 d, two gabors of different contrasts were introduced with the higher contrast one indicating the location of the reward (Fig. 1C). All the tree shrews were able to learn the task and reach an accuracy >75% for the easiest condition within one week (Fig. 1D). In fact, most of them reached 75% accuracy within 2 d. It is worth noting that, once the animals reached a good performance, the overall difficulty was increased progressively. In other words, the “easiest” condition often became more difficult in successive days. However, the animals’ performance was stably above 75%, indicating that they had learned the rule of the task, instead of the specific stimuli, within a very short period. These observations thus highlight the impressive learning capability of tree shrews and indicate that they can be a promising animal model in cognitive neuroscience research.
Tree shrews showed different behaviors under two training schemes
In the first group of animals (n = 5; male = 4, female = 1), a fixed trial delay of 4 s was used to punish incorrect responses (Fig. 2A). All animals were able to learn the task. An increase in difficulty (i.e., a decrease of contrast difference between the two stimuli) induced an expected drop of response accuracy (Fig. 2B). However, task difficulty did not have a significant effect on the response time (RT) in correct trials (mixed effect linear regression, β = 0.008a, p = 0.125; Extended Data Table 1-1), whereas the RT in incorrect trials increased with task difficulty (Fig. 2C, mixed effect linear regression, β = –0.075b, p < 0.001). This result is different from previously reported RT trend in humans, monkeys, and mice (Roitman and Shadlen, 2002; Palmer et al., 2005; Philiastides et al., 2011; Dmochowski and Norcia, 2015; Jun et al., 2021; Orsolic et al., 2021), where increasing task difficulty usually resulted in an increase in RT in correct trials. We examined the RT distribution of individual animals and saw a bimodal-like shape in most animals (n = 4 out of 5) in this group (Fig. 2D; Extended Data Fig. 2-1), instead of the more common log-normal distribution (Ratcliff, 1978; Smith and Ratcliff, 2004). Furthermore, the first small peak of the RT distribution contained a similar proportion of correct and incorrect trials, while the second peak had many more correct than incorrect trials. This bimodal distribution suggested two possible modes in the behavioral responses, a “fast-guessing” mode of random performance and a slower mode where an animal was more “engaged” in the task.
Extended Data Figure 2-1
Response time distributions of the individual animals from the fixed-delay group. Download Figure 2-1, TIF file.
To discourage the animals from “fast guessing,” we employed an exponential decay trial delay for incorrect responses in the second group (n = 4; male = 3, female = 1; Fig. 2E). The exponential decay delay would punish fast incorrect responses more than slow incorrect ones, at a more aggressive level than the fixed trial delay procedure (Fig. 2A,E). All animals in this group were again able to learn the task quickly (Fig. 2F,G). Notably, the overall RT was substantially slower compared with the fixed-delay group, indicating the effectiveness of the new trial delay paradigm. Furthermore, the RTs in correct trials showed a slightly increasing trend with task difficulty (mixed effect linear regression, β = –0.021c, p = 0.001), while the effect on the incorrect RT became less prominent than for the fixed-delay group (mixed effect linear regression, β = –0.046d, p = 0.014). When examining the RT distribution of individual animals, we saw one-peak log-normal distributions, similar to what was reported in other species, and a clear above-chance accuracy across the entire range (Fig. 2H; Extended Data Fig. 2-2). These behavioral data thus demonstrate that the tree shrews responded to the two trial delay schemes with different behaviors.
Extended Data Figure 2-2
Response time distributions of the individual animals from the exponential-delay group. Download Figure 2-2, TIF file.
Non-evidence-accumulation mechanism is crucial to interpreting tree shrew behaviors
The above behavioral data suggest the involvement of a process in addition to evidence collection during decision-making. One possibility is a time accumulation process where the animals had an internal time threshold on the task, and they would rush into a more or less random choice if the time threshold was reached before accumulating enough evidence to guide the choice. This time limit would be different under the two trial delay paradigms: shorter under fixed delay, thus leading to many fast guesses. To test the plausibility of this explanation, we turned to cognitive models of decision-making.
We fitted two models, racing diffusion model (RDM) and timed racing diffusion model (TRDM; Hawkins and Heathcote, 2021), to the data obtained from individual animals. In a 2AFC task, the RDM describes 2 independent evidence accumulators racing against each other. When one of the accumulators first reaches the threshold, a corresponding choice is made (Fig. 3A). The TRDM has one additional accumulator that tracks time (Fig. 3B). If the time accumulator reaches the threshold before the evidence accumulators, a decision is made based on the current accumulated evidence with a certain probability γ. We fixed all the accumulation thresholds to be 1. A fast time accumulator was thus effectively equal to a short time limit as described above. The two models allowed us to test whether an additional timing mechanism can better explain tree shrew decision behaviors.
We used a Bayesian approach for model fitting (Ter Braak, 2006; Turner and Sederberg, 2012; Turner et al., 2013), and then simulated choice and RT data with the best fitting parameters to visualize the goodness of fit. We found that the RDM captured the RT distribution of the exponential-delay group well, but failed to fit the fixed-delay group (Fig. 3C,D, top panels). On the other hand, the TRDM fitted well to both groups (Fig. 3C,D, bottom panels). To quantify their performance difference, we estimated the Bayes factor (BF) of the two models for each animal (Fig. 3E). For animals in the fixed-delay group, the values of ln(BF) were extremely high, ranging from 45 to 1062, providing overwhelming support for the TRDM. These values were much higher than 5, which is a conventional threshold for “very strong” evidence for one model over the other in Bayesian modeling (Lodewyckx et al., 2011). For the exponential-delay group, the evidence favored the RDM for three out of the four tree shrews, although the magnitude of evidence was not nearly as strong [ln(BF) ranging from −6 to 1]. It should be noted that Bayes factor in our estimation punishes complex models that have more parameters. As a result, despite the similar performance of the two models in fitting the exponential-delay group data, the RDM had the advantage of simplicity, thus leading to the winning BF.
We then simulated choice and RT data with the best fitting parameters (Extended Data Tables 1-2 and 1-3) for each animal using the winning model, to visually check the goodness of fit. Figure 4 illustrates that the TRDM fit the data of the fixed-delay group well (Fig. 4A), and the RDM was able to reproduce the behavior of the exponential-delay group (Fig. 4D), for both the psychometric curves and the RT-contrast relationship. Consistent with the result in Figure 3, the TRDM was also able to fit the psychometric curves and the RT-contrast relationship for the exponential-delay group (Fig. 4C), similarly to the RDM, while the RDM failed to capture the RT-contrast relationship for the fixed-delay group (Fig. 4B). The fact that the behavior of both groups could be explained by the TRDM supported the involvement of the non-evidence-accumulation process during tree shrew visual decision-making, and this process can be manipulated by applying different trial delay rules.
The models allowed us to track down the generating mechanism of the simulated data, i.e., whether each decision was initiated by an evidence accumulator or the timer crossing the threshold. We separated the TRDM-simulated data for each animal according to the generating mechanism, and found the timer and evidence accumulators contributed to two separate RT peaks. Extended Data Figure 4-1 shows the comparison between simulated data and observed data for an example tree shrew from the fixed-delay group (Fig. 2D). The results indicated that the fast RTs were largely generated by the timer (Extended Data Fig. 4-1A). In addition, when examining the simulated RTs for correct choices generated by evidence accumulators only, they increased with the task difficulty (Extended Data Fig. 4-1D), similar to what has been previously reported in humans, monkeys, and mice (Roitman and Shadlen, 2002; Palmer et al., 2005; Philiastides et al., 2011; Dmochowski and Norcia, 2015; Jun et al., 2021; Orsolic et al., 2021). These model results suggest that the tree shrews learned the visual decision-making task, and they had similar behaviors as other animals when “engaged” in the task. Moreover, the timer-driven random choices explained the plateau of a nonperfect accuracy, even in the easiest conditions (Extended Data Fig. 4-1C).
Extended Data Figure 4-1
Decomposition of an example animal’s simulated RT distribution by the TRDM. A, The simulated RTs for one example animal (TS085) from the first group are divided into four groups: evidence accumulator generated RT for correct (blue) and incorrect (pink) responses, and time accumulator generated RT for correct (green) and incorrect (yellow) choices. Compared with the observed data (B), the plots show that the TRDM interprets the first peak (fast RT) in the RT distribution as generated by the time accumulator. C, Simulated psychometric curves generated by the evidence accumulators and the time accumulator. D, Evidence accumulator simulated RT as a function of contrast difference. Download Figure 4-1, TIF file.
Next, for each tree shrew, we quantified the percentage of timer-induced choices from the TRDM-simulated data (Fig. 4E). As expected from the above analysis, all of the animals from the fixed-delay group showed many timer-induced choices (ranging from 30% to 66%), while the value was near zero for every animal in the exponential-delay group. To understand what decision variables were altered by the change of delay rule, we examined the posterior distribution of the parameters in the TRDM. The posteriors of the timer-related parameters showed a general trend of higher mean drift rate for the time accumulator (ρt) and higher time drift rate variability (ηt) in the fixed-delay group than in the exponential-delay group (Fig. 4F,G). The two parameters work together to determine the accumulation speed of time during decision-making, with the fixed-delay group having faster timers. The model results therefore proposed a possible mechanism that the exponential delay worked by slowing down the time accumulation process in the tree shrews, which resulted in far fewer “timer-induced” fast responses with compromised accuracy, and more correct responses guided by the evidence accumulation process.
Discussion
In this study, we aimed to and succeeded in establishing a response-time paradigm of perceptual decision-making for tree shrews. The behavioral results showed that tree shrews are able to perform a contrast-discrimination perceptual decision task and generate informative choice and response time data. Model-based analyses suggest that, other than the choice-related evidence accumulation process, additional mechanisms, presumably mechanisms that keep track of time, are involved in the decision-making process depending on the specific design of trial delay because of incorrect responses. This new animal model will facilitate future decision-making studies with fast learning, reliable behaviors, increased availability, and more modern techniques.
We carefully considered two points when designing the behavioral paradigm. First, we adopted a 2AFC framework, where two alternative options match symmetrically with two response targets. In other widely used tasks, there often exists asymmetry in either responses or stimulus categories, which can be problematic when interpreting different behaviors. For example, Go/no-Go tasks involve an action (“go”) and a suppression of action (“no-go”) as two responses, which are likely driven by different neural circuits. Such tasks have thus become more suitable for studying impulsion and inhibition (Eagle et al., 2008; Dong et al., 2010; na Ding et al., 2014). On the other hand, yes/no tasks offer two asymmetric stimulus categories as options, which are likely represented differently at the neural level (Wentura, 2000; Donner et al., 2009). In comparison, a multiple alternative forced choice framework is better in perceptual decision-making studies. Second, we designed the task to be self-initiated and self-paced by the animals. Self-initiation ensures that the animals are focused during the stimulus presentation, and self-pacing encourages them to respond without delay once they reach a decision. Compared with the commonly-used design where the stimuli show up automatically and animals can respond at any time point within a fixed response window, our design allowed us to collect precise response times in addition to choice data. Response times are particularly useful because they are continuous (whereas choice data are discrete) and are more informative when characterizing decision behaviors. For example, fast correct responses have potentially different mechanisms from slow correct responses, which would be impossible to study without the RT information.
We used models under the SSM family to fit tree shrew decision behaviors on the trial level. SSMs predict the choice and RT distribution with a mathematically defined dynamic decision-making process controlled by cognitively meaningful parameters and offer testable hypotheses about the underlying mechanisms. Signal detection models have also been used to explain perceptual decision-making behaviors (Newsome et al., 1989), but they only predict the choices made by subjects in a decision process, ignoring the information contained in the response time. Furthermore, the choice data are usually averaged over trials, further reducing the information present in the raw data. By comparison, SSMs have the advantage of maximizing the efficiency of the animal experiments and data analysis (Ratcliff et al., 2003).
Despite the RDM showing a slightly better Bayes factor than the TRDM in the exponential-delay group because of simplicity, the TRDM had the same ability to reproduce the observed choice and RT pattern. Together with its overwhelmingly better performance in the fixed-delay group, the TRDM was overall the better model for this dataset. By examining the source of the simulated data (Extended Data Fig. 4-1), we found that timer-induced random choices largely contribute to the plateau of a nonperfect accuracy in the easiest conditions. Canonically, this nonperfect accuracy is modeled by “lapse rate” under the Signal Detection framework (Wichmann and Hill, 2001; International Brain Laboratory et al., 2021; Wang et al., 2020; Prins, 2012). The lapses are usually assumed to happen via a Bernoulli process, i.e., the animals simply make guesses at some random rate independently from trial to trial, while providing no detailed process of choice generation. In comparison, the TRDM utilizes a time accumulator that is highly similar to evidence accumulation to generate random choices. It offers a more integrative solution to the interaction between evidence-based and stimulus independent mechanisms. This can be more plausible on the neuronal level than two separate processes that involve very different calculations. In addition, the TRDM provides the extra ability to explain why we rarely see extremely long RTs in the difficult conditions, especially in the equal-evidence conditions. The time accumulator can limit the RT so that the decision-makers do not waste too much time on a single decision when the evidence is obscure. Thus, we think that the TRDM has more explanatory power than models that include a “lapse rate.” Furthermore, a recent study showed that mice alternate between states, such as lapse or biased decisions, during a perceptual decision-making task, and they have a higher probability to stay in the same state for consecutive trials (Ashwood et al., 2022). Therefore, Bernoulli “lapses” would be an oversimplified explanation of how nonperfect choices happen. In future studies, the temporal sequence of choices and RTs should also be analyzed to further investigate the mechanism of decision state switching.
Finally, it is intriguing that the tree shrews in this study showed a fair amount of premature choices under fixed trial-delay although this strategy was suboptimal, in that it did not maximize the reward rate. The TRDM suggested that the animals actively applied a fast timer (or a short time limit) on the task without being trained to perform the task speedily. Interestingly, this tendency of rushing into choices was discouraged by the exponential trial-delay design that specifically punished fast incorrect responses more. The baseline suboptimal behavior could partly be because of (1) the characteristics of this animal model and/or (2) the stimulus design. The tree shrews showed much faster responses compared with humans on similar tasks (Kirkpatrick et al., 2021). They were very nimble and showed swift movements and reactions in various environments (behavior rig, home cage, nature, etc.). Given their motor capabilities, fast responses could be a survival strategy to guarantee the total amount of reward via high sampling frequency with slightly compromised accuracy, and could be broadly used in most scenarios to facilitate “exploration” behaviors, unless specifically discouraged. Additionally, in previous perceptual decision-making studies, stochastic stimuli with motion such as random dot kinematogram were usually used (Roitman and Shadlen, 2002; Ditterich, 2006; Resulaj et al., 2009). These stimuli require temporal integration to acquire evidence for choices. In our study, we used the static feature (contrast) as evidence. Although studies showed support for evidence accumulation even using the static stimuli in other species (Kirkpatrick et al., 2021), temporal integration might not be needed as strongly to generate a choice under this situation. This could result in short response times, leading the animals to a faster RT regime (more prone to make premature choices) and masking the effect of task difficulty on the RT (Fig. 2G, minor effect, although significant). Nevertheless, the tree shrew data emphasized the natural existence of f evidence-independent mechanisms in decision-making and offered an opportunity to examine their effects. These behavioral patterns also suggest that we should consider the involvement of processes in addition to the evidence accumulation process in other animal/human models when interpreting both behavioral and neural data from decision-making tasks. Here, we included an independent time accumulator to implement this additional process in our decision-making models (Hawkins and Heathcote, 2021). However, it should be noted that mechanisms other than the time accumulator could also generate the fast-guessing responses and our results do not rule out these possible mechanisms. In other words, the time accumulator was not necessarily the true underlying mechanism, but rather a piece of evidence for the involvement of multiple generative processes for decision instead of one single process. Other studies have indeed applied alternative approaches to account for decisions not entirely based on evidence accumulation, such as combining the decision process with a probabilistic fast-guess mode that generates a normally distributed guessing time (Ratcliff and Kang, 2021). Future studies that incorporate neural data will be needed to reveal exactly how response times in perceptual decision tasks are affected by information other than the sensory strength.
Acknowledgments
Acknowledgments: We thank Masashi Kawasaki for the help with making the behavior box, Seiji Tanabe for advice on building electronic circuits of the lick port system and animal training, Michele Basso and Vaibhav Thakur for the helpful discussion and suggestions, Alev Erisir for sharing and coordinating tree shrew usage, Brandon Jacques for technical assistance and feature addition of the SMILE library, and Ryan Kirkpatrick for advice on modeling procedure.
Footnotes
The authors declare no competing financial interests.
This work was supported by the National Institutes of Health Grant 1UF1NS122040 (to J.C. and P.B.S.) and the Jefferson Scholars Foundation (to J.C.).
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.