Evidence for Distinct Forms of Compulsivity in the SAPAP3 Mutant-Mouse Model for Obsessive-Compulsive Disorder

Abstract The specific mechanisms underlying compulsive behavior in obsessive-compulsive disorder (OCD) are unknown. It has been suggested that such compulsivity may have its origin in cognitive dysfunction such as impaired processing of feedback information, received after the completion of goal-directed actions. The signal attenuation (SA) task models such a processing deficit in animals by attenuating the association strength between food reward and audiovisual feedback (signal) presented after performance of an operant response. The compulsive-like responding resulting from SA is well characterized in rats, but was so far not established in mice, a species for which powerful genetic OCD models exist. Thus, first, we demonstrate that the SA task can be implemented in mice and show that attenuation of reward-associated response feedback produces similar behavior in C57BL/6 mice as previously reported in rats. Second, we tested the hypothesis that SAPAP3 knock-out mice (SAPAP3-/-), prone to exhibit several OCD-like abnormalities including excessive grooming, show enhanced compulsive-like behavior in the SA task compared with their wild-type (WT) littermates. However, task-related compulsivity measures in SAPAP3-/- and WT did not yield significant differences, neither following SA nor during “regular” extinction of operant behavior. Thus, compulsive-like instrumental behavior following feedback distortion was not potentiated in compulsively grooming mice, implicating specifically that (1) a general deficit in feedback processing is not related to excessive grooming in SAPAP3-/- and (2) different manifestations of compulsivity may be driven by independent mechanisms.


Introduction
Compulsive behavior is driven by the urge to perform repetitive actions in a rigid or stereotyped manner and by the experience of limited voluntary control over such an urge, including a diminished ability to delay or inhibit these behaviors (Denys, 2011). Compulsive behavior can be observed in a number of neurodegenerative and psychiatric disorders (Berlin and Hollander, 2014). In obsessive-compulsive disorder (OCD), patients have recurring, unwanted thoughts (obsessions) that make them feel driven to act (compulsions), often with the intention to prevent dreaded events or situations (Luigjes et al., 2019), despite insight into how unreasonable and inappropriate this behavior is (American Psychiatric Association, 2013). The specific processes underlying compulsive behavior are still unknown, but various hypotheses about impairment of cognitive functions have been put forward, such as a deficiency of feedback processing (Otto, 1992;Nielen et al., 2009), cognitive inflexibility (Chamberlain et al., 2006), reduced behavioral inhibition (Chamberlain et al., 2005;Morein-Zamir et al., 2010), imbalance between goal-directed and habitual behavior (Gillan et al., 2011), emotional dysregulation (Steketee et al., 1996;Szechtman and Woody, 2004;Endrass et al., 2011;Kaczkurkin and Lissek, 2013), or intolerance to uncertainty (Reuther et al., 2013). Animal models for OCD offer the possibility to study these cognitive impairments separately (Albelda and Joel, 2012;Camilla d'Angelo et al., 2014;Szechtman et al., 2017).
In the present study, we investigated whether impaired processing of external feedback underlies compulsive behavior in a genetic mouse model for OCD, the SAPAP3 knock-out mouse (SAPAP3 -/-). SAPAP3 -/self-groom excessively and display increased anxiety and decreased behavioral flexibility (Welch et al., 2007;Manning et al., 2019;van den Boom et al., 2019). Compulsive-like grooming aggravates during aging and may continue to the point that the animals develop grooming-induced facial hair loss and skin lesions. This excessive self-grooming bears similarity to symptoms such as compulsive hand-washing observed in OCD patients, hair-pulling in trichotillomania patients, or nail-biting in onychophagia patients (Welch et al., 2007;Yang and Lu, 2011). Similar to OCD patients, compulsive grooming can be normalized by administration of selective serotonin reuptake inhibitors (SSRIs) or deep-brain stimulation (Welch et al., 2007;Pinhal et al., 2018).
Impaired feedback processing has been modeled in the signal attenuation (SA) task, developed for rats by Joel and colleagues (Joel and Avisar, 2001;Joel, 2006;Albelda and Joel, 2012). This task is based on the assumption that compulsive behavior can be caused by deficient processing of environmental cues that signal the completion of goal-directed behavior. In this sense, such external response feedback resembles characteristics of perceptual signals, but not internal reference or error signals described in cybernetic models (Pitman, 1987). In the SA task, animals learn that an operant response leads to the delivery of food pellets and that an audiovisual signal provides response feedback. To simulate feedback deficiency experimentally, the incentive salience of this signal is attenuated by presenting it in absence of food delivery. This leads to compulsive-like responding (in a subsequent extinction test) that is absent in animals not exposed to this SA treatment and may resemble repetitive, inappropriate, and compulsive behavior that OCD patients are unable to suppress (Joel, 2006). This notion is supported by a decrease in compulsive responding in SA-exposed animals following interventions with treatments effective in OCD (Joel, 2006). Similar brain circuits are thought to underlie compulsive states induced by SA and by genetic deletion of the SAPAP3 protein. Inactivation of the lateral orbitofrontal cortex (lOFC) potentiated (Joel et al., 2005a,b) or induced (Joel and Klavir, 2006) compulsive lever-pressing in an SA task, whereas SAPAP3 -/show abnormalities in lOFC neuronal activity and perturbed cortico-striatal network activation (Lei et al., 2019). Moreover, stimulation of the lOFC-striatal pathway alleviates excessive grooming (Burguière et al., 2013). Notably, such dysfunction seems to be restricted to cortico-striatal circuits, and not extend to thalamo-cortical circuits (Wan et al., 2014). However, the involvement of cortico-striatal pathways other than projections from the lOFC need further investigation. For example, striatal input from the secondary motor cortex, which is strengthened in SAPAP3 -/- (Corbit et al., 2019), has not yet been tested in SA.
Under the assumptions that a general deficit in feedback processing is a major source of compulsive behavior (via a shared underlying neuronal pathology) and that compulsivity is a unitary and uniform phenomenon, then compulsivity in the SA task should be exacerbated in animal models for OCD (which already display compulsivity before SA induction of compulsivity). Therefore, we subjected SAPAP3 -/-(genetic OCD model) to the SA task, hypothesizing that SAPAP3 -/with SA-induced feedback deficiency would show more compulsive responding than normal wild-type (WT) control mice, comparable to the finding of Sesia et al. (2013) that revealed enhanced compulsivity in the SA task after repeated quinpirole administration (pharmacological OCD model). Alternatively, a variety of neural mechanisms might independently cause qualitatively different forms of compulsivity that do not potentiate each other. In this case, SA-induced compulsivity would not differ between SAPAP3 -/and WT. To test these hypotheses, we first implemented and validated the SA task, previously exclusively used in rats, in C57BL/6 mice (experiment 1). In a second step, we trained SAPAP3 -/in this task and compared their behavior to that of WT mice (experiment 2). Furthermore, self-grooming of SAPAP3 -/was scored in different environmental contexts.

Methods and Materials Subjects
To validate the SA task in mice, 24 C57BL/6JRccHsd, male mice were obtained from Harlan (experiment 1). To test the hypothesis that impaired feedback would underlie compulsive behavior, SAPAP3 -/-(bred on a C57BL/6J background; founders provided by Dr. Guoping Feng, Massachusetts Institute of Technology) and their WT littermates were bred in house. 34 SAPAP3 -/-(17 male, 17 female) and 35 WT (18 male, 17 female) were included in this study (experiment 2). At the start of behavioral training, all animals were three to four months of age, individually housed in an environment with reversed day-night cycle (12/12 h dark/light), controlled temperature, and humidity. Training and testing were performed in the animals' active period. All mice were food-restricted to a target weight of 90% of their individual ad-libitum weight, while water intake was ad libitum. Weight and health of the animals were monitored on a daily basis and special attention was paid to the formation of lesions in SAPAP3 -/-. Before experimental training, all mice were handled for three consecutive days (Hurst and West, 2010). Apparatus, procedure, and statistical analysis were identical for experiments 1 and 2. All animal procedures were performed in accordance with the Dutch law and the Royal Netherlands Academy of Arts and Sciences animal care committee's regulations.

Apparatus
Behavioral training and testing was conducted in standard operant boxes (Med-Associates), housed within sound-attenuated chambers. Each box was equipped with a food magazine on one wall and a house light (3 W, 24 V) on the opposite side. In some sessions, nose-poke holes and signaling lights were installed next to the food magazine. Food magazine and nose-poke holes contained infra-red beams for the detection of the animals' responses. The food magazine could be illuminated by a 3-W light and was connected to a pump with a syringe that delivered bouts of 20 ml of 20% sucrose solution. In addition, a speaker was attached to each chamber that produced tones with 80 dB and 2.8 kHz. Animal behavior was videotaped. All task programming and data acquisition was performed with Med-PC-IV software (Med-Associates).

Procedure
The experimental design of the SA task consisted of four consecutive stages and was based on the rat SA task (Joel and Avisar, 2001;Joel, 2006; Fig. 1).

Stage 1: shaping and magazine (Mag) training
During the initial two shaping sessions, animals were placed in the operant chamber for 30 min to adapt to the new environment and to find the site of reinforcement (food magazine). The house light and the magazine light were illuminated and the food magazine was filled with sucrose solution. In the subsequent Mag training, mice learned to associate a compound signal (tone and magazine light) with delivery of the sucrose reward into the food magazine (i.e., feedback signal). Each trial started with an intertrial interval (ITI) of 30-40 s, followed by delivery of the reward and simultaneous presentation of the feedback signal. A trial ended after the animal entered the food magazine (after signal "on") or 15 s elapsed. Both conditions caused the feedback signal to turn off. Each mouse was required to collect 20 out of 30 possible rewards in two successive sessions to proceed to training stage 2. During shaping and Mag training, nose-poke holes were not installed.

Stage 2: nose-poke training
In this training stage, mice learned to make a nose-poke response into a hole placed to the left or right of the food magazine. One poke hole was designated the correct hole and poking into this hole initiated reward delivery, whereas a response in the other hole was never rewarded. The side of correct and incorrect holes was counterbalanced across animals. Each training trial started with an ITI of 30-40 s. Thereafter, a cue light indicated that a response was required. When the animal poked in the correct hole, a reward was delivered in the food magazine signaled by simultaneous presentation of the feedback signal. Thus, the compound signal provided feedback about the completed action, the nose poke caused a reward delivery into the food magazine. When the animal collected the reward within the 10 s of the start of feedback signal presentation, this was recorded as a completed trial (CT) and the feedback signal turned off. A failure to collect the reward during the feedback signal presentation was recorded as an uncompleted trial (UCT). In this case, the cue light and the feedback signal turned off after 10 s. No-poke trials were trials in which the animal did not respond in the correct poke hole, regardless of responses in the incorrect poke hole. These trials were terminated after 15 s without presentation of the feedback signal. Nose-poke responses that were performed in addition to the required initial poke, were recorded as extra nose-pokes (ENPs) and were never rewarded. Training was conducted in two training stages. First, animals were required to collect at least 24 out of 30 possible rewards in CTs in two successive training sessions. Thereafter, the final nose-poke training stage, consisting of 50 trials, required successful completion of 34 CT trials.

Stage 3: signal attenuation (SA)
After nose-poke training, all animals were randomly assigned to either the SA or the regular extinction (RE) condition. For SA, the nose-poke holes were covered with metal plates and the feedback signal was presented in 30 trials without being paired with reward. The number of entries into the food magazine was recorded to provide information about the attenuation process. Each animal in the SA condition received three sessions of SA, with a maximum of two sessions per day. Animals assigned to the RE condition continued from nose-poke training (stage 2) directly to the final extinction test (stage 4), without any additional training.

Stage 4: extinction test
The final test consisted of a single session of 50 trials. For this test, the nose-poke holes were installed, but responding only led to presentation of the feedback signal of maximal 10 s but not to delivery of the sucrose reward (extinction conditions). In this extinction test, CTs, UCTs, and ENPs in completed (ENP in CT) and uncompleted (ENP in UCT) trials were measured.

Grooming and anxiety
Grooming was measured for SAPAP3 -/and WT. Behavior was recorded for 1 h in an open field (OF), a Plexiglas box (30 Â 30 Â 40 cm). An automated procedure of behavioral scoring was used to identify episodes of self-grooming and to determine locomotion (Van den Boom et al., 2017). A trained experimenter, who was blinded for genetic background of the animals, manually analyzed levels of self-grooming during the final extinction test. Anxiety-like behavior was assessed before the SA task on an elevated plus maze (EPM; 53 cm above the floor) with two closed (walls: width 4.5 cm, length 30 cm, height 15 cm) and two open arms (same dimensions but without walls). Animals were placed at the center of the maze. Time spent in open, closed, and center areas over a period of 5 min, as well as frequency of entries into these areas, were analyzed.

Data processing
Data acquisition was performed with Med-PC-IV and preprocessed with MATLAB. Statistical analysis was conducted with IBM SPSS software. In-depth data analysis was performed for the last session of nose-poke training and the final extinction test. Dependent variables were the numbers of trials, specified as CTs, UCTs, and no-poke trials, as well as the numbers of nose-pokes, specified as the number of ENPs in CTs (ENP-CT) and in UCTs (ENP-UCT). Compulsive nose-poking was operationally defined as the number of ENP-UCTs (Joel, 2006). As data generally deviated from the assumption of normality, for all analyses non-parametric testing was used. Repeated measures analysis was performed with a Friedman test, Figure 1. Experimental design of the mouse SA task. Training and testing procedures are based on the original rat version of the SA task (Joel and Avisar, 2001;Joel, 2006). At the beginning of the task animals learn to associate delivery of a food reward with an audiovisual feedback signal that indicates that food is available in the food magazine (stage 1, Mag training). Thereafter, animals learn that making a nose poke during illumination of a cue light leads to delivery of a food reward, accompanied by the signal (stage 2, nose-poke training; NP). Subsequently, half of the animals are exposed to the SA procedure (stage 3, SA) in which the information value of the signal is decreased by presenting the signal without food reward (simulation of feedback deficiency). Finally, all animals are tested under extinction conditions (stage 4, extinction test). Compulsivity measures (UCTs, ENPs) will be compared between animals that were exposed to SA prior and those that were not (experienced the extinction test only, i.e., RE without prior manipulation of the feedback signal). followed by Wilcoxon signed-rank tests. Kruskal-Wallis H tests provided the x 2 statistic for analyzing non-parametric data with more than two independent samples, while Mann-Whitney U tests were employed for comparison of two independent samples. In case data were normally distributed repeated-measures ANOVA was employed. The threshold for statistical significance was set at p , 0.05. Results are presented as mean 6 SEM. To compare compulsivity of SAPAP3 -/and WT, genotypes were matched on task performance, resulting in exclusion of three SAPAP3 -/-(two SA, one RE) and five WT (one SA, four RE) from the final analysis due to decreased learning performance or equipment failure. Therefore, we report data of 31 SAPAP3 -/and 30 WT.

Results
C57BL/6 mice readily acquired the response requirements in the SA task All C57BL/6 acquired the SA task and completed the final extinction test ( Fig. 2A). During the initial training stages C57BL/6 required 2-10 sessions of Mag training Figure 2. Validation of the SA task in mice. A, Number of training sessions until reaching criteria (in Mag and nose-poke training; stages 1 and 2). Stage 2 consisted of two parts: nose-poke training with 30 trials (NP 30) and nose-poke training with 50 trials (NP 50). B, Magazine entries of C57BL/6-SA mice declined across SA sessions (stage 3), demonstrating effective 'attenuation' of the feedback signal (i.e., attenuation of the association strength between signal and reward). C, In the final extinction test (stage 4), C57BL/6-SA mice (n = 12) showed significantly more UCTs in comparison with C57BL/6-RE mice (i.e., mice that underwent only RE in stage 4 without prior SA; n = 12). D, The mean number of UCT was significantly higher in SA mice in four out of five trial blocks. E, All animals increased the number of ENPs in the final extinction test. However, ENP of C57BL/6-SA mice were significantly higher compared with C57BL/6-RE mice in the final extinction test (stage 4). F, Examination of ENP in UCT showed increased compulsivity in C57BL/6-SA compared with C57BL/6-RE, demonstrating successful implementation of the SA task in mice. Mag, food magazine; NP, nose poke. Data are expressed as mean 6 SEM. *p , 0.05.
To investigate the within-session distribution of UCT we analyzed data of the final extinction test in blocks of 10 trials. Our results suggest that the SA produces consistent display of UCT as compared with RE (Fig. 2D). The number of CT was significantly higher in C57BL/6-RE mice (x 2 (1) = 17.35, p , 0.001), whereas C57BL/6-SA mice had more no-poke trials (x 2 (1) = 17.44, p , 0.001). Additionally, we recorded the number of ENPs. Similar to prior studies with rats, our results with mice confirmed that the SA procedure significantly increases the number of ENP (U = 28.5, p = 0.012; C57BL/6-SA: 46.1 6 5.6, C57BL/6-RE: 25.0 6 6.6; Fig. 2E). One of the most important markers of compulsive responding is the number of ENP in UCT. Our results reveal that the number of ENP in UCT was significantly higher in C57BL/6-SA compared with C57BL/6-RE (U = 19.0, p = 0.001; C57BL/6-SA: 16.2 6 3.2, C57BL/6-RE: 2.3 6 1.3; Fig. 2F). From these results we conclude that we successfully implemented the SA task for mice.

Attenuation of the feedback signal in SAPAP3 -/and WT was similar
Genotypes were matched on their performance during nose-poke training (stage 2), as we aimed to assess genotype differences during the extinction test only (stage 4). After nose-poke training, all mice were assigned to either the SA or the RE condition (within a genotype). Analyses confirmed that learning was not different between both conditions after this random assignment. WT-RE and WT-SA required a similar number of training sessions to reach task criteria in all stages of behavioral training (Fig. 3A). There was no difference between SAPAP3 -/--RE and SAPAP3 -/--SA in the number of training sessions until reaching task criteria (Fig. 3B).

Compulsive-like responding in WT-SA versus WT-RE
Analysis of compulsive responses of WT-SA (n = 17) and WT-RE (n = 13) revealed that the SA procedure induced significantly more UCT compared with RE (U = 60.5, p = 0.035; WT-SA: 6.4 6 1.1, WT-RE: 3.2 6 1.1;  Fig. 4B). This suggests that both markers of compulsive-like behavior, UCT and ENP-UCT, are increased following the SA procedure.

Compulsive responding of SAPAP3 -/versus WT
Comparison between SAPAP3 -/--SA and WT-SA as well as SAPAP3 -/--RE and WT-RE show similar numbers of UCT during the extinction test (x 2 (1) = 0.111, p = 0.739; Fig. 4A). Further analysis revealed that ENP-UCT were not different between genotypes (x 2 (1) = 0.872, p = 0.351; Fig.  4B). Taken together, these results suggest that SAPAP3 -/in comparison to WT controls do not show excessive, compulsive-like behavior in the SA paradigm, neither under conditions of RE nor when general feedback processing is diminished by SA.
Overall, a positive Spearman's rank-order correlation was observed between grooming of all animals and ENP in UCT (r (59) = 0.320, p = 0.014). Separated analysis for grooming of SAPAP3 -/and WT in the RE and SA condition was performed. This analysis showed a positive Spearman's rankorder correlation between the total duration of grooming and the number of UCT (r s(13) = 0.779, p = 0.001) of SAPAP3 -/--SA, but not WT-SA. Furthermore, a positive correlation between duration of grooming and ENP in UCT (r s(13) = 0.697, p = 0.004, Bonferroni corrected p = 0.0125) was observed in SAPAP3 -/--SA (Fig. 4E). In the RE condition, there was no significant correlation between duration of grooming and ENP in UCT following Bonferroni correction (p = 0.0125; Fig. 4F). Together, these results suggest a divergent grooming pattern between genotypes. In SAPAP3 -/-, grooming was associated with compulsive-like behavior following SA, which was not observed in WT.
Closer inspection of grooming behavior during task episodes of ITI, nose-poke light and feedback signal presentation revealed that grooming of SAPAP3 -/was specifically increased in the RE condition during cue light presentation (x 2 (1) = 6.694, p = 0.01; SAPAP3 -/-: 54.7 6 10.8 s, WT: 15.9 6 3.8 s) and ITI (x 2 (1) = 6.923, p = 0.009; SAPAP3 -/-: 136.1 6 23.6 s, WT: 60.1 6 9.2 s: Figure 3. Training performance of SAPAP3 -/and WT in the SA task. A, Number of training sessions until reaching criteria (in Mag and nose-poke training; stages 1 and 2). Stage 2 consisted of two parts: nose-poke training with 30 trials (NP 30) and nose-poke training with 50 trials (NP 50). WT-RE and WT-SA required a similar number of training sessions to reach task criteria. B, There was no difference between SAPAP3 -/--RE and SAPAP3 -/--SA in the number of training sessions until reaching task criteria. While SAPAP3 -/required a few more sessions of Mag and NP 30 training than WT, overall genotype comparison in SA and RE condition did not reveal significant differences, confirming that both genotypes acquired the task equally. C, Following NP 50 training, half of the WT and half of the SAPAP3 -/were subjected to the SA procedure (stage 3, SA) in which the information value of the signal is decreased by presenting the signal without food reward (simulation of feedback deficiency). In WT-SA, magazine entries declined across three successive SA sessions, demonstrating effective attenuation of the feedback signal. D, In SAPAP3 -/--SA, effective attenuation of the feedback signal was also demonstrated by decreasing magazine entries of across SA training sessions. No difference was observed between the number of magazine entries of WT and SAPAP3 -/-, suggesting that the association between signal and reward was attenuated effectively in both genotypes (SAPAP3 -/-: SA = 15, RE = 16; WT: SA = 17, RE = 13). Data are expressed as mean 6 SEM. pp , 0.05. Fig. 5A). Grooming during feedback signal presentation occurred only incidentally and was not different between genotypes. Grooming during nose-poke light presentation might limit attentional resources for task performance, but could also reflect decreased commitment to the task.
Additionally, 18 SAPAP3 -/and 18 WT were subjected to a 5-min test on the EPM. SAPAP3 -/and WT spent similar amounts of time exploring the open (U = 116.0, p = 0.146) and closed arms of the maze (U = 103.0, p = 0.062), suggesting that this cohort of young adult . SAPAP3 -/are not more compulsive than WT in the SA task. A, SAPAP3 -/do not show a general deficit in feedback processing, as compulsive responding during the extinction test (stage 4) of the SA paradigm was similar to that of normal WT controls. Both WT-SA and SAPAP3 -/--SA showed more UCTs than WT-RE and SAPAP3 -/--RE, respectively, confirming that the SA stage (stage 3) was effective in inducing compulsive-like responding. B, ENPs in UCT, an important indicator of compulsivity, was similar between SAPAP3 -/and WT mice that underwent SA, suggesting that genetic deletion of SAPAP3 does not potentiate SA-induced compulsivity. ENP in UCT were increased in WT-SA and SAPAP3 -/--SA compared with WT-RE and SAPAP3 -/--RE, respectively, confirming that the SA stage was effective. C, A genotype difference was found in the number of ENP in CTs. During RE, SAPAP3 -/showed reduced numbers of ENP in CT, indicative of rapid extinction learning or altered vigor for obtaining rewards. D, During the extinction test (stage 4), SAPAP3 -/--RE groomed significantly more than WT-RE. In contrast, grooming was similar between genotypes after SA. E, F, During the extinction test (stage 4), a positive correlation was observed between grooming and ENP in UCT for SAPAP3 -/--SA but not for SAPAP3 -/--RE. WT mice showed a positive correlation between grooming and ENP in UCT in the regularextinction condition but not following SA. The average duration of the extinction test varied between 39 and 41 min (SAPAP3 -/-: SA =15, RE = 16; WT: SA = 17, RE = 13). Data are expressed as mean 6 SEM. pp , 0.05.
Research Article: New Research SAPAP3 -/did not yet develop increased anxiety symptoms. During this test, SAPAP3 -/were observed with exaggerated grooming compared with WT (U = 96.0, p = 0.036; SAPAP3 -/-: 8.9 6 2.4 s; WT: 2.6 6 .62 s; Fig.  5C). Overall, grooming behavior of SAPAP3 -/and WT was likely influenced by genotype, task conditions and environmental requirements, as SAPAP3 -/groomed consistently more than WT, but grooming measures in various contexts were not correlated.

Discussion
Here, we investigated whether the compulsive phenotype observed in a mouse model for OCD, the compulsively grooming SAPAP3 -/- (Welch et al., 2007), is associated with a deficit in feedback processing. First, we implemented the SA task in mice, previously established for rats only. Our results show that mice behaved similarly to rats after being subjected to simulation of feedback deficiency (SA): compared with mice that experienced a "regular" extinction, they displayed more UCTs and, importantly, more ENPs, specifically during UCTs (ENP in UCT). The latter is used as a direct measure of compulsive responding in the SA task (Joel, 2006). Herewith, we validate the SA task to study deficient feedback processing in mice.
Next, we compared the behavior of SAPAP3 -/and their WT littermates in the SA task. SAPAP3 -/did not display more UCT or ENP-UCT after SA-induced feedback deficiency than WT. Thus, contrary to our initial hypothesis, SAPAP3 -/compulsive responding was not enhanced compared with WT, suggesting that different mechanisms underlie compulsive grooming and compulsive responding in the SA task.

SA in mice and rats
To understand compulsivity in the SA task, it is important to consider how different measures of compulsivity are related. In the SA stage (stage 3), the signal is presented without reward and nose-poke holes are inaccessible, thereby reducing the association between signal and reward, presumably without affecting response-outcome associations. By keeping response-outcome associations intact, it is possible to investigate the effect of 'distorted' feedback-processing on behavior. We focused our analysis on the most prominent indicators of compulsive behavior, as described by previous SA studies: UCT (correct responses not followed by attempts to collect reward) and ENP in UCT (perseveration of nose pokes in trials with correct responses not followed by attempts to collect reward). Response perseveration (ENP) is independent of trial structure because ENP are performed before attempting (or not attempting) reward collection. Thus, with more UCT there are not necessarily more ENP. We focused on ENP in UCT, instead of reporting proportions of ENP per UCT, because the relation of Figure 5. SAPAP3 -/groom significantly more than WT in various contexts. A, During the extinction test (stage 4), grooming is significantly higher in SAPAP3 -/compared with WT in the RE condition, both during the ITI and during illumination of the cue light (indicator that an operant response is required). However, grooming during cue light presentation was not correlated with overall task performance (data not shown), suggesting that attention to behaviorally relevant cues was not diminished in SAPAP3 -/-. B, Additionally, grooming was scored in the open field (OF). Results show increased grooming behavior of SAPAP3 -/compared with WT. C, SAPAP3 -/and WT were exposed to a 5-min test on the elevated plus maze (EPM). SAPAP3 -/spent more time grooming than WT mice on the EPM; RE, average duration extinction test 39-41 min; OF, test duration 60 min; EPM, test duration 5 min (SAPAP3 -/-: SA = 15, RE = 16; WT: SA = 17, RE = 13). Data are expressed as mean 6 SEM. pp , 0.05. these essentially independent outcome measures does not necessarily inform about an animal's degree of compulsivity. For example, when comparing an animal with one ENP in one UCT and an animal performing ten ENP in ten UCT, the latter one clearly displays more compulsive behavior revealed by the absolute numbers.
Generally, the measures for compulsivity in the SA task (UCT and ENP in UCT) likely represent behavior that is driven by behavioral uncertainty about current signal-outcome associations. In the SA condition, subjects learn that response feedback no longer signals reward availability (induction of feedback deficiency). When they subsequently experience the absence of reward after a CT in the final extinction test, they may experience the impulse to respond repeatedly when given the opportunity to perform this previously reinforced operant response.
We can conclude from experiment 1 that the SA condition induces compulsive responding in "normal" mice, marked by a significant increase of UCT and ENP-UCT, which is similar to effects generally observed in the rat model of SA (Joel, 2006;Albelda and Joel, 2012). We noticed that mice required more training sessions than rats to acquire responding for food, possibly reflecting differences in the species' learning abilities or effects of different response manipulanda (i.e., we used nose-poke holes whereas rats are usually trained with levers). However, performance of rats and mice in the final extinction test are comparable regarding the numbers of CT, no-poke trials, UCT, and ENP in UCT. Together, these findings across species provide further support for the hypothesis that relevant feedback cues are regulators of behavior and that attenuation of the incentive salience of these signals may cause difficulty in preventing behavior from becoming compulsive.
Comparison of SAPAP3 -/and WT in the SA task In experiment 2, we trained SAPAP3 -/and WT in the SA task to determine whether SAPAP3 -/show enhanced compulsivity compared with WT in the final extinction test. Importantly, our results demonstrate that the number of UCT and ENP-UCT were similarly increased in SAPAP3 -/and WT that underwent the SA procedure compared with SAPAP3 -/and WT that experienced RE. The absence of genotype differences in these compulsivity measures does not fulfill our prediction of potentiated compulsive responding of SAPAP3 -/under conditions of decreased feedback processing or during RE. Nonetheless, this finding provides important insight into the nature of compulsivity as it implies that different neurobiological mechanisms might independently lead to different aspects of compulsivity.
Comparison of grooming and compulsive responding in the SA task SAPAP3 -/exhibited enhanced grooming throughout the final extinction test compared with WT, confirming the exaggerated grooming phenotype of the SAPAP3 -/model (Welch et al., 2007). However, this was due to the large differences in grooming in the RE condition, which suggests an interaction between grooming and task-related behavior in both genotypes. Grooming in SAPAP3 -/was consistently increased over WT values in all provided experimental environments (operant box during SA paradigm, OF, EPM). Similar to previous studies, grooming scores in different task conditions, such as the SA task, OF, and EPM, were uncorrelated, suggesting that the degree to which individual SAPAP3 -/display this compulsive-like behavior is highly variable (Pinhal et al., 2018;Manning et al., 2019;van den Boom et al., 2019). Consistently, grooming is indeed known to be influenced strongly by emotional factors and environmental conditions (Kalueff et al., 2016).
A remarkable finding in this respect was that the extinction test (in SA-exposed mice) was the only condition in which grooming did not differ between SAPAP3 -/and WT. This was accompanied by a positive correlation between grooming and task-induced compulsivity measures for SAPAP3 -/in the SA condition. This indicates that the impact of this specific task phase on spontaneous (grooming) behavior is dependent on both task condition (SA or RE) and genotype. This is in contrast to the task-induced compulsive responding which exclusively depends on the task condition and is in conflict with the strong genotype dependence of spontaneous grooming in all other test conditions. Therefore, we do not take this as evidence for the existence of a relation between both forms of compulsivity and conclude that the SAPAP3 -/genotype resulting in compulsive grooming does not result in an increased susceptibility for compulsivity induced by feedback uncertainty.

Comparison with tests using feedback signals in OCD
The results of our study are important in light of the findings of a previous study examining the course of feedbackdependent learning in human OCD patients. During initial trial-and-error learning, in which behavioral responses needed to be updated by external feedback, OCD patients were observed to exhibit response deficits (Nielen et al., 2009). At a later point in training, however, OCD patients showed similar performance to controls. Thus, potentially decreased employment of external feedback signals might cause only transient learning deficits, that are less important for behavioral outcomes than OCD-related compulsive symptoms and could reflect decreased task engagement caused by altered processing of appetitive rewards, a condition that was also reported for human OCD patients (Figee et al., 2011;Marsh et al., 2015). Although the study by Nielen et al. (2009) provides valuable insight into feedback processing in patients, the exact point at which external feedback becomes less important for behavioral choices cannot be determined with this paradigm, nor is external feedback directly modulated. Future studies may employ the methodology of the SA task to further investigate processing of feedback signals related to obsessivecompulsive behaviors in OCD patients.

Cognitive dysfunction and compulsivity in SAPAP3 -/-
The question that follows from our findings is whether our conclusion that different neurobiological mechanisms may independently lead to different aspects of compulsivity (and ultimately to a compulsive-like phenotype; Figee et al., 2013), also applies to other cognitive dysfunctions that have been proposed to be associated with compulsive behaviors. Evidence is accumulating that cognitive inflexibility and an imbalance between goal-directed and habitual behavior may be present in the SAPAP3 -/model for OCD van den Boom et al., 2019;Ehmer et al., 2020). The relation between the dysfunction, compulsive grooming, and the genetic deletion, however, seems to be complex: neither cognitive inflexibility nor deficient habit formation were correlated to compulsive grooming. Taken together, this implies that the genetic defect may result in excessive grooming, impaired reversal learning, and deficient habit formation (in appetitive learning), but that these effects are not obviously linked to one another and possibly involve more complex or different neurobiological mechanisms.

Limitations
While the SA paradigm provides information about mechanisms of general feedback processing, internal feedback specifically related to compulsive behavior has not yet been investigated. It can thus not be excluded that different task conditions have an effect on a potential deficiency in processing of internal feedback signals related to compulsive grooming in SAPAP3 -/-. Finally, our study comprised only positive response feedback for an appetitive learning condition. Future investigation on processing of aversive feedback signals, or processing of feedback signals with positive or negative valence on avoidance behaviors is indicated.
In summary, the SA task simulates feedback deficiency, which is hypothesized to contribute to compulsive symptoms in psychiatric disorders such as OCD. The purpose of the current study was to implement the SA task in mice and to evaluate the SA-induced behavior of SAPAP3 -/compared with their normal littermates. The performance of SAPAP3 -/mutants in the SA task did not show a potentiation of compulsivity, suggesting that these two models of compulsivity do not share a common neuronal pathology and that different types of compulsivity exist in parallel.