Instrumental learning in a mouse model for obsessive-compulsive disorder: Impaired habit formation in Sapap3 mutants

https://doi.org/10.1016/j.nlm.2020.107162

Abstract

It has been hypothesized that maladaptive habit formation contributes to compulsivity in psychiatric disorders such as obsessive-compulsive disorder (OCD). Here, we used an established animal model of OCD, Sapap3 knockout mice (SAPAP3−/−), to investigate the balance of goal-directed and habitual behavior in compulsive individuals and to test whether altered habit formation is associated with compulsive-like behavior.

We subjected 24 SAPAP3−/− and 24 wildtype littermates (WT) to two different schedules of reinforcement in a within-subjects design: a random-ratio (RR) schedule to promote goal-directedness, and a random-interval (RI) schedule, known to facilitate habitual responding. SAPAP3−/− acquired responding under both schedules, but showed lower response rates and fewer attempts to collect food pellets than WT, indicative of altered reward processing. As expected, WT were sensitive to sensory-specific satiety (outcome devaluation) following RR training, but not RI training, demonstrating schedule-specific acquisition of goal-directed and habitual responding, respectively. In contrast, SAPAP3−/− were sensitive to outcome devaluation after both RR and RI training, suggesting decreased engagement of a habitual response strategy. No linear relation was observed between increased grooming and behavior during the outcome devaluation test in SAPAP3−/−.

Together, our findings demonstrate altered reward processing and impaired habit learning in SAPAP3−/−. We report a diminished propensity to form habits in these mice, which, albeit inconsistent with the predominant idea of excessive habit formation in OCD, nonetheless points to dysregulation of behavioral automation in the context of compulsivity. Thus, the habit hypothesis of compulsivity should be updated to state that an imbalance of habitual and goal-directed responding in either direction can contribute to the development of compulsive behavior.

Introduction

All behavior is eventually a unique compromise of actions based on intentions, knowledge, well-established routines, and environmental necessities. Behavior is guided by divergent response styles referred to as goal-directed or habitual response strategies. Employment of a goal-directed strategy is typically assumed when behavioral responses are sensitive to outcome value and to situational action-outcome contingencies (Balleine & O’Doherty, 2010). Habitual responding, on the other hand, is conceptualized as automated, inflexible behavior that is predominantly evoked by stimulus-response (S-R) associations with low levels of experienced contingency between an action and its consequences (Dickinson, 1985). While goal-directed behavior requires cognitive deliberation, habitual actions rely on repetition and generally demand less attention. However, optimal behavioral adaptation depends on flexible shifting between habitual and goal-directed response strategies.

It has been hypothesized that overreliance on habitual strategies could underlie compulsive behavior observed in several psychiatric disorders, including obsessive-compulsive disorder (OCD), as habits and compulsions share some similarity in the composition of their tightly structured actions (Fineberg et al., 2014, Gillan et al., 2011, Gremel and Costa, 2013). So far, only a small number of studies have investigated appetitive habit formation and compulsivity (Gillan et al., 2011, Snorrason et al., 2016, Voon et al., 2015). These studies showed a link between increased habitual responding and compulsive symptoms, but are limited by practical issues (pharmacological treatment of OCD patients, non-clinical study subjects) and by uncertainty about whether the appetitive tasks, originally developed in animal studies, are valid paradigms to assess habits in human patients (Ceceli and Tricomi, 2018, Foerde, 2018, Watson and de Wit, 2018).

The relation between habit formation and compulsivity remains insufficiently understood. Therefore, we set out to test habit formation in Sapap3 knockout mice (SAPAP3−/−), a validated genetic mouse model of OCD that exhibits compulsive-like self-grooming (Burguiere et al., 2013, Pinhal et al., 2018, van den Boom et al., 2019, Welch et al., 2007, Yang and Lu, 2011). We tested these mice in an outcome devaluation paradigm, the most prominent method for exploring appetitive habit formation (Adams and Dickinson, 1981, Gremel and Costa, 2013, Hilário et al., 2007). The outcome devaluation paradigm relies on the observation that different reinforcement schedules promote acquisition and selection of either goal-directed or habitual response strategies. These strategies can be discerned in an outcome devaluation test, in which the value of the reinforcer is diminished and sensitivity to this devaluation is probed in an extinction test (Gremel & Costa, 2013). Random-ratio schedules (RR) bias towards goal-directed responses because they provide high action-outcome contingency, sensitivity towards changes in outcome value, and flexible adaptation of behavior (DeRusso et al., 2010, Moore et al., 2009). Random-interval schedules (RI), on the other hand, are characterized by a weak correlation between actions and outcomes, causing an insensitivity to outcome devaluation that is denominated a habitual response strategy (Dickinson, 1985).
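The difference in action-outcome contingency between the two schedule types can be illustrated with a minimal simulation sketch. This is not the authors' task code: the 1 s time step, the function names, and the schedule parameters used below (a ratio of 20 and a mean interval of 60 s) are assumptions chosen for illustration only, and response rates must not exceed 60 per minute in this simplified model.

```python
import random


def simulate_random_ratio(responses_per_min, ratio, session_s, rng):
    """Count rewards under a random-ratio (RR) schedule.

    Each response is rewarded with probability 1/ratio, so the number of
    rewards grows in proportion to the number of responses.
    """
    p_response = responses_per_min / 60.0  # chance of one response per 1 s step
    rewards = 0
    for _ in range(session_s):
        if rng.random() < p_response and rng.random() < 1.0 / ratio:
            rewards += 1
    return rewards


def simulate_random_interval(responses_per_min, mean_interval_s, session_s, rng):
    """Count rewards under a random-interval (RI) schedule.

    In each 1 s step an unarmed schedule arms a reward with probability
    1/mean_interval_s. The next response collects it. Extra responses
    while nothing is armed earn nothing.
    """
    p_response = responses_per_min / 60.0
    armed = False
    rewards = 0
    for _ in range(session_s):
        if not armed and rng.random() < 1.0 / mean_interval_s:
            armed = True
        if armed and rng.random() < p_response:
            rewards += 1
            armed = False
    return rewards


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so repeated runs match
    session_s = 3600        # hypothetical 1 h session
    for rate in (10, 40):   # hypothetical responses per minute
        rr = simulate_random_ratio(rate, 20, session_s, rng)
        ri = simulate_random_interval(rate, 60, session_s, rng)
        print(f"{rate} responses/min: RR rewards = {rr}, RI rewards = {ri}")
```

In this toy model, quadrupling the response rate roughly quadruples the rewards earned under RR, whereas under RI the reward count is capped by the arming rate and barely changes. That weak coupling between responding and outcome is the property the text attributes to RI schedules.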

Previous research with mice trained in the outcome devaluation paradigm showed that goal-directed and habitual strategies are mediated by cortico-striatal brain circuits (Gremel & Costa, 2013). These cortico-striatal circuits are compromised in OCD patients (Graybiel and Rauch, 2000, Saxena et al., 1998) and in SAPAP3−/− (Burguiere et al., 2013, Pinhal et al., 2018, Welch et al., 2007), suggesting that this mouse model is well-suited to test hypotheses regarding the relationship between aberrant habit formation and compulsivity. Following the hypothesis that obsessions and compulsions in OCD patients are based on overactive neural circuits involved in habit formation (Graybiel & Rauch, 2000), we tested whether SAPAP3−/− exhibit an imbalance in acquisition and expression of habitual and goal-directed action strategies. In addition, we explored whether a positive relation exists between compulsive grooming (duration and/or frequency) during the outcome devaluation test and the imbalance between habitual and goal-directed response strategies. Such a relation would suggest a common neural substrate for regulation of habitual, goal-directed, and compulsive behavior.

Subjects

This study included 24 SAPAP3−/− (15 male, 9 female) and 24 of their wildtype littermates (WT) (12 male, 12 female), bred in-house, aged 2–4 months at the start of the behavioral training, and individually housed in standard controlled conditions with a reversed day-night cycle (12:12 h light/dark; lights on at 8 PM). Training and tests were performed in the animals' active period. All mice were food deprived to 85% of their individual ad libitum weight, while water intake was not restricted.

Grooming in the open field

At the beginning of the experiment, grooming and locomotion were assessed in an open field for 60 min. Automated scoring of grooming showed increased duration of grooming of SAPAP3−/− compared to WT mice (Duration WT: 120.1 ± 13.7 s; SAPAP3−/−: 213.7 ± 36.6 s; Mann-Whitney U test, U = 184; p = .032; Grooming bouts WT: 74.3 ± 7.6; SAPAP3−/−: 115.5 ± 13.0; U = 164; p = .011). Locomotion was similar between genotypes (Fig. 1A-C).

Habituation to the operant conditioning chamber and food pellets

During magazine training animals were habituated to the new

Discussion

To investigate whether a loss of SAPAP3 affects the balance of goal-directed and habitual behavior, we studied SAPAP3 knockout mice (SAPAP3−/−) and their WT littermates during acquisition of responding under instrumental reinforcement schedules biasing behavior towards either goal-directed or habitual behavior. Further, we explored if compulsive-like grooming of SAPAP3−/− was correlated with habit or goal-directed biases.

General task acquisition and basic performance was similar between

Author contribution

I.E., I.W., and M.F. designed research; I.E. performed research and analyzed data; I.E., I.W., M.F., and D.D. interpreted the data and wrote the paper. All authors reviewed the manuscript.

Acknowledgements

We thank Ralph Hamelink and Dr. Nicole Yee (Netherlands Institute for Neuroscience) for their technical assistance and Dr. Guoping Feng (Massachusetts Institute of Technology) for providing us with SAPAP3−/−.

Funding

This work was in part supported by the NWO VIDI grant to I.W. (864.14.010, 2015/06367/ALW) and by the ERC Starting Grant to I.W. (ERC-2014-STG 638013). This research did not receive any funding from agencies in the for-profit sector.

Declaration of Competing Interest

The authors declared that there is no conflict of interest.

References (41)

  • P. Watson et al. Current limits of experimental research into habits and future directions. Current Opinion in Behavioral Sciences (2018)
  • D.W. Woods et al. Habit reversal: A review of applications and variations. Journal of Behavior Therapy and Experimental Psychiatry (1995)
  • H.H. Yin et al. Inactivation of dorsolateral striatum enhances sensitivity to changes in the action-outcome contingency in instrumental conditioning. Behavioural Brain Research (2006)
  • C.D. Adams et al. Instrumental responding following reinforcer devaluation. The Quarterly Journal of Experimental Psychology Section B (1981)
  • B.W. Balleine et al. Human and rodent homologies in action control: Corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology (2010)
  • E. Burguiere et al. Optogenetic stimulation of lateral orbitofronto-striatal pathway suppresses compulsive behaviors. Science (2013)
  • V.L. Corbit et al. Strengthened inputs from secondary motor cortex to striatum in a mouse model of compulsive behavior. Journal of Neuroscience (2019)
  • T. Deckersbach et al. A study of parallel implicit and explicit information processing in patients with obsessive-compulsive disorder. American Journal of Psychiatry (2002)
  • D. Denys. Obsessionality & compulsivity: A phenomenology of obsessive-compulsive disorder. Philosophy, Ethics, and Humanities in Medicine (2011)
  • A.L. DeRusso et al. Instrumental uncertainty as a determinant of behavior under interval schedules of reinforcement. Frontiers in Integrative Neuroscience (2010)
    1

    Contributed equally.
