Contributions of the basal ganglia to action sequence learning and performance

https://doi.org/10.1016/j.neubiorev.2019.09.017

Highlights

  • Reinforcement learning models can explain how sequences of actions are learned.

  • The basal ganglia are commonly identified as a nexus of reinforcement learning.

  • It has been suggested that action sequences are hierarchically structured.

  • The circuits that promote sequence initiation and execution are distinct.

Abstract

Animals engage in intricately woven and choreographed action sequences that are constructed from trial-and-error learning. The mechanisms by which the brain links together individual actions that are later recalled as fluid chains of behavior are not fully understood, but there is broad consensus that the basal ganglia play a crucial role in this process. This paper presents a comprehensive review of the role of the basal ganglia in action sequencing, with a focus on whether the computational framework of reinforcement learning can capture key behavioral features of sequencing and the neural mechanisms that underlie them. While a simple neurocomputational model of reinforcement learning can capture key features of action sequence learning, this model is not sufficient to capture goal-directed control of sequences or their hierarchical representation. The hierarchical structure of action sequences, in particular, poses a challenge for building better models of action sequencing, and it is in this regard that further investigations into basal ganglia information processing may be informative.

Introduction

The ability to learn and execute a complex series of movements in a particular order is a remarkable feat of any nervous system. Organisms are able to learn arbitrary sequences of actions through repeated practice, and once asymptotic performance has been reached, sequences can be recalled with impressive speed and precision. While much of the neuroscience of instrumental learning focuses on a single action that is reinforced or punished by an outcome, that single action is almost always part of a sequence of actions that are either not recorded or ignored by the experimenter. It is rarely the case that the sequence of events that constitute instrumental learning is action → outcome, as simple as it may be to depict it in that way. Often, the action that is to be reinforced or punished (i.e., the ‘target action’) is separated in time from an outcome by other actions, and the target action is always preceded by other actions. Moreover, the target action itself is usually composed of a series of muscle movements that follow a precise order. While there is much that is not known about the neural mechanisms of action sequence learning and performance in animals, there is nevertheless a rich literature that, when considered in full, points to the basal ganglia as a primary neural circuit involved in action sequencing. The aim of this review is to succinctly summarize that vast literature and draw out the main themes and unanswered questions across many studies.

In reviewing the literature on action sequencing, one finds a broad and loose definition of the term ‘action sequence’. Sometimes studies of action sequencing focus on different action types that are executed across different spatial locations (e.g. pressing a lever followed by pulling a chain), or actions of the same type that are executed across different spatial locations (e.g. pressing one lever followed by pressing another lever), or actions of the same type that are executed in the same spatial location (e.g. pressing the same lever repeatedly). Furthermore, the number of actions available to the subject at any given point in time varies from study to study, and the ‘degrees of freedom’ of action will determine the difficulty of the task and thus the rate of learning. For example, an animal that is required to execute a set of actions in a specific order to obtain a reward will face a more challenging learning scenario if all of those actions are simultaneously available compared to a situation in which only one action is available at a time, or when each action is cued by an external stimulus. Action sequencing is a broadly defined phenomenon, and because it is not monolithic, any conclusions that are drawn from a single study may not generalize to other situations. It is important to bear this caveat in mind.

I begin by reviewing one of the dominant accounts of instrumental learning in behavioral neuroscience today: reinforcement learning (RL). RL dictates that actions are learned ‘in reverse’ from the moment of reinforcement, and this can help to explain some empirical phenomena in the animal learning literature. The mechanisms that allow for this type of learning are fairly simple, and fall under the heading of ‘model-free’ RL. Model-free RL does not capture some of the more complex aspects of action sequencing, and another brand of RL (‘model-based’ RL) is sometimes appealed to as an alternative. But even model-based RL has shortcomings, since it does not capture the hierarchical structure of action sequences. I then review how theoretical models of the basal ganglia aim to map RL variables to neural function, while also pointing out crucial shortcomings. Finally, the various components of sequence performance (initiation, execution, and termination) are reviewed with reference to their associated basal ganglia correlates.
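The claim that model-free RL learns actions ‘in reverse’ from the moment of reinforcement can be illustrated with a minimal sketch. The example below assumes a forced chain of four actions with reward delivered only after the last one, and uses a tabular temporal-difference update, which is one standard model-free rule rather than a model taken from this review. The function names, chain length, learning rate `alpha`, and discount factor `gamma` are all hypothetical choices made for illustration.

```python
def train_chain(n_actions=4, episodes=10, alpha=0.5, gamma=0.9):
    """Run tabular TD(0) on a forced action chain; return values after each episode.

    Illustrative assumptions: only one action is available at each step,
    and a reward of 1 follows the final action only.
    """
    values = [0.0] * n_actions  # values[i]: learned value of performing action i
    history = []
    for _ in range(episodes):
        for i in range(n_actions):
            is_last = i == n_actions - 1
            reward = 1.0 if is_last else 0.0
            next_value = 0.0 if is_last else values[i + 1]
            # Temporal-difference error: outcome plus discounted next value,
            # minus the current estimate for this action.
            td_error = reward + gamma * next_value - values[i]
            values[i] += alpha * td_error
        history.append(list(values))
    return history


def first_valued_episode(history):
    """For each action, return the first episode (1-indexed) with nonzero value."""
    n_actions = len(history[0])
    return [
        next((ep + 1 for ep, vals in enumerate(history) if vals[i] > 0), None)
        for i in range(n_actions)
    ]


history = train_chain()
print(first_valued_episode(history))  # prints [4, 3, 2, 1]
```

Under these assumptions, the action closest to reward acquires value in the first episode, and value reaches each earlier action one episode later, so the most distal action is learned last.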

Section snippets

Reinforcement learning as a guiding framework

One of the most influential accounts of action sequence learning comes from the computational theory of reinforcement learning (RL). RL is a collection of algorithms that describes how animals and artificial agents can maximize long-term reward through trial-and-error learning (Sutton and Barto, 2018). As an animal explores its environment and encounters appetitive and aversive outcomes, it needs a way of figuring out what specific actions or action sequences led to those outcomes so that it

Contributions of the basal ganglia to action sequence learning

I next turn to an overview of the role of the basal ganglia in action sequence learning and performance. The basal ganglia are a set of evolutionarily conserved subcortical nuclei that consist of a set of cortico-striato-pallido-thalamo-cortical loops, and the striatum in particular has been implicated in the control of movement and movement disorders for decades (Kish et al., 1988; Marsden, 1980). Over time, though, it has become clear that the basal ganglia also play a role in the learning of

Contributions of the basal ganglia to action sequence performance

Once action sequences have been learned they can be performed with astonishing fluidity. What are the neural mechanisms that enable the fluid performance of action sequences? In the following sections, I will review research that has illuminated how the basal ganglia circuitry controls the initiation, execution, and termination of action sequences.

Conclusions

The study of action sequence learning and performance provides an excellent opportunity to test computational and neural models of basal ganglia function. Only recently has it been possible to study and manipulate different striatal cell types in vivo, and this avenue of research has proven to be fruitful in revealing the differential contributions of the direct and indirect pathways to different facets of action sequencing. However, technological advances need to be matched by better

Acknowledgements

I thank Andrew Delamater and Jon Horvitz for extensive discussion and comments on an earlier version of this paper.
