Contributions of the basal ganglia to action sequence learning and performance
Introduction
The ability to learn and execute a complex series of movements in a particular order is a remarkable feat of any nervous system. Organisms are able to learn arbitrary sequences of actions through repeated practice, and once asymptotic performance has been reached, sequences can be recalled with impressive speed and precision. While much of the neuroscience of instrumental learning focuses on a single action that is reinforced or punished by an outcome, that single action is almost always part of a sequence of actions that are either not recorded or ignored by the experimenter. It is rarely the case that the sequence of events constituting instrumental learning is as simple as action → outcome, however convenient it may be to depict it that way. Oftentimes the action that is to be reinforced or punished (i.e. the ‘target action’) is separated in time from the outcome by other actions, and the target action is itself preceded by other actions. Moreover, the target action is usually composed of a series of muscle movements that follow a precise order. While much remains unknown about the neural mechanisms of action sequence learning and performance in animals, there is nevertheless a rich literature that, when considered in full, points to the basal ganglia as a primary neural circuit involved in action sequencing. The aim of this review is to succinctly summarize that vast literature and distill the main themes and unanswered questions across many studies.
In reviewing the literature on action sequencing, one finds a broad and loose definition of the term ‘action sequence’. Sometimes studies of action sequencing focus on different action types that are executed across different spatial locations (e.g. pressing a lever followed by pulling a chain), or actions of the same type that are executed across different spatial locations (e.g. pressing one lever followed by pressing another lever), or actions of the same type that are executed in the same spatial location (e.g. pressing the same lever repeatedly). Furthermore, the number of actions available to the subject at any given point in time varies from study to study, and the ‘degrees of freedom’ of action will determine the difficulty of the task and thus the rate of learning. For example, an animal that is required to execute a set of actions in a specific order to obtain a reward will face a more challenging learning scenario if all of those actions are simultaneously available compared to a situation in which only one action is available at a time, or when each action is cued by an external stimulus. Action sequencing is a broadly defined phenomenon, and because it is not monolithic, any conclusions that are drawn from a single study may not generalize to other situations. It is important to bear this caveat in mind.
I begin by reviewing one of the most dominant accounts of instrumental learning in behavioral neuroscience today: reinforcement learning (RL). RL holds that actions are learned ‘in reverse’ from the moment of reinforcement, and this can help to explain some empirical phenomena in the animal learning literature. The mechanisms that allow for this type of learning are fairly simple, and fall under the heading of ‘model-free’ RL. Model-free RL does not capture some of the more complex aspects of action sequencing, and as an alternative another variety of RL (‘model-based’ RL) is sometimes appealed to. But even model-based RL has shortcomings, since it does not capture the hierarchical structure of action sequences. I then review how theoretical models of the basal ganglia aim to map RL variables onto neural function, while also pointing out crucial shortcomings. Finally, the various components of sequence performance—initiation, execution, and termination—are reviewed with reference to their associated basal ganglia correlates.
Reinforcement learning as a guiding framework
One of the most influential accounts of action sequence learning comes from the computational theory of reinforcement learning (RL). RL is a collection of algorithms that describes how animals and artificial agents can maximize long-term reward through trial-and-error learning (Sutton and Barto, 2018). As an animal explores its environment and encounters appetitive and aversive outcomes, it needs a way of figuring out what specific actions or action sequences led to those outcomes so that it
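The ‘in reverse’ credit assignment performed by model-free RL can be illustrated with a minimal tabular Q-learning sketch. The environment and parameter values below are illustrative assumptions, not taken from any study discussed in this review: reward arrives only after the final action of a three-step chain, yet the temporal-difference error propagates value backward, so the action closest to reward is learned first and earlier actions acquire value more slowly.

```python
import random

# Minimal sketch (illustrative assumptions): a model-free agent learns a
# 3-step action chain where reward arrives only after the final step.
# States 0..2 are positions in the sequence; the "correct" action at each
# position advances the chain, and any other action restarts it.

N_STEPS = 3          # length of the action chain (assumed)
N_ACTIONS = 2        # actions available at each step (assumed)
CORRECT = [1, 0, 1]  # an arbitrary target sequence
ALPHA, GAMMA = 0.1, 0.9

Q = [[0.0] * N_ACTIONS for _ in range(N_STEPS)]

def run_episode(epsilon=0.1):
    """One episode: act until the full sequence is completed and rewarded."""
    state = 0
    while True:
        # Epsilon-greedy action selection.
        if random.random() < epsilon:
            action = random.randrange(N_ACTIONS)
        else:
            action = max(range(N_ACTIONS), key=lambda a: Q[state][a])
        # A wrong action restarts the sequence from the beginning.
        next_state = state + 1 if action == CORRECT[state] else 0
        if next_state == N_STEPS:
            # Sequence completed: the only reward in the task.
            target, done = 1.0, True
        else:
            # No reward; bootstrap from the value of the next position.
            target, done = GAMMA * max(Q[next_state]), False
        Q[state][action] += ALPHA * (target - Q[state][action])
        if done:
            return
        state = next_state

random.seed(0)
for _ in range(500):
    run_episode()

# Value propagates backward from the rewarded final step, so
# Q[2][CORRECT[2]] > Q[1][CORRECT[1]] > Q[0][CORRECT[0]].
```

Note the signature of backward chaining in the learned values: the step adjacent to reward converges toward 1.0, while each earlier step converges toward a discounted fraction of its successor's value, mirroring the slower acquisition of reward-distal actions described in the animal literature.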
Contributions of the basal ganglia to action sequence learning
I next turn to an overview of the role of the basal ganglia in action sequence learning and performance. The basal ganglia are a set of evolutionarily conserved subcortical nuclei that consist of a set of cortico-striato-pallido-thalamo-cortical loops, and the striatum in particular has been implicated in the control of movement and movement disorders for decades (Kish et al., 1988; Marsden, 1980). Over time, though, it has become clear that the basal ganglia also play a role in the learning of
Contributions of the basal ganglia to action sequence performance
Once action sequences have been learned they can be performed with astonishing fluidity. What are the neural mechanisms that enable the fluid performance of action sequences? In the following sections, I will review research that has illuminated how the basal ganglia circuitry controls the initiation, execution, and termination of action sequences.
Conclusions
The study of action sequence learning and performance provides an excellent opportunity to test computational and neural models of basal ganglia function. Only recently has it been possible to study and manipulate different striatal cell types in vivo, and this avenue of research has proven to be fruitful in revealing the differential contributions of the direct and indirect pathways to different facets of action sequencing. However, technological advances need to be matched by better
Acknowledgements
I thank Andrew Delamater and Jon Horvitz for extensive discussion and comments on an earlier version of this paper.
References (195)
- Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron (2005)
- Natural syntax rules control action sequences of rats. Behav. Brain Res. (1987)
- Multiplicity of control in the basal ganglia: computational roles of striatal subregions. Curr. Opin. Neurobiol. (2011)
- Hierarchical models of behavior and prefrontal function. Trends Cogn. Sci. (2008)
- Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition (2009)
- Hierarchical reinforcement learning and decision making. Curr. Opin. Neurobiol. (2012)
- Dopamine-mediated regulation of corticostriatal synaptic plasticity. Trends Neurosci. (2007)
- Dopamine D2 receptors regulate the anatomical and functional balance of basal ganglia circuitry. Neuron (2014)
- Habitual alcohol seeking: time course and the contribution of subregions of the dorsal striatum. Biol. Psychiatry (2012)
- Model-based influences on humans’ choices and striatal prediction errors. Neuron (2011)
- Striatal dopamine and the temporal control of behavior. Behav. Brain Res.
- Habit learning by naive macaques is marked by response sharpening of striatal neurons representing the cost and outcome of acquired action sequences. Neuron
- The role of efference copy in striatal learning. Curr. Opin. Neurobiol.
- The role of habit in compulsivity. Eur. Neuropsychopharmacol.
- The basal ganglia and chunking of action repertoires. Neurobiol. Learn. Mem.
- The basal ganglia downstream control of brainstem motor centres—an evolutionarily conserved strategy. Curr. Opin. Neurobiol.
- Stimulus-response and response-outcome learning mechanisms in the striatum. Behav. Brain Res.
- Shaping action sequences in basal ganglia circuits. Curr. Opin. Neurobiol.
- Actor-critic models of the basal ganglia: new anatomical and computational perspectives. Neural Netw.
- Repeated cortico-striatal stimulation generates persistent OCD-like behavior. Science
- Single-trial inhibition of anterior cingulate disrupts model-based reinforcement learning in a two-step decision task. bioRxiv
- Effects of frontal cortex lesions on action sequence learning in the rat. Eur. J. Neurosci.
- Effects of outcome devaluation on the performance of a heterogeneous instrumental chain. Int. J. Comp. Psychol.
- Instrumental performance following reinforcer devaluation depends upon incentive learning. Q. J. Exp. Psychol.
- Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories. Nature
- The dynamic range of response set activation during action sequencing. J. Exp. Psychol. Hum. Percept. Perform.
- What does dopamine mean? Nat. Neurosci.
- What is the degree of segregation between striatonigral and striatopallidal projections? Front. Neuroanat.
- Optogenetic stimulation of lateral orbitofronto-striatal pathway suppresses compulsive behaviors. Science
- Direct and indirect pathways of basal ganglia: a critical reappraisal. Nat. Neurosci.
- Hyperkinetic disorders and loss of synaptic downscaling. Nat. Neurosci.
- Reinforcement schedules: the role of responses preceding the one that produces the reinforcer. J. Exp. Anal. Behav.
- Brief optogenetic inhibition of dopamine neurons mimics endogenous negative reward prediction errors. Nat. Neurosci.
- Knowledge of the ordinal position of list items in Rhesus monkeys. Psychol. Sci.
- Opponent actor learning (OpAL): modeling interactive effects of striatal dopamine on reinforcement learning and choice incentive. Psychol. Rev.
- Dynamic mesolimbic dopamine signaling during action sequence learning and expectation violation. Sci. Rep.
- Instrumental and Pavlovian incentive processes have dissociable effects on components of a heterogeneous instrumental chain. J. Exp. Psychol. Anim. Behav. Process.
- Implementation of action sequences by a neostriatal site: a lesion mapping study of grooming syntax. J. Neurosci.
- Concurrent activation of striatal direct and indirect pathways during action initiation. Nature
- Dopamine neuron activity before action initiation gates and invigorates future movements. Nature
- Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci.
- Instrumental uncertainty as a determinant of behavior under interval schedules of reinforcement. Front. Integr. Neurosci.
- Actions, action sequences and habits: evidence that goal-directed and habitual action control are hierarchically organized. PLoS Comput. Biol.
- Habits, action sequences, and reinforcement learning. Eur. J. Neurosci.
- Habits as action sequences: hierarchical action control and changes in outcome value. Philos. Trans. R. Soc. B
- The thalamostriatal projections contribute to the initiation and execution of a sequence of movements. Neuron
- Instrumental conditioning
- Contingency effects with maintained instrumental reinforcement. Q. J. Exp. Psychol. B
- The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. Q. J. Exp. Psychol. B
- Model-based choices involve prospective neural activity. Nat. Neurosci.