A selectionist account of de novo action learning

https://doi.org/10.1016/j.conb.2011.05.004

How are novel actions generated and learned? We introduce a selectionist view of de novo action learning, and present some of the main postulates of such a view. This view contrasts with the notion that all actions are generated in response to particular stimuli and hence instructed by the world. It postulates that actions are generated in the actor (the organism) and selected by the environment (stimuli). Selection may occur iteratively until actions can be executed more rapidly and precisely, less variably, and eventually be elicited by particular stimuli. We also discuss experiments that support the particular predictions of this theory.

Highlights

► The main postulates underlying a selectionist view of de novo action learning are discussed.
► Diverse neural activity patterns are generated in the brain, and dopamine is related to the generation of this diversity.
► Variability in neural activity decreases as actions are automatized; this is accompanied by plasticity in cortico-basal ganglia circuits.
► Automatization of actions renders performance of the action less dependent on the mechanisms of generation of variability.
► Training changes why actions are performed.

Introduction

We and other animals have the remarkable capacity to move around and behave in an environment, to change environments, and to develop sophisticated action skills to communicate and seek resources. Many of the actions we perform are innate; that is, they are genetically determined and pre-wired, and although they may be modulated by experience, they are essentially fixed action patterns controlled by either central pattern generators or stimulus–response (S–R, or input–output) circuits such as reflexes [1, 2, 3, 4]. These types of circuits typically have strong connections between neurons (neurons may even be electrically coupled) [5, 6], and once activated they can produce a whole sequence of movements reliably and reproducibly [7, 8, 9].

But how does one generate novel actions, actions that were never generated before and may never be generated again? The first time we produce a particular complex movement, especially when we start trying things early in development (motor babbling), the movement is unlikely to be generated by reflexive circuits or triggered by particular stimulus–response associations [3, 4]. One view is that neural circuits, and in particular cortico-basal ganglia circuits (which contain neurons and subcircuits that can act as pacemakers), can be active generators of neural activity patterns [10, 11]. These patterns can be intrinsically variable or made variable by modulation, and the generation of novel neural activity patterns could be the basis for producing novel actions [12•, 13••, 14••, 15]. Novel situations or environments could enhance the generation of novel patterns of activity and hence novel actions, which might then be selected by stimuli as movements are refined and consolidated with practice [12•, 16]. This view contrasts sharply with the view that all actions are responses to particular stimuli (stimulus–response) and hence instructed by the outside world. It holds instead that actions are generated in the actor (the organism) and selected by the environment (stimuli). This process of trial and selection may be reiterated until actions can be executed rapidly and precisely, with less variability, and can eventually be elicited by particular stimuli or situations (stimulus–response).

This view leads to several postulates. The first is that the mechanisms for generating novel actions should also be mechanisms that increase the variability of activity patterns in the brain. The second is that when we try novel actions, the variability in neural activity between different trials or executions should be high, and that as we improve at an action and automatize its execution, this variability should decrease and particular activity patterns should crystallize or consolidate. The third is that this decrease in motor variability is accomplished by plastic changes in the brain that select specific patterns with learning, generating a circuit akin to a reflex circuit. The generation of a reflex-like circuit should render the performance of the action less dependent on the mechanisms necessary for generating novel actions (and possibly even inhibit the generation of variability). However, it is important to consider that even after we crystallize actions into precise sequences of movements, through changes in synaptic strength that permit the automatic execution of a motor sequence, the initiation and termination of that sequence still need, in many cases, to be self-determined by the organism. Finally, selecting particular reflex-like fixed patterns should change why organisms perform an action. Here we review experimental evidence supporting each of these postulates.
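To make the logic of these postulates concrete, the following is a minimal toy sketch, not a model taken from the studies reviewed here. It assumes a one-dimensional "action", a fixed environmental target, and a simple rule in which a variant is retained only if it improves the outcome; the function name `simulate_selection`, the shrink factor of 0.9, and all other parameters are hypothetical choices made for illustration.

```python
import random


def simulate_selection(n_trials=200, target=1.0, seed=0):
    """Toy generate-and-select loop (illustrative assumption, not the authors' model).

    The actor generates a variant of its current action; the environment
    "selects" the variant only if it lands closer to the target. Each
    selection also shrinks the variability of subsequent variants, a
    stand-in for consolidation of the selected pattern.
    """
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    action_mean = 0.0           # current (initially naive) action
    sigma = 1.0                 # variability of generated variants
    history = []
    for _ in range(n_trials):
        # Generation: the variant arises in the actor, not from the stimulus.
        candidate = action_mean + rng.gauss(0.0, sigma)
        # Selection: the environment only determines which variants are kept.
        if abs(candidate - target) < abs(action_mean - target):
            action_mean = candidate
            sigma *= 0.9        # hypothetical consolidation rate
        history.append((abs(action_mean - target), sigma))
    return history


if __name__ == "__main__":
    history = simulate_selection()
    first_error, first_sigma = history[0]
    last_error, last_sigma = history[-1]
    print(f"error: {first_error:.3f} -> {last_error:.3f}")
    print(f"variability: {first_sigma:.3f} -> {last_sigma:.3f}")
```

In this sketch both the error and the variability can only stay the same or decrease across trials, which mirrors the qualitative claim that selection reduces variability as performance improves. It says nothing about the neural mechanisms involved.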

Section snippets

Generating diverse actions

How does the generation of novel actions occur? How is action diversity generated? One possibility is that self-generated novel actions are the result of diverse neural activity patterns generated de novo in the brain. Cortico-basal ganglia circuits have neurons and subcircuits that are spontaneously active [10, 11]. The globus pallidus-subthalamic nucleus circuit, for example, can be a pacemaker [11], and ensembles of striatal projection neurons can be spontaneously active together [17]. The

Decrease in neural activity variability as actions are automatized

As we postulated, the view that there is active generation of novel action patterns (and structures generating them), which are then selected and organized by the interaction with the world (feedback or error), makes several predictions. One is that, as we have just discussed, the mechanisms for generating novel movements should also be mechanisms that increase the variability of activity patterns in the brain. A second one is that as we improve how we do an action, and automatize its execution,

Crystallization of neuronal activity patterns during skill consolidation is accompanied by plasticity in cortico-basal ganglia loops

What are the mechanisms by which neural activity becomes less variable as training occurs and movement becomes less dopamine-dependent? We discussed the possibility that dopaminergic neurons are less active after training. However, this does not mechanistically explain why automatized movements are less dopamine-dependent. Another possibility is that the decrease in variability and selection of activity patterns that occurs during skill consolidation is accomplished via plasticity in

Self-paced execution of automatized actions

Even considering that we crystallize actions into precise sequences of movements through changes in synaptic strength that permit the automatic execution of the motor sequence, the initiation and termination of a particular motor sequence still need, in many cases, to be self-determined by the organism. Indeed, although Parkinson's (and Huntington's) patients can execute well-learned movements, they still display difficulties in initiating and terminating them [43, 44, 45]. Previous studies

Changing why a learned action is performed

The automatization of a particular movement or movement sequence permits the movement to be rapidly and reliably executed given a particular stimulus or state, controlled by a stimulus–response type circuit. However, this poses the problem of how an organism does something novel again. What happens if the environment changes and the automatized skill no longer leads to the same outcome (or goal)? There is increasing evidence that initially, as we learn to perform particular actions to obtain

Conclusion

We presented evidence for a selectionist theory of de novo action learning by reviewing experiments that support particular predictions of this theory. We discussed that dopamine, which is necessary for any novel action to be generated, rapidly modulates the variability of neural activity patterns in cortico-basal ganglia loops. Hyperdopaminergia results in less synchrony and more variable activity patterns in those circuits, as if dopamine facilitated activity ‘jumping’ between different

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

  • of special interest

  •• of outstanding interest

Acknowledgements

We thank A. Damásio, A. Amorim, A. Coutinho, and S. Kushner for discussions on the concepts presented here, and members of the laboratory and collaborators for precious contributions over the years. Funding was provided by ERC STG-243393 to RMC.

References (63)

  • S. Grillner et al. On peripheral control mechanisms acting on the central pattern generators for swimming in the dogfish. J Exp Biol (1982)
  • J.C. Smith et al. Pre-Bötzinger complex: a brainstem region that may generate respiratory rhythm in mammals. Science (1991)
  • D. Bucher et al. Animal-to-animal variability in motor pattern production in adults and during growth. J Neurosci (2005)
  • A.A. Grace et al. The control of firing pattern in nigral dopamine neurons: single spike firing. J Neurosci (1984)
  • D. Plenz et al. A basal ganglia pacemaker formed by the subthalamic nucleus and external globus pallidus. Nature (1999)
  • B.P. Olveczky et al. Vocal experimentation in the juvenile songbird requires a basal ganglia circuit. PLoS Biol (2005)
  • M.H. Kao et al. Contributions of an avian basal ganglia-forebrain circuit to real-time modulation of song. Nature (2005)
  • R.M. Costa. Plastic corticostriatal circuits for action learning: what's dopamine got to do with it? Ann N Y Acad Sci (2007)
  • G. Edelman. Neural Darwinism. The Theory of Neuronal Group Selection (1987)
  • L. Carrillo-Reid et al. Encoding network states by striatal cell assemblies. J Neurophysiol (2008)
  • O. Jaidar et al. Dynamics of the Parkinsonian striatal microcircuit: entrainment into a dominant network state. J Neurosci (2010)
  • A. Carlsson. Biochemical and pharmacological aspects of Parkinsonism. Acta Neurol Scand Suppl (1972)
  • R.M. Costa et al. Rapid alterations in corticostriatal ensemble coordination during acute dopamine-dependent motor dysfunction. Neuron (2006)
  • J.M. Burkhardt et al. Dissociable effects of dopamine on neuronal firing rate and synchrony in the dorsal striatum. Front Integr Neurosci (2009)
  • P. Brown et al. Dopamine dependency of oscillations between subthalamic nucleus and pallidum in Parkinson's disease. J Neurosci (2001)
  • I. Bar-Gad et al. Complex locking rather than complete cessation of neuronal activity in the globus pallidus of a 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine-treated primate in response to pallidal microstimulation. J Neurosci (2004)
  • J.A. Goldberg et al. Enhanced synchrony among primary motor cortex neurons in the 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine primate model of Parkinson's disease. J Neurosci (2002)
  • A. Raz et al. Activity of pallidal and striatal tonically active neurons is correlated in MPTP-treated monkeys but not in normal monkeys. J Neurosci (2001)
  • A.V. Cruz et al. Effects of dopamine depletion on network entropy in the external globus pallidus. J Neurophysiol (2009)
  • S.D. Gale et al. A basal ganglia pathway drives selective auditory responses in songbird dopaminergic neurons via disinhibition. J Neurosci (2010)
  • W. Schultz et al. A neural substrate of prediction and reward. Science (1997)