Research Article: New Research, Cognition and Behavior

Reinforcement Learning during Locomotion

Jonathan M. Wood, Hyosub E. Kim and Susanne M. Morton
eNeuro 4 March 2024, 11 (3) ENEURO.0383-23.2024; https://doi.org/10.1523/ENEURO.0383-23.2024
Jonathan M. Wood
1Department of Physical Therapy, University of Delaware, Newark, Delaware 19713
2Interdisciplinary Graduate Program in Biomechanics & Movement Science, University of Delaware, Newark, Delaware 19713
Hyosub E. Kim
1Department of Physical Therapy, University of Delaware, Newark, Delaware 19713
2Interdisciplinary Graduate Program in Biomechanics & Movement Science, University of Delaware, Newark, Delaware 19713
3Department of Psychological and Brain Sciences, University of Delaware, Newark, Delaware 19716
4School of Kinesiology, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
Susanne M. Morton
1Department of Physical Therapy, University of Delaware, Newark, Delaware 19713
2Interdisciplinary Graduate Program in Biomechanics & Movement Science, University of Delaware, Newark, Delaware 19713

Figures

  • Figure 1.

    Experimental paradigm. A, All participants walked on a dual-belt treadmill at a comfortable, self-selected pace with a computer monitor in front of them. B, The RPE and TE groups received different feedback during the learning phase. The RPE group received only binary reward feedback, with a check mark and money added to a total when they performed a correct step length. The TE group received real-time feedback of their left step length relative to the pink, horizontal target line. C, All participants walked in three different phases: (1) baseline, where individuals were asked to walk normally; (2) learning, where they gradually learned to walk with a longer left step length with the feedback and instructions depending on group assignment; and (3) postlearning, where implicit aftereffects (experiment 1) or explicit retention (experiment 2) were probed, both without visual feedback. Participants in experiment 2 were also tested for explicit retention 24 h later (data not shown). Note that the target window in this figure was taken from a representative participant and is in terms of ΔLSL, but for all participants the target window was ±2 cm.
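
    The binary reward rule described in panel B can be sketched as follows. This is a minimal illustration, assuming that a "correct" step is one whose left step length falls within the ±2 cm target window. The function name, the inclusive window edges, and the example step lengths and target are hypothetical, and the monetary reward itself is not modeled.

```python
def is_successful_step(step_length_cm, target_cm, half_width_cm=2.0):
    """Return True when a step lands inside the target window.

    Assumption: the window is inclusive at both edges. The caption only
    states its half-width (+/- 2 cm).
    """
    return abs(step_length_cm - target_cm) <= half_width_cm


# Hypothetical left step lengths (cm) around a hypothetical 66 cm target.
steps = [63.5, 64.0, 66.8, 68.0, 68.4]
feedback = [is_successful_step(s, target_cm=66.0) for s in steps]
```

    In this sketch the RPE group would see only the values in `feedback`, whereas the TE group would see the step lengths themselves relative to the target line.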

  • Figure 2.

    Individual participant data. ΔLSL data for all strides of the learning phase from six exemplar subjects, three each from the TE and RPE groups. Participants in this figure were selected based on their magnitude of exploration (σΔLSL values) during early learning (i.e., the first 50 strides after the target stopped moving). Specifically, we selected individuals who represented the 10th, 50th, and 90th percentiles for each group separately according to our measure of early exploration. Blue dots represent unsuccessful steps, orange dots represent successful steps. The dashed lines represent the target window, centered on 10% ΔLSL (the width of the window was ±2 cm for all participants).
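
    The exemplar selection described above can be illustrated with a short sketch. It assumes a nearest-rank percentile definition, which the caption does not specify, and uses hypothetical participant IDs and σΔLSL values for a single group.

```python
import math


def nearest_rank_index(n, percentile):
    """Zero-based index of the nearest-rank percentile among n sorted items."""
    rank = math.ceil(percentile * n / 100)
    return min(max(rank, 1), n) - 1


def pick_exemplars(exploration, percentiles=(10, 50, 90)):
    """Return the participant IDs sitting at the requested percentiles.

    `exploration` maps participant ID -> early-learning exploration value.
    Assumption: nearest-rank percentiles; other definitions may pick
    different participants.
    """
    ordered = sorted(exploration, key=exploration.get)
    return [ordered[nearest_rank_index(len(ordered), p)] for p in percentiles]


# Hypothetical early-exploration values for one group of ten participants.
group = {
    "P01": 1.4, "P02": 0.9, "P03": 2.3, "P04": 1.1, "P05": 1.8,
    "P06": 1.0, "P07": 1.6, "P08": 2.9, "P09": 1.2, "P10": 2.0,
}
exemplars = pick_exemplars(group)
```

    Running the selection once per group, as the caption describes, would yield three exemplars from each.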

  • Figure 3.

    Experiment 1 learning and implicit aftereffects. A, Group average ΔLSL data shown over all strides of experiment 1. Solid colored lines represent group means and shading represents 1 SEM. Gray regions denote the times when no feedback was provided and individuals were asked to “walk normally”. The dashed line represents the center of the target. Rectangular boxes denote the key timepoints of early learning, late learning, and initial and early washout. B, Group averaged ΔLSL at late learning for both groups. Thick horizontal lines represent group means; dots represent individuals; error bars represent ±1 SEM. C, Group average ΔLSL error at early and late learning. Thick lines represent group means; thin lines represent individuals. D, Group average percent success at early and late learning. E, Group average implicit aftereffects. Thick horizontal lines represent group means; dots represent individuals; error bars represent ±1 SEM.

  • Figure 4.

    Experiment 1 exploration measurements. A, Exploration across the learning phase, measured as the baseline-normalized standard deviation of ΔLSL. For visualization purposes only, we calculated each individual's motor variability in 18 bins of 50 strides each across the learning phase, and then averaged motor variability among individuals in each group (solid lines) with the shading representing 1 SEM. The dashed rectangle represents the bins when the target was gradually shifting toward 10%. B, Early and late exploration. We calculated σΔLSL at early and late learning timepoints. Thick lines represent group means; thin lines represent individuals. C, Motor variability measured as the standard deviation of trial-to-trial changes after successful and unsuccessful steps (σtrial-to-trial). Thick lines represent group means; thin lines represent individuals.
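
    The two variability measures in this caption can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis code: "baseline-normalized" is taken here to mean division by the baseline standard deviation, the sample standard deviation is used, and all function names and data are hypothetical.

```python
import random
from statistics import stdev


def binned_sd(delta_lsl, bin_size=50):
    """SD of ΔLSL within consecutive, non-overlapping bins of strides."""
    n_bins = len(delta_lsl) // bin_size  # a trailing partial bin is dropped
    return [stdev(delta_lsl[i * bin_size:(i + 1) * bin_size])
            for i in range(n_bins)]


def normalized_exploration(learning, baseline, bin_size=50):
    """Assumption: each bin's SD is divided by the baseline-phase SD."""
    base = stdev(baseline)
    return [sd / base for sd in binned_sd(learning, bin_size)]


def trial_to_trial_sd(delta_lsl, success):
    """SD of the change to the next stride, split by the current outcome.

    Each group needs at least two changes for the sample SD to exist.
    """
    after_success, after_failure = [], []
    for i in range(len(delta_lsl) - 1):
        change = delta_lsl[i + 1] - delta_lsl[i]
        (after_success if success[i] else after_failure).append(change)
    return stdev(after_success), stdev(after_failure)


# Hypothetical data: 900 learning strides give 18 bins of 50 strides.
rng = random.Random(0)
baseline = [rng.gauss(0.0, 1.0) for _ in range(250)]
learning = [rng.gauss(10.0, 1.5) for _ in range(900)]
bins = normalized_exploration(learning, baseline)

# Small hand-checkable example for the trial-to-trial measure.
sd_success, sd_failure = trial_to_trial_sd(
    [0, 1, 3, 2, 2, 5], [True, False, True, False, True, False])
```

    In the small example, the changes following successful strides are 1, -1, and 3, and those following unsuccessful strides are 2 and 0.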

  • Figure 5.

    Experiment 2 explicit retention. A, Group average ΔLSL data shown over all strides of experiment 2. Solid colored lines represent group means and shading represents 1 SEM. Gray regions denote the times when no feedback was provided. During retention testing, participants were instructed to “walk like you did at the end of the previous phase.” The dashed line represents the center of the target during the learning phase. B, Group average ΔLSL percent error data for each stride during the immediate and 24 h retention timepoints (rectangles represent the 25 strides of each epoch). 0 represents perfect retention. The inset shows the group average (horizontal lines) and individual (dots) retention levels for the two timepoints. Error bars represent 1 SEM. C, Group average percent retention data for each stride during the immediate and 24 h retention timepoints. All shown in the same manner as in B. The dashed line at 100% represents perfect retention.
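
    One way to read the two retention measures in panels B and C is sketched below. The caption does not give the formulas, so the definitions here are assumptions: both measures are expressed relative to mean ΔLSL at the end of learning, so that 100% retention and 0% error both correspond to reproducing the learned pattern exactly. Function names and data are hypothetical.

```python
from statistics import mean


def percent_retention(retention_strides, late_learning_strides):
    """Assumed definition: retention mean as a percentage of the late-learning mean."""
    return 100.0 * mean(retention_strides) / mean(late_learning_strides)


def percent_error(retention_strides, late_learning_strides):
    """Assumed definition: signed deviation from the late-learning mean, in percent."""
    reference = mean(late_learning_strides)
    return 100.0 * (mean(retention_strides) - reference) / reference


# Hypothetical ΔLSL values (%) over 25-stride epochs.
late_learning = [10.0] * 25
immediate = [9.0] * 25

retention_pct = percent_retention(immediate, late_learning)
error_pct = percent_error(immediate, late_learning)
```

    Under these assumed definitions the two measures differ by a constant 100, so a participant who retains 90% of the learned change has a percent error of -10.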

Keywords

  • gait
  • motor learning
  • motor memory
  • reinforcement learning
  • reward
  • variability

Copyright © 2026 by the Society for Neuroscience.
eNeuro eISSN: 2373-2822
