Research Article: New Research, Sensory and Motor Systems

A Mechanistic Model for Reward Prediction and Extinction Learning in the Fruit Fly

Magdalena Springer and Martin Paul Nawrot
eNeuro 30 March 2021, 8 (3) ENEURO.0549-20.2021; https://doi.org/10.1523/ENEURO.0549-20.2021
Magdalena Springer
Computational Systems Neuroscience, Institute of Zoology, University of Cologne, Biocenter, Cologne 50674, Germany
Martin Paul Nawrot
Computational Systems Neuroscience, Institute of Zoology, University of Cologne, Biocenter, Cologne 50674, Germany

Figures

    Figure 1.

    Circuit model. The olfactory pathway comprises three feedforward layers. Olfactory input activates an odor-specific combination of projection neurons (PNs). The connectivity matrix from PNs to the large population of Kenyon cells (KCs) is divergent-convergent and sparse. An inhibitory feedback mechanism ensures a population sparseness of 5% active cells in the KC layer. KCs are fully connected with four mushroom body output neurons (MBONs) in the output layer. A positive or negative reinforcement stimulus directly excites the dopaminergic PAM or PPL1 neuron, respectively. Lateral inhibition between the MBONs and excitatory feedback from MBONs to dopaminergic neurons (DANs) are crucial for the reward prediction and extinction mechanisms. Activation of the MVP2 output neuron mediates approach behavior; activation of MV2 mediates avoidance behavior. Behavioral preference toward an odor is calculated as the imbalance between the activations of MVP2 and MV2.
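The sparse KC code described in this caption can be illustrated with a minimal sketch. It is not the authors' implementation: the PN count, the number of PN inputs per KC, the binary weights, and the ranking-based stand-in for feedback inhibition are all assumptions made for illustration. Only the 5% sparseness and the KC population size (100 active KCs correspond to 5%, so 2000 KCs) follow from the captions.

```python
import random

def make_pn_kc_inputs(n_pn, n_kc, inputs_per_kc, rng):
    """Sparse random PN->KC connectivity: each KC samples a few PNs (assumed scheme)."""
    return [rng.sample(range(n_pn), inputs_per_kc) for _ in range(n_kc)]

def kc_response(pn_pattern, kc_inputs, sparseness=0.05):
    """Return indices of active KCs.

    Feedback inhibition is approximated by keeping only the most strongly
    driven KCs, so that a fixed fraction of the population stays active.
    """
    drive = [sum(pn_pattern[i] for i in inputs) for inputs in kc_inputs]
    n_active = max(1, round(sparseness * len(kc_inputs)))
    ranked = sorted(range(len(drive)), key=lambda k: drive[k], reverse=True)
    return set(ranked[:n_active])

rng = random.Random(0)
N_PN = 50      # hypothetical PN count
N_KC = 2000    # implied by "100 KCs (5%)"
kc_inputs = make_pn_kc_inputs(N_PN, N_KC, inputs_per_kc=6, rng=rng)  # 6 is hypothetical

# Binary odor pattern activating half of the PNs.
odor = [1] * (N_PN // 2) + [0] * (N_PN - N_PN // 2)
rng.shuffle(odor)

active_kcs = kc_response(odor, kc_inputs)
print(len(active_kcs))  # 100, i.e. 5% of 2000 KCs
```

Ranking KCs by input drive is only one way to enforce a fixed population sparseness; a dynamic inhibitory feedback loop would reach a similar activity level through a different mechanism.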

    Figure 2.

    Differential conditioning leads to CS+-specific odor preference. A, Experimental protocols for simulation of associative conditioning experiments. The model was trained in a classical conditioning paradigm, in which a conditioned odor stimulus (CS+) was paired with a reward (green) or punishment (red). Subsequently, a second unconditioned odor stimulus (CS–) was presented without reinforcement. The standard protocol comprises 12 training trials. During the retention test, both odor stimuli were presented alone without reinforcement. Preference index and performance index in the model were computed after each trial and during the retention test. B, Dynamics of the odor preference index across 24 appetitive (top) and aversive (bottom) learning trials, respectively, averaged across 10 networks. C, left, Combinatorial response patterns in the PN population. Each odor stimulus activates 50% of all PNs. Similarities between the CS+ reference odor and five novel odors are defined by their overlap in the PN activation pattern (x-axis). Right, Activation patterns across the subpopulation of 100 KCs (5%) activated by the CS+ odor. Pattern overlap in the KC population decreases rapidly with decreasing odor similarity, expressed as the percentage of PN pattern overlap (x-axis). D, Generalization to different odors after associative conditioning to the CS+ odor. The model shows a significant CS+ approach (green) or avoidance (red) after 12 training trials, as expressed in the preference index (n = 15), which diminishes rapidly with decreasing odor similarity. Boxplots show the median and the lower and upper quartiles; whiskers indicate 1.5 times the interquartile range; outliers are marked with a + symbol.
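Panel C defines odor similarity as the overlap between PN activation patterns. One possible way to construct such odors is sketched below. The PN count, the specific overlap levels, and the function names are hypothetical; only the 50% PN activation per odor and the use of five novel odors come from the caption.

```python
import random

def make_odor(n_pn, rng, fraction_active=0.5):
    """Random odor: the set of PN indices it activates (50% of all PNs)."""
    return set(rng.sample(range(n_pn), int(n_pn * fraction_active)))

def make_similar_odor(reference, n_pn, overlap, rng):
    """Odor with as many active PNs as `reference`, sharing a fraction `overlap`."""
    n_active = len(reference)
    n_shared = round(overlap * n_active)
    shared = rng.sample(sorted(reference), n_shared)
    inactive = [i for i in range(n_pn) if i not in reference]
    fresh = rng.sample(inactive, n_active - n_shared)
    return set(shared) | set(fresh)

def pattern_overlap(a, b):
    """Fraction of the PNs active in pattern `a` that are also active in `b`."""
    return len(a & b) / len(a)

rng = random.Random(1)
N_PN = 50  # hypothetical PN count
cs_plus = make_odor(N_PN, rng)

# Hypothetical similarity levels for the five novel odors.
for level in (0.8, 0.6, 0.4, 0.2, 0.0):
    novel = make_similar_odor(cs_plus, N_PN, level, rng)
    print(level, pattern_overlap(cs_plus, novel))
```

Because every odor activates exactly half of the PNs, an overlap of 0 corresponds to the complementary PN set, which is the most dissimilar odor this scheme can produce.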

    Figure 3.

    CS+ odor reactivation leads to memory extinction. A, Training and test protocol as in Figure 2A. In the extinction protocol, the training phase is followed by an odor reactivation phase, in which the CS+ was presented alone for 12 consecutive trials before the retention test was performed. B, The appetitive conditioning protocol (light gray) induced a strong and significant preference for the CS+ odor and a weak but significant preference for the CS– odor. Extinction learning partly abolished the CS+ preference during the retention test (dark gray). The preference index for the CS+ odor was significantly lower than after initial appetitive conditioning; for the CS– odor, no significant preference remained after extinction learning. C, Symmetric results were observed for aversive conditioning. Again, extinction learning significantly reduced the learned preference for the CS+ odor. Boxplots show the median and the lower and upper quartiles; whiskers indicate 1.5 times the interquartile range; outliers are marked with a + symbol. D, E, The performance indices were significantly reduced after extinction compared with the initial appetitive and aversive memories. Data are presented as mean ± SD; ***p < 0.001; n = 15 independent models. For preference and performance indices of the model tuned toward asymmetric appetitive and aversive learning pathways, see Extended Data Figure 3-2.

    Figure 4.

    Neuron activation rates and behavioral performance across trials. A, Reward value (US, top) and DAN activation rates (bottom) across 12 initial conditioning trials (left) and 12 extinction trials (right). Trial zero indicates neuron activation rates in the naive model before learning. Reward presentation resulted in a prominent peak activation of the PAM during the very first training trial, i.e., when the actual reward strongly deviates from the predicted reward. Omission of the reward during the first extinction trial (right) resulted in a step-like reduction of PAM activation and a simultaneous increase in PPL1 activation, which is crucial for extinction learning. B, Summed synaptic KC input to MBONs (top), MBON activation rates (middle), and model preference index (bottom) for each simulated test trial that followed the respective training trial (see Materials and Methods). The strong PAM activation during the first conditioning trial resulted in a strong change of the synaptic weights between KCs and MV2 and M6, and consequently in a step-like reduction of the synaptic input and the output activation rates. As a result, the model preference index, which represents the imbalance between M6 and MV2 activation rates, showed a step-like increase, predicting a switch-like expression of CR behavior after only a single conditioning trial. Additional training led to a further gradual reduction of gross synaptic input to and activation of MV2 and M6, paralleled by a gradual increase of the preference index. Extinction learning led to a gradual reduction of synaptic weights in the KC::MVP2 and KC::V2 pathways. This reduces the difference between M6 and MV2 activation and leads to a gradual extinction of the preference index. C, D, Same as A, B for aversive conditioning (left) and subsequent extinction (right). Simulation results are shown for a single network, which was initialized identically before appetitive (A, B) and aversive (C, D) conditioning to enforce identical initial conditions, stressing the symmetric mechanism of reward prediction.
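The captions describe the preference index as an imbalance between MBON activation rates but do not state the formula in this section. The sketch below assumes a normalized difference between one approach-mediating and one avoidance-mediating rate; this form, the function name, and the example rates are illustrative assumptions, not the definition given in the article's Materials and Methods.

```python
def preference_index(approach_rate, avoidance_rate):
    """Normalized imbalance between two MBON activation rates (assumed form).

    Positive values indicate approach, negative values indicate avoidance,
    and 0 indicates no preference.
    """
    total = approach_rate + avoidance_rate
    if total == 0:
        return 0.0  # no MBON activity: treat as no preference
    return (approach_rate - avoidance_rate) / total

# Hypothetical rates: a naive network with balanced MBON output ...
print(preference_index(0.5, 0.5))   # 0.0
# ... and the same network after the avoidance-mediating output is depressed.
print(preference_index(0.5, 0.25))  # positive, i.e. approach
```

Under this assumed form, a step-like depression of the avoidance-mediating output after a single rewarded trial translates directly into a step-like increase of the index, which is the qualitative behavior described in panel B.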

    Figure 5.

    Two separate memories underlie initial and extinction learning. CS+ odor-induced activation rates of MBONs were measured before and after extinction training to locate associative and extinction memories. A, B, Appetitive learning led to a relative depression of CS+ odor-induced activation rates of MV2 and M6, but not of the MVP2 and V2 MBONs. The memory trace in MV2/M6 stayed intact even after the appetitive memory was extinguished. Extinction decreased the CS+ response of MVP2/V2, establishing a parallel memory trace. C, D, Aversive conditioning led to a reduced CS+ input into MVP2/V2, but not into MV2/M6. After memory extinction, the memory trace in MVP2/V2 remained, and we observed an additional decrease in the CS+ response in MV2/M6. Boxplots show the median and the lower and upper quartiles; whiskers indicate 1.5 times the interquartile range; outliers are marked with a + symbol; n.s. = not significant, **p < 0.01. All panels show results across simulations of n = 10 networks.

    Figure 6.

    Blocking specific neuron groups can reverse memory extinction. The computational model was used to reproduce the neurogenetic manipulations with temperature-sensitive shibire^ts1 mutant flies. The activation of individual neurons was suppressed during extinction training only. This allows the identification of neurons that are crucial for establishing the extinction memory in our model. A, Knocking down PPL1 or V2 during extinction training completely abolished the appetitive extinction memory. Blocking PAM, M6, MV2, or MVP2 neurons during odor re-exposure had no effect on the appetitive extinction memory. Blocking all KCs (100%) prevented the model from extinction learning. However, when only 50% of the KC population was blocked during odor reactivation, extinction learning remained unaffected. Further effects of partial KC blocking on extinction learning are presented in Extended Data Figure 6-1. B, Suppressing the activity of PAM DANs, M6 neurons, or KCs (100%) during CS+ odor re-exposure after aversive conditioning prevents the model from forming the aversive extinction memory. However, blocking PPL1 DANs, MV2, V2, MVP2, or KCs (50%) does not diminish the memory extinction. Data are presented as mean ± SD; n.s. = not significant, **p < 0.01, ***p < 0.001; n = 15.

Extended Data

  • Extended Data Figure 3-1

    Comparison between experimental and model results. The experimental data on appetitive and aversive conditioning were extracted from Felsenberg et al. (2017) and Felsenberg et al. (2018), respectively. For statistical analysis of the simulated data, see Methods. Experimental values are shown as mean ± SEM; model values are shown as mean ± SD. *The behavioral data include two control groups: the first carried the GAL4 gene only, the second the UAS gene only. The test group, carrying both the GAL4 and the UAS gene, showed cell-specific expression of the temperature-sensitive UAS-Shibire^ts1 transgene.

  • Extended Data Figure 3-2

    Independent tuning of the model parameters for the appetitive and aversive pathways allows the experimental results to be matched more accurately. To fit the model to the asymmetry in the experimental results, the PPL1 output was tuned toward a lower activation threshold with the sigmoid transfer function $\mathrm{PPL1} = \frac{1}{1 + 10000 \times e^{-\mathrm{PPL1}_{\mathrm{Input}} \times 21}}$. As a consequence, lateral inhibition between MBONs was adjusted accordingly: MVP2::M6 inhibition was increased to $\mathrm{M6}^{-} = \frac{-0.6}{1 + 200 \times e^{-\mathrm{MVP2} \times 16}}$, and MV2::V2 inhibition was decreased to $\mathrm{V2}^{-} = \frac{-0.6}{1 + 200 \times e^{-\mathrm{MV2} \times 13}}$. Further, the factor ρ representing the inhibition of the PAM DAN was set to 0.5 for the aversive conditioning paradigm. With the tuned model we can achieve a stronger aversive memory (B, D) that quantitatively fits the experimental results (Table 1). The tuned model also achieves a stronger extinction effect in the appetitive learning paradigm (A, C); however, it does not abolish appetitive memory completely. Boxplots show the median and the lower and upper quartiles; whiskers indicate 1.5 times the interquartile range; outliers are marked with a + symbol. Bar plots are presented as mean ± SD; n = 10 independent models.
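The tuned transfer functions quoted in this caption can be evaluated numerically. The sketch below reads each expression as a fraction with the sigmoid term in the denominator, which is an assumption about the flattened formulas; the function names are hypothetical, while the constants are those given in the caption.

```python
import math

def ppl1_output(ppl1_input):
    """Tuned PPL1 sigmoid: 1 / (1 + 10000 * exp(-21 * input))."""
    return 1.0 / (1.0 + 10000.0 * math.exp(-ppl1_input * 21.0))

def m6_inhibition(mvp2_rate):
    """Tuned MVP2::M6 lateral inhibition term (always negative)."""
    return -0.6 / (1.0 + 200.0 * math.exp(-mvp2_rate * 16.0))

def v2_inhibition(mv2_rate):
    """Tuned MV2::V2 lateral inhibition term (always negative)."""
    return -0.6 / (1.0 + 200.0 * math.exp(-mv2_rate * 13.0))

# Input at which PPL1 reaches half of its maximum output:
# 10000 * exp(-21 * x) = 1  =>  x = ln(10000) / 21
half_point = math.log(10000.0) / 21.0
print(round(half_point, 3), ppl1_output(half_point))

# Compare both inhibition terms over a range of presynaptic rates.
for rate in (0.0, 0.3, 0.6, 1.0):
    print(rate, round(m6_inhibition(rate), 4), round(v2_inhibition(rate), 4))
```

Both inhibition terms saturate at -0.6 for high presynaptic rates. Under this reading, the steeper slope (16 versus 13) makes the MVP2::M6 term stronger than the MV2::V2 term at intermediate rates, consistent with the caption's description of one pathway being increased and the other decreased.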

  • Extended Data Figure 6-1

    The percentage of blocked KCs during CS+ odor reactivation determines the effect on extinction memory formation. When all KCs are blocked during appetitive or aversive extinction training, no extinction memory is formed. However, reducing the percentage of blocked KCs gradually increases the ability to establish extinction memory. When 50% or fewer of the KCs are blocked, extinction memory formation is not impaired. Data are presented as mean ± SD; n = 15.

eNeuro
Vol. 8, Issue 3
May/June 2021
Keywords

  • Drosophila melanogaster
  • memory extinction
  • reinforcement learning
  • reward prediction
  • single-trial learning
