Modelling individual differences in the form of Pavlovian conditioned approach responses: a dual learning systems approach with factored representations

Florian Lesaint; Olivier Sigaud; Shelly B Flagel; Terry E Robinson; Mehdi Khamassi

doi:10.1371/journal.pcbi.1003466

PLoS Comput Biol. 2014 Feb 13;10(2):e1003466. doi: 10.1371/journal.pcbi.1003466. eCollection 2014 Feb.

Authors

Florian Lesaint¹, Olivier Sigaud¹, Shelly B Flagel², Terry E Robinson³, Mehdi Khamassi¹

Affiliations

¹ Institut des Systèmes Intelligents et de Robotique, UMR 7222, UPMC Univ Paris 06, Paris, France; Institut des Systèmes Intelligents et de Robotique, UMR 7222, CNRS, Paris, France.
² Department of Psychiatry, University of Michigan, Ann Arbor, Michigan, United States of America; Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, Michigan, United States of America; Department of Psychology, University of Michigan, Ann Arbor, Michigan, United States of America.
³ Department of Psychology, University of Michigan, Ann Arbor, Michigan, United States of America.

Abstract

Reinforcement Learning has greatly influenced models of conditioning, providing powerful explanations of acquired behaviour and underlying physiological observations. However, in recent autoshaping experiments in rats, variation in the form of Pavlovian conditioned responses (CRs) and in the associated dopamine activity has called into question the classical hypothesis that phasic dopamine activity corresponds to a reward prediction error-like signal arising from a classical Model-Free system and necessary for Pavlovian conditioning. Over the course of Pavlovian conditioning using food as the unconditioned stimulus (US), some rats (sign-trackers) come to approach and engage the conditioned stimulus (CS) itself - a lever - more and more avidly, whereas other rats (goal-trackers) learn to approach the location of food delivery upon CS presentation. Importantly, although both sign-trackers and goal-trackers learn the CS-US association equally well, only in sign-trackers does phasic dopamine activity show classical reward prediction error-like bursts. Furthermore, neither the acquisition nor the expression of a goal-tracking CR is dopamine-dependent. Here we present a computational model that can account for such individual variations. We show that a combination of a Model-Based system and a revised Model-Free system can account for the development of distinct CRs in rats. Moreover, we show that revising a classical Model-Free system to process stimuli individually, using factored representations, can explain why classical dopaminergic patterns may be observed for some rats and not for others, depending on the CR they develop. In addition, the model can account for other behavioural and pharmacological results obtained using the same, or similar, autoshaping procedures. Finally, the model yields a set of experimental predictions that may be tested in a modified experimental protocol. We suggest that further investigation of factored representations in computational neuroscience studies may be useful.
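The abstract gives no equations, so the following is only a minimal sketch of the general idea it describes: a Model-Based value and a Model-Free value learned per stimulus feature (a factored representation) are combined before an action is chosen. The weighting parameter `omega`, the softmax choice rule, the learning-rate and discount values, the action and feature names, and every number below are assumptions made for illustration; none of them are taken from the paper.

```python
import math

# Hypothetical Model-Based values: both actions lead to food, and the
# magazine path is assumed (for illustration only) to be slightly better.
MB_VALUES = {"approach_lever": 0.9, "approach_magazine": 1.0}

# Hypothetical Model-Free values stored per feature rather than per state.
FEATURE_VALUES = {"lever": 0.8, "magazine": 0.3}

# Assumed mapping from each action to the feature it focuses on.
ACTION_FEATURE = {"approach_lever": "lever", "approach_magazine": "magazine"}


def td_update_feature(feature_values, feature, reward, next_value,
                      alpha=0.2, gamma=0.8):
    """Temporal-difference update of one feature's value.

    Returns the reward prediction error (delta) used for the update.
    """
    current = feature_values.get(feature, 0.0)
    delta = reward + gamma * next_value - current
    feature_values[feature] = current + alpha * delta
    return delta


def integrated_values(omega):
    """Weighted sum of Model-Based and feature-based Model-Free values."""
    return {
        action: (1.0 - omega) * MB_VALUES[action]
        + omega * FEATURE_VALUES[ACTION_FEATURE[action]]
        for action in MB_VALUES
    }


def softmax(values, temperature=0.1):
    """Convert action values into choice probabilities."""
    top = max(values.values())
    exps = {a: math.exp((v - top) / temperature) for a, v in values.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}


def choice_probabilities(omega):
    """Choice probabilities for a simulated animal with weighting omega."""
    return softmax(integrated_values(omega))


if __name__ == "__main__":
    for omega in (0.9, 0.1):
        print(omega, choice_probabilities(omega))
```

Under these made-up numbers, a high `omega` (strong reliance on the feature-based Model-Free values) favours approaching the lever, as a sign-tracker would, while a low `omega` favours approaching the magazine, as a goal-tracker would. Because `td_update_feature` computes its prediction error on a single feature, the error signal depends on which stimulus the simulated animal attends to, which is the intuition behind CR-dependent dopaminergic patterns in the abstract.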

MeSH terms

Algorithms
Animals
Behavior, Animal / drug effects
Behavior, Animal / physiology
Brain / physiology
Computational Biology
Conditioning, Psychological* / drug effects
Conditioning, Psychological* / physiology
Dopamine / physiology
Dopamine Antagonists / administration & dosage
Flupenthixol / administration & dosage
Models, Neurological
Models, Psychological*
Rats
Reinforcement, Psychology
Systems Biology

Substances

Dopamine Antagonists
Flupenthixol
Dopamine

Grants and funding

P01 DA031656/DA/NIDA NIH HHS/United States