Unique features of stimulus-based probabilistic reversal learning

Carl Harris; Claudia Aguirre; Saisriya Kolli; Kanak Das; Alicia Izquierdo; Alireza Soltani

doi:10.1101/2020.09.24.310771

Abstract

Reversal learning paradigms are widely used assays of behavioral flexibility, and their probabilistic versions are more amenable to studying the integration of reward outcomes over time. Prior research suggests differences between initial and reversal learning, including higher learning rates, a greater need for inhibitory control, and more perseveration after reversals. However, it is not well understood which aspects of stimulus-based reversal learning are unique to reversals, and whether and how the observed differences depend on reward probability. Here, we used a visual probabilistic discrimination and reversal learning paradigm in which male and female rats selected between a pair of stimuli associated with different reward probabilities. We compared accuracy, rewards collected, omissions, latencies, win-stay/lose-shift strategies, and indices of perseveration across two different reward probability schedules. We found that discrimination and reversal learning are behaviorally more distinct than similar: latencies to select the better option, win-stay strategies, and perseveration showed a different pattern in reversal learning than in discrimination learning. Additionally, fitting choice behavior with reinforcement learning models revealed a lower sensitivity to the difference in subjective reward values (greater exploration) and higher learning rates in the reversal phase. Interestingly, a consistent difference between reward probability groups emerged: the richer environment was associated with longer reward collection latencies than the leaner environment. We also replicated previous reports of sex differences in reversal learning. Future studies should systematically compare the neural correlates of fine-grained behavioral measures to reveal possible dissociations in how the circuitry is recruited in each phase.
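
To make two of the measures named in the abstract concrete, the following is a minimal sketch, not the authors' analysis code. It assumes binary choices coded 0/1 and rewards coded 0/1. The function names, the initial value of 0.5, and the single-learning-rate delta rule with a softmax choice rule are illustrative assumptions, since the abstract does not specify the fitted models. Here `alpha` plays the role of the learning rate and `beta` the sensitivity to the difference in subjective reward values (a lower `beta` corresponds to greater exploration).

```python
import math

def win_stay_lose_shift(choices, rewards):
    """Return (P(stay | previous win), P(shift | previous loss))."""
    win_stay = wins = lose_shift = losses = 0
    for t in range(1, len(choices)):
        stayed = choices[t] == choices[t - 1]
        if rewards[t - 1]:
            wins += 1
            win_stay += stayed
        else:
            losses += 1
            lose_shift += not stayed
    p_ws = win_stay / wins if wins else float("nan")
    p_ls = lose_shift / losses if losses else float("nan")
    return p_ws, p_ls

def negative_log_likelihood(choices, rewards, alpha, beta):
    """Negative log-likelihood of choices under a delta-rule learner
    with a softmax (logistic) choice rule. Hypothetical model form."""
    q = [0.5, 0.5]  # assumed initial value estimates for the two stimuli
    nll = 0.0
    for c, r in zip(choices, rewards):
        # With two options, softmax reduces to a logistic function
        # of the value difference scaled by beta.
        p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        p_c = p_choose_1 if c == 1 else 1.0 - p_choose_1
        nll -= math.log(p_c)
        # Delta-rule update of the chosen option only.
        q[c] += alpha * (r - q[c])
    return nll
```

Under these assumptions, comparing phases would amount to minimizing `negative_log_likelihood` over `alpha` and `beta` separately for discrimination and reversal trials with a general-purpose optimizer, and then comparing the fitted parameters.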

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

# co-first authors
* co-senior authors
Funding: This work was supported by R01 DA047870 (Soltani and Izquierdo). We acknowledge UCLA’s Graduate Division Graduate Summer Research Mentorship and Graduate Research Mentorship programs (Aguirre) and the NSF Graduate Research Fellowship (Aguirre).
Disclosure: There is no conflict of interest or need for disclosure.