Autoshaped choice in artificial neural networks: Implications for behavioral economics and neuroeconomics
Introduction
This paper describes the first use of an existing neural network model of conditioning to simulate a phenomenon that is relevant to behavioral economics, the study of the environmental factors that affect economic behavior, defined as the allocation of actions across alternatives according to their values. The phenomenon can be called “autoshaped choice” and was first reported by Picker and Poling (1982).
In their basic procedure, they used an autoshaping arrangement for baseline training, where pigeons received randomly intermixed trials of two keylight colors X and Y (e.g., red and yellow), separately paired with food (grain access) in a forward-delay manner, independently of responding. Fifty percent of X trials and 100% of Y trials were paired with food. Pigeons then received choice tests in which X and Y were presented concurrently and unreinforced. Pigeons tended to choose Y more frequently.
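The baseline contingencies described above can be sketched as a trial schedule. This is only an illustrative sketch, not the procedure or simulator code used by Picker and Poling (1982) or in this paper. The function name `baseline_schedule`, the trial counts, the seed, and the choice to sample food independently on each X trial (rather than reinforcing exactly half of them) are all assumptions made for the example.

```python
import random


def baseline_schedule(n_x=200, n_y=200, p_food_x=0.5, p_food_y=1.0, seed=0):
    """Return a randomly intermixed list of (cue, food) baseline trials.

    Hypothetical helper: trial counts and per-trial Bernoulli sampling
    are assumptions for illustration only.
    """
    rng = random.Random(seed)
    cues = ["X"] * n_x + ["Y"] * n_y
    rng.shuffle(cues)  # randomly intermix X and Y trials
    schedule = []
    for cue in cues:
        p_food = p_food_x if cue == "X" else p_food_y
        # Food is delivered independently of responding (Pavlovian pairing).
        food = rng.random() < p_food
        schedule.append((cue, food))
    return schedule


def choice_test_trial():
    """Choice test: X and Y are presented concurrently and never reinforced."""
    return (("X", "Y"), False)


schedule = baseline_schedule()
x_food = [food for cue, food in schedule if cue == "X"]
y_food = [food for cue, food in schedule if cue == "Y"]
print(f"X trials paired with food: {sum(x_food)}/{len(x_food)}")
print(f"Y trials paired with food: {sum(y_food)}/{len(y_food)}")
```

Under these assumptions every Y trial is paired with food, and roughly half of the X trials are. The choice test carries no food at all, so any preference for Y must come from the baseline pairings.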
Most features of this phenomenon fit well with behavior-analytic approaches to behavioral economics (e.g., Green and Rachlin, 1975, Hursh, 1980, Hursh, 1984, Kagel et al., 1975, Kagel et al., 1981, Kagel and Winkler, 1972, Lea, 1978). To begin with, the phenomenon involves key pecking, a prototypical emitted, skeletal response widely used in these approaches. This response is emitted in that it is not unconditionally elicited by the reward, at least not as demonstrably as typical unconditional responses (e.g., nictitating-membrane responses unconditionally elicited by air puffs or electric shocks to a rabbit's eye). Such approaches view economic behavior as emitted in this sense, rather than unconditionally elicited.
Also, autoshaped choice involves the use of food as a reward. In behavior-analytic approaches to economic behavior, food is analogous to a good or commodity that satisfies some need when consumed. This good occurs more or less frequently, depending on certain conditions. This feature also defines autoshaped choice: Food occurs more frequently for one response alternative than another. Choice has been central to behavioral economics as well. In fact, economic behavior is conceived of as a kind of choice behavior.
This feature has encouraged the use of the sort of mathematical models that are used in behavioral studies of choice (e.g., hyperbolic discounting). The use of a neural network model here echoes this use of mathematical models, although, as will become apparent later on, the present model differs from others. In any case, those models describe a phenomenon that has been widely studied in behavioral economics: Preference for a response alternative that is more valuable in virtue of its correlation with a more frequent occurrence of a good. This preference is also observed in autoshaped choice and, as will be shown, is also predicted by the model used here.
Autoshaped choice can thus reasonably be viewed as a form of economic behavior, or at least a simple one. Still, autoshaped choice has a feature that fits less well with behavior-analytic approaches to behavioral economics: The use of Pavlovian contingencies, where the reward is paired with discrete trials of exteroceptive stimuli independently of responding. Behavior-analytic approaches to economic behavior, in contrast, have emphasized operant contingencies, where reinforcement is explicitly scheduled to partly depend on responding and not signaled by discrete trials.
Moreover, in such approaches, the acquisition and maintenance of economic behavior are supposed to require operant contingencies. However, the discovery of autoshaping (Brown and Jenkins, 1968) and automaintenance (Schwartz and Williams, 1972, Williams and Williams, 1969) showed the possibility that emitted responding can be acquired and maintained only through Pavlovian contingencies, without operant contingencies. Despite this, such a possibility has not yet been considered for economic behavior. The present study is the first one to do so.
McSweeney and Bierley (1984) discussed autoshaping in relation to consumer behavior. However, they did not elaborate exactly how autoshaping contributes to economic behavior. Nor did they discuss autoshaped choice. Still, autoshaped choice provides the strongest link thus far between autoshaping and economic behavior. This link implies that the acquisition and maintenance of a simple form of economic behavior do not require operant contingencies. Of course, this does not mean that operant contingencies do not influence economic behavior. They certainly do. However, autoshaped choice implies that they are not necessary for the acquisition and maintenance of a simple form of economic behavior.
There have been discussions about the influence of Pavlovian contingencies on economic behavior (e.g., Camerer, 2011, Daw and O’Doherty, 2014, Daw and Tobler, 2014, Seymour and Dolan, 2008). However, they still assume that operant contingencies are necessary for the acquisition and maintenance of economic behavior. They have not considered the possibility that economic behavior can be acquired through Pavlovian contingencies alone, without operant contingencies. Autoshaped choice shows precisely this possibility.
The relevance to neuroeconomics is that the present model takes into account brain structures that have been propounded as neural substrates of economic behavior, especially dopaminergic and hippocampal systems (e.g., Camerer et al., 2005, Daw and O’Doherty, 2014, Daw and Tobler, 2014, Dayan and Seymour, 2009, Salamone et al., 2009, Seymour and Dolan, 2008). The model hypothesizes that such systems also underlie autoshaped choice. More specific implications for neuroeconomics will become apparent later. Suffice it to say for now that the model is the first to theoretically link autoshaping, behavioral economics, and neuroeconomics.
The next section gives a general description of the model. Section 3 describes a simulation of autoshaped choice with the model. The paper ends with concluding remarks about some implications for behavioral economics and neuroeconomics, and a philosophical (methodological) issue about explanation in neural-network modeling.
The model: general description
The model has a computational part and a network part. The computational part consists of the activation and learning rules. The network part is a classification of the types of units that can constitute a neural network and guidelines on how to connect them, according to basic general principles of gross vertebrate neuroanatomy. This section focuses on describing the network part in terms of the network architecture that was used for the simulation. The computational part is described in the
A simulation
A simulator designed and coded by the first author was used for the simulation. Nine naive networks with the architecture shown in Fig. 1 received a training protocol that simulated a simplified version of the procedure that Picker and Poling (1982) used in their Experiment 1. All networks were naive in that their initial connection weights were low (0.2 for the S′–S″ and S″–H connections, 0.01 for the rest; see Fig. 1).
All networks first received 200 X trials randomly interspersed with 200 Y
General discussion
The results show that the model can simulate autoshaped choice, a simple form of economic behavior, by exploiting the distributed and parallel character of connectionist systems. The distributed character was exploited with an architectural differentiation between two network pathways (S′1–S″1–M″1–M′1 vs. S′2–S″2–M″2–M′2 in Fig. 1) that are separately affected by two environment-behavior relations (X-R1 vs. Y-R2, respectively). Such differentiation allowed the baseline training to promote
Acknowledgments
We thank Lewis Bizo for his invitation to give the talk that resulted in this paper at the 37th Annual Meeting of the Society for Quantitative Analyses of Behavior, and to submit the paper to this issue. We also thank two anonymous reviewers for useful comments on a previous draft. The executable file of the simulator, the files needed to run the simulations, the source code, and the simulation data are available upon request from the first author.
References (56)
Spontaneous decisions and operant conditioning in fruit flies. Behav. Process. (2011)
Theoretical note: simulating latent inhibition with selection neural networks. Behav. Process. (2003)
Theoretical note: the C/T ratio in artificial neural networks. Behav. Process. (2005)
et al. A simultaneous procedure facilitates acquisition under an optimal interstimulus interval in artificial neural networks and rats. Behav. Process. (2008)
et al. Neural-network simulations of two context-dependence phenomena. Behav. Process. (2007)
et al. Pavlovian conditioning: pigeon nictitating membrane. Behav. Process. (2011)
et al. Value learning through reinforcement: The basics of dopamine and reinforcement learning.
et al. Values and actions in aversion.
Models of trace decay, eligibility for reinforcement, and delay of reinforcement gradients, from exponential to hyperboloid. Behav. Process. (2011)
et al. Some structural determinants of Pavlovian conditioning in artificial neural networks. Behav. Process. (2010)