Figure 1. Stimuli and simulation results. A, The stimuli were the same in the simulations and in the human experiments. The items were displayed on opposite sides of the screen (either 45° and 225° or −45° and −225°). Both item positions were jittered by a random amount along both the x- and y-axes (Δx and Δy in the figure) to make the task non-trivial for human participants, i.e., to prevent participants from solving the SR task by considering the position of only one item while ignoring the spatial relation between the two. The items are hexominoes (right panel). Item height ranges from 1.2° to 3.6° and item width from 1.2° to 2.7° of visual angle; for the simulations, items spanned 2–5 pixels (image size: 50 × 80 pixels). B, Example stimulus positions for the same–different (SD) task (left column) and the spatial relation (SR) task (right column). For the sake of illustration, the ratio between screen and hexomino size has been altered (stimuli appear larger here than in the actual experiment). C, D, Accuracy of the CNN on the SD (light red) and SR (blue) tasks, and of a Siamese network trained on the SD task (dark red). The Siamese network mimics segmentation in a feedforward network by separating the items into two distinct channels of the network (see D). The left panel shows the training curves for each network (accuracy over epochs); training was stopped when validation accuracy reached 90%. The right panel shows the training accuracy at the last epoch and the test accuracy. The latter was evaluated on novel items never used during training, and it reveals that the CNN learns the required rule for the SR task but not for the SD task, consistent with a previous study. Conversely, the Siamese network (a CNN with segmentation) can solve the SD task, demonstrating that segmentation allows the CNN to accomplish this task. Both panels show mean ± SE over 10 repetitions with different random initializations.
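The stimulus layout and the two-channel "segmented" input described above can be sketched as follows. This is a minimal illustration, not the authors' code: the jitter range, item placement radius, and the placeholder hexomino are all assumptions; only the 50 × 80 image size and the opposite-angle placement come from the caption.

```python
import numpy as np

H, W = 50, 80            # image size used in the simulations (from the caption)
RADIUS = 20              # assumed distance of each item from the image centre (pixels)
MAX_JITTER = 5           # assumed jitter range (pixels)

rng = np.random.default_rng(0)

def item_positions(angle_deg, max_jitter=MAX_JITTER):
    """Place two items on opposite sides of the image, each jittered in x and y."""
    centre = np.array([H / 2, W / 2])
    positions = []
    for a in (angle_deg, angle_deg + 180):       # opposite sides of the screen
        rad = np.deg2rad(a)
        offset = RADIUS * np.array([np.sin(rad), np.cos(rad)])
        jitter = rng.uniform(-max_jitter, max_jitter, size=2)  # Δy, Δx
        positions.append(centre + offset + jitter)
    return positions

def place(canvas, item, pos):
    """Stamp a small binary item (e.g., a hexomino) with top-left corner at pos."""
    y, x = np.round(pos).astype(int)
    h, w = item.shape
    canvas[y:y + h, x:x + w] = np.maximum(canvas[y:y + h, x:x + w], item)

hexomino = np.ones((2, 3))   # placeholder item, a few pixels as in the simulations

# Plain CNN input: both items share a single channel.
cnn_input = np.zeros((1, H, W))
p1, p2 = item_positions(45)
place(cnn_input[0], hexomino, p1)
place(cnn_input[0], hexomino, p2)

# "Segmented" (Siamese) input: each item is routed to its own channel,
# mimicking a perfect figure-ground segmentation upstream of the network.
siamese_input = np.zeros((2, H, W))
place(siamese_input[0], hexomino, p1)
place(siamese_input[1], hexomino, p2)
```

The key design point is that the Siamese variant changes only the input encoding, not the feedforward architecture: the same pixels are present in both cases, but the two-channel version hands the network the item correspondence for free.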